Distributed Computing Systems Group

Projects

Nebula: Using Distributed Voluntary Resources to Build Clouds
Current cloud infrastructures are valued for their ease of use and performance. However, they suffer from several shortcomings, chief among them inefficient data mobility caused by the centralization of cloud resources. We believe such clouds are poorly suited to dispersed-data-intensive applications, where the data may be spread across multiple geographic locations (e.g., distributed user blogs). Instead, we propose a new cloud model called Nebula: a dispersed, context-aware, and cost-effective cloud.

TIERa: A Tiered Cloud Storage System
This project is exploring a policy-driven storage system that unifies all storage layers behind a single interface. In the case of Amazon, for example, these layers include Glacier, S3, EBS, virtual disk, memory, and edge caches located outside the cloud. In addition, our research is exploring application-level policies that decide which data to move, when to move it, and where to move it among the storage layers. An API is being designed to facilitate the construction of a wide array of application-centric storage and caching policies. Policies may be applied to individual data objects or to collections of related data objects. Such policies can be used to improve application and end-client performance and reliability.
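As a rough illustration of the "which, when, where" structure of such policies, the following minimal Python sketch models a policy as a selector, a trigger, and a target tier. It is not the project's API, which is still being designed; every name here (`DataObject`, `Policy`, `apply_policies`, the tier labels, and the two example policies) is a hypothetical stand-in chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

DAY = 86400.0  # seconds in a day


@dataclass
class DataObject:
    """Hypothetical stored object; 'tier' is a plain label such as 'memory' or 'archive'."""
    key: str
    size_bytes: int
    tier: str
    last_access: float  # timestamp in seconds
    tags: Set[str] = field(default_factory=set)


@dataclass
class Policy:
    """Hypothetical policy: which data (selects), when (triggers), where (target_tier)."""
    name: str
    selects: Callable[[DataObject], bool]
    triggers: Callable[[DataObject, float], bool]
    target_tier: str


def apply_policies(objects: List[DataObject], policies: List[Policy],
                   now: float) -> List[Tuple[str, str, str]]:
    """Apply the first matching policy to each object and return (key, from, to) moves."""
    moves = []
    for obj in objects:
        for policy in policies:
            if (obj.tier != policy.target_tier
                    and policy.selects(obj)
                    and policy.triggers(obj, now)):
                moves.append((obj.key, obj.tier, policy.target_tier))
                obj.tier = policy.target_tier
                break  # at most one move per object per pass
    return moves


# Example policies (assumed thresholds, for illustration only).
archive_cold_logs = Policy(
    name="archive-cold-logs",
    selects=lambda o: "log" in o.tags,
    triggers=lambda o, now: now - o.last_access > 30 * DAY,
    target_tier="archive",
)
cache_hot_small = Policy(
    name="cache-hot-small",
    selects=lambda o: o.size_bytes <= 1_000_000,
    triggers=lambda o, now: now - o.last_access < 60,
    target_tier="memory",
)
```

Calling `apply_policies(objects, [archive_cold_logs, cache_hot_small], now)` returns the list of moves for one pass. A policy over a collection of related objects could be expressed in the same shape by having `selects` match on a shared tag.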

Mobilizing the Cloud: Cloud-based Mobile Outsourcing
Mobile devices, such as smartphones and tablets, are becoming the universal interface to online services and applications. However, such devices have limited computational power, storage capacity, and battery life, which limits their ability to execute rich, resource-intensive applications. In this project, we are exploring the use of the cloud as a mobile application outsourcing platform, i.e., using cloud resources to execute the resource- and data-intensive components of mobile applications. Our research is exploring different ways to leverage the cloud for scalability, elasticity, and multi-user code/data sharing across a variety of applications.
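One simple way to reason about when outsourcing pays off is to compare estimated local execution time with remote execution time plus data transfer. The Python sketch below is only an illustrative latency model under assumed parameters; the function name `should_offload` and all of its inputs are hypothetical, and it ignores battery cost, contention, and multi-user sharing, which the project also considers.

```python
def should_offload(cycles: float, input_bytes: float, output_bytes: float,
                   local_hz: float, cloud_hz: float,
                   bandwidth_bps: float, rtt_s: float) -> bool:
    """Return True if estimated remote completion time beats local execution.

    Assumed model: compute time = cycles / clock rate; transfer time =
    bits sent and received / bandwidth, plus one network round trip.
    """
    local_time = cycles / local_hz
    transfer_time = 8 * (input_bytes + output_bytes) / bandwidth_bps + rtt_s
    remote_time = cycles / cloud_hz + transfer_time
    return remote_time < local_time
```

Under this model, a compute-heavy component with a small input tends to favor the cloud, while a short task with a large input tends to stay on the device.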

Distributed Data-intensive Computing: Efficient Computing for Highly-Distributed Data
In a variety of commercial, scientific, and social networking domains, data are increasingly being generated and stored in a geographically distributed manner. To generate knowledge quickly from the data, many modern applications need to process large amounts of such highly distributed data with low latency. A key question for efficient processing is how and where to carry out the computation, which in turn raises questions about optimizing data placement, provisioning, and scheduling. This project is exploring such questions using both a model-driven approach and an implementation-based approach on the Hadoop/MapReduce platform.