Open Source DataTurbine Meets Virtualization
We are designing Data Center cyberinfrastructure (CI) for streaming data applications. The design follows modern enterprise computing concepts, adapted to the requirements, challenges, and design principles described earlier. It is an open architecture built from modular open-source software components, and it provides well-defined, largely autonomous services that draw on concepts from Service-Oriented Architecture design and cloud computing. The major modules are implemented using mature open-source software packages and are managed in a cloud-computing environment. Cloud computing (i.e., the integration of virtualization, infrastructure as a service (IaaS), software as a service (SaaS), and utility computing) provides a scalable and cost-efficient computing paradigm. We use the Xen hypervisor to provide the virtualization layer of this environment: Open Source DataTurbine runs on a server that employs Xen for virtualization.
Xen is a state-of-the-art open-source virtual machine monitor (hypervisor) that allows one physical machine to run multiple virtual machines. Virtualization offers several benefits: reduced machine idle time (better use of existing hardware), lower IT infrastructure costs, simplified system administration (fewer physical machines to manage), and increased uptime with faster failure recovery (because virtual machines are not bound to any particular physical machine, they are portable and can be migrated efficiently across physical machines). Xen supports applications that require elasticity, i.e., the ability to acquire and release compute resources as demand rises and falls. For example, during periods of increased demand, virtual machines can be migrated from low-capacity nodes to high-capacity nodes. When demand subsides, the system can migrate them back and free up the high-capacity nodes. This type of behavior is a good match for operational scenarios where demand fluctuates in response to (1) environmental events, e.g., increased sampling or model activation based on observed phenomena, and (2) periodic increases in user demand due to collaborative experiments or training sessions.
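The migrate-up and migrate-back pattern described above can be illustrated with a simple threshold policy. The Python below is a minimal sketch under stated assumptions, not part of Open Source DataTurbine or Xen: the names Node, choose_host, and run_trace, the abstract "capacity" and "demand" units, and the two threshold fractions are all hypothetical. It models only the placement decision; the migration itself would be performed by the hypervisor's own tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A physical host (hypothetical model for illustration)."""
    name: str
    capacity: float  # abstract compute units the node can supply


def choose_host(current, demand, low, high, scale_up=0.8, scale_down=0.5):
    """Decide where a virtual machine should run for the given demand.

    Assumed policy: move to the high-capacity node when demand exceeds
    scale_up * low.capacity, and move back only once demand drops below
    scale_down * low.capacity. The gap between the two thresholds
    (hysteresis) avoids migrating back and forth on small fluctuations.
    """
    if current is low and demand > scale_up * low.capacity:
        return high
    if current is high and demand < scale_down * low.capacity:
        return low
    return current


def run_trace(demands, low, high):
    """Replay a sequence of demand samples and record the chosen host names."""
    host = low
    placements = []
    for demand in demands:
        host = choose_host(host, demand, low, high)
        placements.append(host.name)
    return placements


if __name__ == "__main__":
    low_node = Node("low", capacity=10)
    high_node = Node("high", capacity=100)
    # Demand rises (e.g., an environmental event triggers extra sampling)
    # and then subsides again.
    print(run_trace([2, 6, 9, 7, 4, 3], low_node, high_node))
```

Using two thresholds rather than one is a design choice made for this sketch: with a single cutoff, demand hovering near that value would cause repeated migrations, each of which has a cost.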