Philip Guo (Phil Guo, Philip J. Guo, Philip Jia Guo, pgbovine)

CDE: Run Any Linux Application On-Demand Without Installation

research paper summary
CDE: Using System Call Interposition to Automatically Create Portable Software Packages. Philip J. Guo and Dawson Engler. USENIX Annual Technical Conference, short paper, 2011.
It can be painfully hard to take software that runs on one person's machine and get it to run on another machine. Online forums and mailing lists are filled with discussions of users' troubles with compiling, installing, and configuring software and their myriad of dependencies. To eliminate this dependency problem, we created a system called CDE that uses system call interposition to monitor the execution of x86-Linux programs and package up the Code, Data, and Environment required to run them on other x86-Linux machines. Creating a CDE package is completely automatic, and running programs within a package requires no installation, configuration, or root permissions. Hundreds of people in both academia and industry have used CDE to distribute software, demo prototypes, make their scientific experiments reproducible, run software natively on older Linux distributions, and deploy experiments to compute clusters.
@inproceedings{GuoCdeUsenix2011,
 author = {Guo, Philip J. and Engler, Dawson},
 title = {{CDE}: Using System Call Interposition to Automatically Create Portable Software Packages},
 booktitle = {Proceedings of the 2011 USENIX Annual Technical Conference},
 series = {USENIX'11},
 year = {2011},
 location = {Portland, OR},
 url = {http://dl.acm.org/citation.cfm?id=2002181.2002202},
 acmid = {2002202},
 publisher = {USENIX Association},
 address = {Berkeley, CA, USA},
}
CDE: Run Any Linux Application On-Demand Without Installation. Philip J. Guo. USENIX Large Installation System Administration Conference (LISA), 2011.
There is a huge ecosystem of free software for Linux, but since each Linux distribution (distro) contains a different set of pre-installed shared libraries, filesystem layout conventions, and other environmental state, it is difficult to create and distribute software that works without hassle across all distros. Online forums and mailing lists are filled with discussions of users' troubles with compiling, installing, and configuring Linux software and their myriad of dependencies. To address this ubiquitous problem, we have created an open-source tool called CDE that automatically packages up the Code, Data, and Environment required to run a set of x86-Linux programs on other x86-Linux machines. Creating a CDE package is as simple as running the target application under CDE's monitoring, and executing a CDE package requires no installation, configuration, or root permissions. CDE enables Linux users to instantly run any application on-demand without encountering "dependency hell".
@inproceedings{GuoCdeLisa2011,
 author = {Guo, Philip J.},
 title = {{CDE}: Run Any {Linux} Application On-demand Without Installation},
 booktitle = {Proceedings of the 25th International Conference on Large Installation System Administration},
 series = {LISA'11},
 year = {2011},
 location = {Boston, MA},
 url = {http://dl.acm.org/citation.cfm?id=2208488.2208490},
 acmid = {2208490},
 publisher = {USENIX Association},
 address = {Berkeley, CA, USA},
}

(This summary was adapted from the CDE project home page, which was created in late 2010, and a 2013 CACM blog post.)

People now use thousands of different releases of hundreds of Linux distros (distributions). Each individual distro release contains a different set of pre-installed software, libraries, and kernel configurations. A ubiquitous problem is that software created on one distro often fails to run on other distros – and even on other releases of the same distro – due to incompatibilities in software, library, and kernel versions.

For example, if you are a scientist writing a piece of research software on a Linux-based OS, it's often difficult for your colleagues to install, configure, and run that software on their computers unless they are using the exact same release of the same Linux distro that you're using. This lack of software portability makes it harder for colleagues to reproduce and extend your work, thus hindering scientific progress.

I created a piece of software called CDE that alleviates such cross-distro software incompatibility problems. You can use CDE to package up your software and all of its dependencies in a portable format that runs on just about any Linux distro released within approximately five years of the distro you're using.

Specifically, CDE automatically packages up the Code, Data, and Environment required to deploy and run your Linux programs on other machines without any installation or configuration. It's the easiest way to completely eliminate dependency hell.

To use CDE, download it and follow these three steps:

Step 1. Package

Prepend any set of Linux commands with the "cde" command, and CDE will run them and automatically package up all files (e.g., executables, libraries, plug-ins, config/data files) accessed during execution.
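
To make the packaging idea concrete, here is a minimal Python sketch of the file-copying half only. It assumes the list of accessed paths is already known (CDE itself obtains it by monitoring the program's execution), and it assumes a package root that mirrors each file's absolute path. The function name `mirror_into_package` and that mirrored layout are illustrative assumptions, not CDE's actual code.

```python
import os
import shutil


def mirror_into_package(accessed_paths, package_root):
    """Copy each accessed file into package_root, preserving its absolute path.

    Hypothetical helper for illustration; the mirrored layout is an assumption.
    """
    copied = []
    for path in accessed_paths:
        src = os.path.abspath(path)
        if not os.path.isfile(src):
            continue  # skip directories and paths that do not exist
        # e.g. /usr/lib/libfoo.so -> <package_root>/usr/lib/libfoo.so
        dest = os.path.join(package_root, src.lstrip(os.sep))
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves timestamps and permissions
        copied.append(dest)
    return copied
```

Keeping the original absolute paths under one root is one simple way to let a program find its libraries and data files at the locations it expects once file accesses are redirected into the package.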

Step 2. Deliver

A package is simply a directory that can be compressed and delivered to any x86-Linux machine. It contains all the files and environment variables required to run your original commands. Packages can range from 10 to 100 MB in size.
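
Since a package is just a directory, compressing it for delivery needs nothing CDE-specific. The following Python sketch uses the standard `tarfile` module; the function name `compress_package` and the directory name in the usage line are placeholders chosen for illustration.

```python
import os
import tarfile


def compress_package(package_dir, archive_path):
    """Write package_dir into a gzip-compressed tarball; return the archive size in bytes."""
    # Store entries under the directory's own name, not the sender's full path.
    top_level = os.path.basename(os.path.normpath(package_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(package_dir, arcname=top_level)
    return os.path.getsize(archive_path)
```

For example, `compress_package("my-package", "my-package.tar.gz")` produces a single file to send; the recipient unpacks it with any tar tool before running the packaged commands.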

Step 3. Run

After receiving the package, the user can now run those same commands from within the package on any modern x86-Linux distro. The user does not need to first compile, install, or configure anything.


CDE implements a form of lightweight application virtualization that allows you to easily distribute portable software, to deploy applications to the cloud, to make computational experiments reproducible, and to run software on non-native Linux distros without conflicts.

CDE delivers on one simple promise: If you can run a set of commands on your Linux machine, then CDE allows others to easily re-run those same commands on their Linux machines.

To enable Windows and Mac users to run your CDE packages, you can embed them within a virtual machine (e.g., using a lightweight distro like Tiny Core Linux).

An astute reader will notice that CDE packages might be incomplete since they only contain the files accessed on executed paths. It's easy to manually augment packages with additional files to make them complete.
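
As a rough illustration of manual augmentation, the Python sketch below copies extra files or whole directories (for example, plug-ins that the packaged run never loaded) into a package. It assumes the package keeps files under a root that mirrors their absolute paths; that layout and the name `augment_package` are assumptions made for this example, not a description of CDE's internals.

```python
import os
import shutil


def augment_package(extra_paths, package_root):
    """Add files or directories that the packaged run never touched.

    Returns the package paths that were added. Hypothetical helper; the
    mirrored-absolute-path layout under package_root is an assumption.
    """
    added = []
    for path in extra_paths:
        src = os.path.abspath(path)
        dest = os.path.join(package_root, src.lstrip(os.sep))
        if os.path.isfile(dest):
            continue  # this file is already in the package
        if os.path.isdir(src):
            # Merge the whole directory tree into the package (Python 3.8+).
            shutil.copytree(src, dest, dirs_exist_ok=True)
        elif os.path.isfile(src):
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            shutil.copy2(src, dest)
        else:
            continue  # nothing to copy for a path that does not exist
        added.append(dest)
    return added
```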

Demos

This 4-minute screencast shows what CDE can do:


For details, read the two papers cited at the top of this page.