Projects
The ZeptoOS project has several parts. Some are packages that are distributed and supported on several HPC platforms, and others are just whiteboard designs waiting for caffeine-addicted programmers. An overview of each of the ZeptoOS projects is provided below.
ZeptoOS I/O Node Kernel
On machines such as IBM's BG/L and Cray's XT3, I/O nodes handle the I/O needs of the compute nodes, and in some cases perform significant management activities, such as booting the compute node kernels or handling system calls not supported by the compute nodes.
At this time, we have a ZeptoOS I/O node kernel and ramdisk for BG/L systems. At Argonne, we are running ZeptoOS on our 1024-node BG/L. Several other sites are also running ZeptoOS in production. ZeptoOS for BG/L includes a drop-in replacement I/O Node (ION) Linux kernel that you can compile from scratch (or simply use our pre-compiled version), an enhanced yet smaller and faster-booting ramdisk, and lots of tools.
Running ZeptoOS components on BlueGene/L makes your system easier to debug, customize, and enhance.
The Selfish Benchmark Suite
Massively parallel computers use compute node operating systems that are either special-purpose lightweight kernels or mostly commodity kernels that have been downsized to reduce extraneous activity. On operating systems that support multiprocessing or interrupts, the user's application may share CPU cycles with other processes, kernel tasks, and device drivers. From the user's perspective, any CPU cycles diverted from their application reduce the maximum achievable performance and processor efficiency. We call these diversions "detours". In some cases, they can dramatically affect the performance of collective operations.
The Selfish Benchmark Suite is designed to measure detours, that is, the fraction of time the CPU spends executing instructions that are not part of the user's application. This information can be "recorded", like a TiVo, and played back, inserting detours into an application to explore system performance.
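The idea of recording and replaying detours can be illustrated with a minimal sketch. This is not the Selfish Benchmark Suite's code; every name below (`record_detours`, `detour_fraction`, `replay_detours`, `threshold_ns`) is hypothetical, and the approach is an assumption: spin in a tight loop reading a clock, treat any unusually long gap between consecutive reads as a detour, and later inject the recorded gaps into another workload as busy-waits. A real measurement would more likely be written in C against a hardware cycle counter, because the Python interpreter adds noise of its own.

```python
import time


def record_detours(duration_ns=50_000_000, threshold_ns=2_000):
    """Spin for duration_ns and log every gap between consecutive clock
    reads that exceeds threshold_ns (an assumed cutoff, not a standard).

    Returns (detours, elapsed_ns); detours is a list of
    (offset_ns_from_start, length_ns) pairs.
    """
    detours = []
    start = prev = time.perf_counter_ns()
    deadline = start + duration_ns
    while prev < deadline:
        now = time.perf_counter_ns()
        gap = now - prev
        if gap > threshold_ns:
            # Something other than this loop used the CPU for `gap` ns.
            detours.append((prev - start, gap))
        prev = now
    return detours, prev - start


def detour_fraction(detours, elapsed_ns):
    """Fraction of the elapsed time that was spent in recorded detours."""
    if elapsed_ns == 0:
        return 0.0
    return sum(length for _, length in detours) / elapsed_ns


def busy_wait_ns(length_ns):
    """Burn CPU for at least length_ns, imitating a detour."""
    end = time.perf_counter_ns() + length_ns
    while time.perf_counter_ns() < end:
        pass


def replay_detours(detours, work, steps):
    """Call work() `steps` times, injecting each recorded detour as a
    busy-wait once its recorded offset has been reached.

    Returns the total number of nanoseconds injected.
    """
    pending = sorted(detours)
    next_idx = 0
    injected_ns = 0
    start = time.perf_counter_ns()
    for _ in range(steps):
        work()
        elapsed = time.perf_counter_ns() - start
        while next_idx < len(pending) and pending[next_idx][0] <= elapsed:
            length = pending[next_idx][1]
            busy_wait_ns(length)
            injected_ns += length
            next_idx += 1
    return injected_ns
```

A trace captured with `record_detours()` on a noisy system could be passed to `replay_detours()` on a quiet one to see how the same interruptions affect a different workload. The threshold decides what counts as a detour, so it would need tuning for each platform.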
ZOID
Petascale architectures decompose functions across multiple nodes. Compute nodes can't do everything on their own: they need to delegate some system calls and file or I/O operations to specialized I/O nodes and management nodes. Zoid is open-source function-call forwarding software that can be optimized for collective behavior and adjustable consistency semantics.
On BlueGene, Zoid can act as a functional replacement for IBM's closed-source CIOD, in some cases offering significant performance improvements thanks to its high-performance, multithreaded architecture. Zoid is also easily extensible, making it possible to forward custom function calls between compute nodes and I/O nodes. This capability can be used, for example, for real-time data streaming.
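Function-call forwarding in general can be sketched as follows. This is an illustration of the concept only and does not reflect Zoid's actual API or wire protocol: the message format (length-prefixed JSON), the handler table, and all names (`ForwardingStub`, `io_node_loop`, `HANDLERS`) are assumptions made for the example, and a local socket pair stands in for the interconnect between a compute node and an I/O node.

```python
import json
import socket
import struct
import threading


def send_msg(sock, obj):
    """Send one JSON message prefixed with its 4-byte big-endian length."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer closes first."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def recv_msg(sock):
    """Receive one length-prefixed JSON message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


# Hypothetical custom functions exported by the "I/O node" side.
HANDLERS = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}


def io_node_loop(sock, handlers):
    """Serve forwarded calls until the compute side disconnects."""
    while True:
        try:
            request = recv_msg(sock)
        except ConnectionError:
            return
        func = handlers.get(request["func"])
        if func is None:
            send_msg(sock, {"ok": False, "error": "unknown function"})
            continue
        try:
            send_msg(sock, {"ok": True, "result": func(*request["args"])})
        except Exception as exc:
            send_msg(sock, {"ok": False, "error": str(exc)})


class ForwardingStub:
    """Compute-node side: marshals a call, ships it, returns the result."""

    def __init__(self, sock):
        self._sock = sock

    def call(self, func, *args):
        send_msg(self._sock, {"func": func, "args": list(args)})
        reply = recv_msg(self._sock)
        if not reply["ok"]:
            raise RuntimeError(reply["error"])
        return reply["result"]


def demo():
    """Forward two calls across a socket pair and return their results."""
    compute_side, io_side = socket.socketpair()
    server = threading.Thread(
        target=io_node_loop, args=(io_side, HANDLERS), daemon=True
    )
    server.start()
    stub = ForwardingStub(compute_side)
    results = [stub.call("add", 2, 3), stub.call("upper", "zoid")]
    compute_side.close()  # lets io_node_loop see end-of-stream and exit
    server.join(timeout=1.0)
    io_side.close()
    return results
```

In this sketch, extending the set of forwarded functions means adding an entry to the handler table on the serving side. The caller only needs the function's name and arguments, which is the property that makes custom forwarding convenient.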
ZeptoOS: The Small Linux for Big Computers