High-Performance I/O

Today's scientific applications demand that high-performance I/O be part of their operating environment. These applications access datasets of many gigabytes or terabytes, checkpoint frequently, and create large volumes of visualization data. Such applications are hamstrung by bottlenecks anywhere in the I/O path, including the storage hardware, file system, low-level I/O middleware, application-level interface, and in some cases the mechanism used for Grid I/O access.

Further, most of the parallel I/O software systems in use today have scalability limitations inherent in their designs. Even where specifications do not preclude performance at ultra scale, implementations often fail to exploit the opportunities those specifications allow. Increased scale also brings the specter of more frequent hardware and software faults and greater management complexity.

Our work in high-performance I/O both addresses the immediate issue of providing efficient I/O solutions that meet the needs of scientific applications today and investigates the solutions necessary to scale to the next generation of machines and applications. We do not constrain our work to a particular level in the I/O software hierarchy; rather, we explore the use of carefully chosen abstractions and architectures throughout the I/O software stack that allow components to complement each other.

Projects on High-Performance I/O

ROMIO
ROMIO is a portable, high-performance implementation of the I/O part of the MPI-2 standard (also known as MPI-IO). ROMIO is used by many MPI implementations, including MPICH, LAM/MPI, and vendor implementations from Cray, HP, and SGI.
PVFS
The Parallel Virtual File System (PVFS) is a joint project with Clemson University to develop a high-performance parallel file system for Linux clusters.
Scientific Data Management SciDAC
This is a multi-institution effort, part of the DOE "Scientific Discovery through Advanced Computing" (SciDAC) program, that seeks to vastly improve the ability of application scientists to manage and analyze their data.
Parallel NetCDF
The Parallel NetCDF effort is a joint project with Northwestern University to develop an API and implementation for high-performance, parallel access to netCDF datasets.
Parallel Benchmarking Consortium
The Parallel Benchmarking Consortium is a multi-institution effort to develop a common parallel I/O benchmarking suite for use in I/O system analysis.

Software Downloads

These links will take you to the download pages for software developed by the MCS Division:
ROMIO
PVFS
Parallel NetCDF

Papers on High-Performance I/O

Parallel netCDF: A Scientific High-Performance I/O Interface
Jianwei Li, Wei-keng Liao, Alok Choudhary, Robert Ross, Rajeev Thakur, William Gropp, and Rob Latham. Submitted to SC03.

Working Notes

Noncontiguous MPI-IO Performance on PVFS
Rob Latham and Rob Ross

Other References

This category includes papers about our software and projects written by others.
The Parallel Virtual File System for High-Performance Computing Clusters
Monica Kashyap; Jenwei Hsieh, Ph.D.; Christopher Stanton; and Rizwan Ali. Review of PVFS published by Dell.