Ravi is a computational scientist in the Mathematics and Computer Science division at Argonne National Laboratory and a senior research fellow at the Computation Institute at the University of Chicago. He leads the Globus Genomics project (www.globus.org/genomics), which is widely used for genomics, proteomics, and other biomedical computations on the Amazon cloud and other platforms. See https://www.globus.org/genomics/publications for some relevant publications. He also architected the Globus Galaxies platform that underpins Globus Genomics and several other cloud-based gateways. Ravi Madduri was a key member of multiple NIH, NSF, and DOE projects.

Publications

Journal Articles (Life Sciences)


1. Reproducible Big Data Science: A Case Study in Continuous FAIRness
R Madduri et al. PLoS One 14 (4), e0213013. 2019. PMID 30973881.
Big biomedical data create exciting opportunities for discovery, but make it difficult to capture analyses and outputs in forms that are findable, accessible, interoperab …

2. Developing a Framework for Digital Objects in the Big Data to Knowledge (BD2K) Commons: Report From the Commons Framework Pilots Workshop
KM Jagodnik et al. J Biomed Inform 71, 49-57. Jul 2017. PMID 28501646.
The volume and diversity of data in biomedical research have been rapidly increasing in recent years. While such data hold significant promise for accelerating discovery, …

3. A Novel MERTK Mutation Causing Retinitis Pigmentosa
H Al-Khersan et al. Graefes Arch Clin Exp Ophthalmol 255 (8), 1613-1619. Aug 2017. PMID 28462455.
Our study identifies a novel nonsense mutation in MERTK in a family with RP and no prior molecular diagnosis. The present study also demonstrates the clinical value of ex …

4. Predictive Big Data Analytics: A Study of Parkinson's Disease Using Large, Complex, Heterogeneous, Incongruent, Multi-Source and Incomplete Observations
ID Dinov et al. PLoS One 11 (8), e0157077. 2016. PMID 27494614.
Model-free Big Data machine learning-based classification methods (e.g., adaptive boosting, support vector machines) can outperform model-based techniques in terms of pre …

5. LINE1 Insertions as a Genomic Risk Factor for Schizophrenia: Preliminary Evidence From an Affected Family
G Guffanti et al. Am J Med Genet B Neuropsychiatr Genet 171 (4), 534-45. Jun 2016. PMID 26990047. - Case Reports
Recent studies show that human-specific LINE1s (L1HS) play a key role in the development of the central nervous system (CNS) and its disorders, and that their transpositi …

6. Models and Simulations as a Service: Exploring the Use of Galaxy for Delivering Computational Models
MA Walker et al. Biophys J 110 (5), 1038-43. 2016. PMID 26958881.
We describe the ways in which Galaxy, a web-based reproducible research platform, can be used for web-based sharing of complex computational models. Galaxy allows users t …

7. A Case Study for Cloud Based High Throughput Analysis of NGS Data Using the Globus Genomics System
K Bhuvaneshwar et al. Comput Struct Biotechnol J 13, 64-74. 2014. PMID 26925205.
Next generation sequencing (NGS) technologies produce massive amounts of data requiring a powerful computational infrastructure, high quality bioinformatics software, and …

8. Big Biomedical Data as the Key Resource for Discovery Science
AW Toga et al. J Am Med Inform Assoc 22 (6), 1126-31. Nov 2015. PMID 26198305.
Modern biomedical data collection is generating exponentially more data in a multitude of formats. This flood of complex data poses significant opportunities to discover …

9. Experiences Building Globus Genomics: A Next-Generation Sequencing Analysis Service Using Galaxy, Globus, and Amazon Web Services
RK Madduri et al. Concurr Comput 26 (13), 2266-2279. 2014. PMID 25342933.
We describe Globus Genomics, a system that we have developed for rapid analysis of large quantities of next-generation sequencing (NGS) genomic data. This system achieves …

10. Consensus Genotyper for Exome Sequencing (CGES): Improving the Quality of Exome Variant Genotypes
V Trubetskoy et al. Bioinformatics 31 (2), 187-93. 2015. PMID 25270638.
We apply CGES to 132 samples sequenced at the Hudson Alpha Institute for Biotechnology (HAIB, Huntsville, AL) using the Nimblegen Exome Capture and Illumina sequencing te …

11. Cloud-based Bioinformatics Workflow Platform for Large-Scale Next-Generation Sequencing Analyses
B Liu et al. J Biomed Inform 49, 119-33. Jun 2014. PMID 24462600.
Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data shar …

12. Enabling Collaborative Research Using the Biomedical Informatics Research Network (BIRN)
KG Helmer et al. J Am Med Inform Assoc 18 (4), 416-22. Jul-Aug 2011. PMID 21515543.
BIRN's mission is to provide capabilities and services related to data sharing to the biomedical research community. It does this by forming partnerships and solving spec …

13. CaGrid Workflow Toolkit: A Taverna Based Workflow Tool for Cancer Grid
W Tan et al. BMC Bioinformatics 11, 542. 2010. PMID 21044328.
By extending the Taverna Workbench, caGrid Workflow Toolkit provided a comprehensive solution to compose and coordinate services in caGrid, which would otherwise remain i …

14. Utilisation of a Thoracic Oncology Database to Capture Radiological and Pathological Images for Evaluation of Response to Chemotherapy in Patients With Malignant Pleural Mesothelioma
GB Carey et al. BMJ Open 2 (5). 2012. PMID 23103606.
The investigation described herein demonstrates the successful implementation of this novel tandem imaging database infrastructure, as well as the potential utility of in …

15. A Comparison of Using Taverna and BPEL in Building Scientific Workflows: The Case of caGrid
W Tan et al. Concurr Comput 22 (9), 1098-1117. 2010. PMID 20625534.
With the emergence of "service oriented science," the need arises to orchestrate multiple services to facilitate scientific investigation-that is, to create "science work …

16. e-Science, caGrid, and Translational Biomedical Research
J Saltz et al. Computer (Long Beach Calif) 41 (11), 58-66. Nov 2008. PMID 21311723.
Translational research projects target a wide variety of diseases, test many different kinds of biomedical hypotheses, and employ a large assortment of experimental metho …

17. caGrid 1.0: A Grid Enterprise Architecture for Cancer Research
S Oster et al. AMIA Annu Symp Proc 2007, 573-7. 2007. PMID 18693901.
caGrid is the core Grid architecture of the NCI-sponsored cancer Biomedical Informatics Grid (caBIG) program. The current release, caGrid version 1.0, is developed as the …

18. A Roadmap for caGrid, an Enterprise Grid Architecture for Biomedical Research
J Saltz et al. Stud Health Technol Inform 138, 224-37. 2008. PMID 18560123.
caGrid is a middleware system which combines the Grid computing, the service oriented architecture, and the model driven architecture paradigms to support development of …

19. caGrid 1.0: An Enterprise Grid Infrastructure for Biomedical Research
S Oster et al. J Am Med Inform Assoc 15 (2), 138-49. Mar-Apr 2008. PMID 18096909.
While caGrid 1.0 is designed to address use cases in cancer research, the requirements associated with discovery, analysis and integration of large scale data, and coordi …