Professor Chris Danforth (Principal Investigator) and four other UVM Co-PIs (J.P. O'Neil-Dunne, J. Li, M.T. Niles, H. Garavan) were awarded a National Science Foundation Major Research Instrumentation grant for a project titled “Acquisition of a Massive Database to Accelerate Data Science Discovery”. The funding, over $1M from NSF and UVM, will enable construction of DataMountain, a machine with 64TB of RAM.
The large-memory machine will enhance the Vermont Advanced Computing Core, a virtual laboratory supporting the research of over 500 scientists in the state of Vermont. With so many fields transitioning from data-scarce to data-rich environments, many important research areas will benefit from this new machine including research into addiction, mental illness, climate change, drug discovery, food systems, and the spread of online misinformation. DataMountain will allow for fast access to enormous datasets, supporting several projects that require computational power and speed to effectively analyze, describe, and explain rapidly growing datasets.
DataMountain will provide nearly two orders of magnitude more random access memory than the largest machine currently available for computational research at UVM, accelerating large-scale data-driven research requiring rapid reading and writing, and facilitating a broad and diverse set of important scientific investigations not currently possible given the existing hardware. It will also enhance the functionality of the high-performance computing clusters BlueMoon and DeepGreen, which are dedicated to parallel processing and machine learning, respectively. For example, the machine will allow for interactive access to over 50 terabytes of social media data and for timely analysis of changes related to the COVID-19 pandemic in population-scale physical and mental health data. In addition, DataMountain will allow for massive increases in the spatial and temporal resolution of computational chemistry simulations being performed for data-driven design of next-generation antimicrobial peptides to combat antibiotic resistance. DataMountain will also enable exploration of petabytes of fMRI, genetic, task performance, and survey data associated with 10,000 adolescents across the United States over the next decade. Finally, the machine will accelerate research using unmanned aerial surveillance imaging for tree canopy assessments, facilitate network science modeling of agricultural diversity of crops and nutritional outcomes globally, and help quantify the impacts of the COVID-19 pandemic on food insecurity.