The UNM Center for Advanced Research Computing (CARC), the UNM Information Technologies (IT) Department, the UNM Cancer Center (UNMCC), and UNM Health Sciences Center IT have collaborated to establish a dedicated, high-throughput, cross-campus link between the Cancer Research Facility (CRF) and CARC.

The new 10 Gbps link connects next-generation genome sequencers, located in the Analytic and Translational Genomics Core facility of the CRF, to advanced computers and the Research Storage Consortium (RSC) petascale disk array housed at CARC (1 petabyte is approximately 1 million gigabytes, or the equivalent of roughly 400 billion pages of text). The new link enables fast, reliable, and secure transfer of enormous genome sequence files from the UNM Cancer Center for analysis and subsequent data warehouse archiving.

Dr. Scott Ness, professor of Internal Medicine and UNM Cancer Center associate director for Shared Resources, said, “The UNM Cancer Center is entering an era where we will routinely generate massive sequencing datasets at unprecedented levels of genomic detail. The new dedicated network link will enable us to focus on the science of genomic analysis and the elucidation of cancer biology, unfettered by data transfer and computing constraints.”  

“This project is part of UNM’s larger direction to collaborate across campuses and expand network infrastructure for research here and statewide,” said Chief Information Officer Gil Gonzales. UNM IT works closely with departments and centers at UNM, and with research institutions throughout New Mexico, to provide production, commodity, and research network services.

Genomic research data generated using UNM Cancer Center next-gen sequencers will be stored on the Research Storage Consortium HP x 9000/7400 mass storage array shown here, and analyzed on CARC supercomputers.

This point-to-point connection is a first step toward establishing a campus-wide research network at UNM. The connection is based on the “Science DMZ” model formalized by the Department of Energy’s ESnet in 2010. The new link delivers a low-latency, high-bandwidth, unfiltered connection via UNM’s campus network. End-to-end jumbo frames and tuned servers at the source and destination enable large-packet transfers and higher data transfer speeds. A firewall that HSC IT established at the CRF isolates all sequencer traffic from the Health Sciences Center production network.
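
One way to see why jumbo frames help: every Ethernet frame carries a fixed amount of header overhead, so larger frames spend a larger share of the wire on payload and require far fewer frames per file. The Python sketch below is illustrative only. It assumes MTUs of 1500 bytes (standard) and 9000 bytes (a common jumbo-frame setting), TCP over IPv4 without options, and the usual 38 bytes of per-frame Ethernet overhead. The actual MTU and protocol stack on the UNM link are not stated here, and the function names are hypothetical.

```python
# Assumed per-frame costs (not taken from the article):
# preamble + start delimiter (8), Ethernet header (14), frame check (4), interframe gap (12)
ETHERNET_OVERHEAD_BYTES = 8 + 14 + 4 + 12
# IPv4 header (20) + TCP header (20), no options
IP_TCP_HEADER_BYTES = 20 + 20


def payload_efficiency(mtu: int) -> float:
    """Fraction of on-the-wire bytes that carry application data."""
    payload = mtu - IP_TCP_HEADER_BYTES
    on_wire = mtu + ETHERNET_OVERHEAD_BYTES
    return payload / on_wire


def frames_needed(total_bytes: int, mtu: int) -> int:
    """Number of full-size frames needed to carry total_bytes of data."""
    payload = mtu - IP_TCP_HEADER_BYTES
    return -(-total_bytes // payload)  # ceiling division on integers


if __name__ == "__main__":
    one_gigabyte = 10**9
    for mtu in (1500, 9000):
        print(
            f"MTU {mtu}: efficiency {payload_efficiency(mtu):.2%}, "
            f"{frames_needed(one_gigabyte, mtu):,} frames per gigabyte"
        )
```

Under these assumptions the gain in raw wire efficiency is a few percentage points. The larger practical benefit is usually the roughly sixfold drop in frames, and therefore per-packet processing, that each end host must handle.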

“Establishing this high-performance, dedicated link from the UNM Cancer Center to the campus supercomputer center will enable cutting-edge genomic research and paves the way for ‘big data’ links to support research in other science and engineering disciplines,” said CARC Director Susan Atlas.

The new link saw ‘first light’ earlier this month, clocking a transfer rate of 1.8 terabytes per hour (about 4 Gbps), corresponding to the very high percentage of peak performance that will be necessary to support sustained transfer of terabyte-scale genomic files. The genomic data is being analyzed on the Cancer Center ‘Deepthought’ cluster housed at CARC, and on the Ulam parallel supercomputer, recently donated to CARC by the New Mexico Consortium.
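
For scale, the reported rate can be set against the link's nominal capacity. The short Python sketch below is a back-of-the-envelope calculation, not a measurement. It assumes decimal units (1 terabyte = 10^12 bytes) and treats 10 Gbps as the raw line rate, ignoring protocol, disk, and host limits, which in practice keep achievable end-to-end throughput below line rate. The function names are illustrative.

```python
def tb_per_hour_to_gbps(tb_per_hour: float) -> float:
    """Convert terabytes per hour to gigabits per second (decimal units)."""
    bits_per_hour = tb_per_hour * 1e12 * 8
    return bits_per_hour / 3600 / 1e9


def fraction_of_line_rate(tb_per_hour: float, link_gbps: float) -> float:
    """Share of the nominal link capacity used by the given transfer rate."""
    return tb_per_hour_to_gbps(tb_per_hour) / link_gbps


def hours_to_move(size_tb: float, tb_per_hour: float) -> float:
    """Time in hours to move a file of size_tb at a steady rate."""
    return size_tb / tb_per_hour


if __name__ == "__main__":
    rate = 1.8   # terabytes per hour, as reported for the first transfer
    link = 10.0  # nominal link speed in Gbps
    print(f"{rate} TB/hr is about {tb_per_hour_to_gbps(rate):.1f} Gbps")
    print(f"Share of nominal line rate: {fraction_of_line_rate(rate, link):.0%}")
    print(f"A 1 TB file takes about {hours_to_move(1.0, rate) * 60:.0f} minutes")
```

At this rate, a one-terabyte sequence file would move across campus in roughly half an hour.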

For more information, contact:
UNM IT: Gil Gonzales, email - gonzgil@unm.edu; ph. 505.277.8125
CARC: Susan R. Atlas, email - susier@unm.edu; ph. 505.277.8249
UNMCC: Dr. Cheryl L. Willman, director, UNM Cancer Center, email - CWillman@salud.unm.edu; ph. 505.272.5622 or Dr. Scott Ness, email - SNess@salud.unm.edu; ph. 505.272.9883.