As the promise of genomics starts to pay off in the form of better care and innovative treatment, how will the underlying compute infrastructure that powers bioinformatics (the analysis and interpretation of genomics data) change?
That’s the question that was top of mind when I recently spoke to Professor Ioannis Xenarios, Director, and Roberto Fabbretti, IT Manager at Vital-IT Group, the team that provides bioinformatics infrastructure to the SIB Swiss Institute of Bioinformatics. SIB is an academic, non-profit foundation that includes some 60 bioinformatics research and service groups and some 700 scientists from the major Swiss schools of higher education and research institutes. Our conversation served as the basis for a customer story on our life sciences page detailing how the team works with Quantum to provide access to critical genomics data.
The scientists at SIB are leaders in applied bioinformatics—the use of genomic information to fight disease, develop personalized medicines, improve crop yields, and more. Talking with Ioannis and Roberto gave me insight into what it takes to support the cutting-edge genomic research that stands to improve our quality of life.
1. APPLIED BIOINFORMATICS HAS THE POTENTIAL TO HELP EVERYONE
With a simple blood draw from the mother at 11 weeks, we can sequence the genetic material of the fetus in utero. It’s less invasive—and much less risky than traditional amniocentesis.
We started our conversation by talking about the cutting-edge research at SIB. Ioannis and Roberto told me about a recent project that focuses on prenatal testing for conditions like Down syndrome. Today, most women undergo amniocentesis, a diagnostic test that uses a needle to remove a small amount of amniotic fluid from the uterus. However, the new type of procedure developed at SIB avoids the risk of amniocentesis by sequencing the genome of the fetus through a much safer blood draw. And this is just one small sample of how genomics research is giving rise to practical applications of the science. Personalized medicine, population genetics, the biology behind flavor perception, methods for increasing crop yield: across all of these, the work at SIB aims to improve quality of life by understanding life’s code.
2. FASTER GENOMIC SEQUENCING = MORE GENOMIC SEQUENCING
Over the last few years, sequencing has become much faster. That means we are doing more projects than ever and our data is growing very rapidly.
According to Ioannis and Roberto, dramatic declines in the cost and run times for genome sequencing have accelerated the applied bioinformatics work at SIB. More genomic sequencing means more data—and the more genomics data SIB researchers can get, the more they can learn. Advances in instrumentation and ever more powerful analysis software result in more data, too.
The six different sequencing centers managed by SIB and projects coming in from about 300 active research teams generate a vast amount of genomic data. Sequencing runs take days, and the team typically processes five separate research projects each week—at up to 30TB a week, data adds up quickly.
3. APPLIED BIOINFORMATICS RESEARCH FROM CRADLE TO GRAVE
To scale our bioinformatics efforts to support tens of thousands of patients, we need to look for cost-effective ways to preserve genomic data for 20, 30, or 40 years of time—effectively creating a view of a patient from before birth to death.
The team at SIB needs to store their data for very long periods of time. Ioannis and Roberto described SIB as a steward for the data its researchers collect. Patients participating in cancer and immunotherapy research generate large amounts of genomic data. As each patient returns for subsequent parts of the study, as often as weekly, all the previously sequenced data needs to be made easily and quickly available to researchers.
For SIB, a focus on applied bioinformatics means ensuring that the organization’s storage infrastructure is ready to support the preservation of patient sequencing and analysis data—from before birth until after death.
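To get a feel for what decades of retention could mean in capacity terms, here is a rough back-of-the-envelope sketch. It takes the "up to 30TB a week" figure and the 20, 30, and 40 year horizons mentioned above, and assumes (purely for illustration) that the weekly rate stays constant and that nothing is deleted or compressed. The function name and the decimal TB-to-PB conversion are my own choices, not anything SIB or Quantum specified.

```python
WEEKS_PER_YEAR = 52
TB_PER_PB = 1000  # decimal units; an assumption made for this illustration


def projected_archive_pb(tb_per_week: float, years: int) -> float:
    """Upper-bound archive size in PB, assuming a constant weekly intake."""
    return tb_per_week * WEEKS_PER_YEAR * years / TB_PER_PB


if __name__ == "__main__":
    # 30 TB/week is the peak rate quoted in the article, so these are ceilings.
    for years in (20, 30, 40):
        size_pb = projected_archive_pb(30, years)
        print(f"{years} years: up to {size_pb:.1f} PB")
```

Real growth will not be linear, since sequencing output and retention rules change over time, but even this simple estimate shows why a low-cost archive tier matters.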
4. MULTI-TIER STORAGE FROM QUANTUM STORNEXT KEEPS GENOMICS DATA AVAILABLE FOR STUDY
When we started looking into solutions, Quantum StorNext was the only solution that really added value to our way of working. We didn’t need to change our infrastructure, and a single full-time employee can manage the storage infrastructure. That’s a huge benefit in making sure our budget stays focused on supporting our researchers.
From an IT perspective, Ioannis and Roberto need storage to stay out of the way of research. While high-performance storage is key to processing sequencing data quickly, they don’t want to dedicate a lot of resources to manually managing the storage infrastructure as data accumulates. The team at SIB has developed a multi-tier approach based on StorNext scale-out storage that keeps active data on primary storage for analysis and then automatically moves data into the long-term archive as it ages. Over 600 users access the sequencing data locally by tapping into the network in one of the SIB-affiliated data centers, as well as remotely through a CIFS interface. The entire storage infrastructure for these users is managed by a single team member, regardless of how much genomics data researchers generate.
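The idea of keeping active data on primary storage and moving it to archive as it ages can be illustrated with a minimal sketch. This is not StorNext's policy engine or API; the function name, the tier labels, and the 30-day threshold are all hypothetical, chosen only to show the general shape of an age-based tiering rule.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: files untouched for this long become archive candidates.
DEFAULT_ARCHIVE_AFTER = timedelta(days=30)


def choose_tier(
    last_access: datetime,
    now: datetime,
    archive_after: timedelta = DEFAULT_ARCHIVE_AFTER,
) -> str:
    """Return "archive" for data that has aged past the threshold, else "primary"."""
    return "archive" if now - last_access >= archive_after else "primary"


if __name__ == "__main__":
    now = datetime(2024, 3, 1)
    print(choose_tier(datetime(2024, 2, 25), now))  # recently used
    print(choose_tier(datetime(2023, 11, 1), now))  # untouched for months
```

In a real deployment the rule would run inside the storage system rather than in a script, which is what lets a single administrator manage it regardless of data volume.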
5. EMPOWERING BIOINFORMATICIANS WITH SELF-SERVICE ACCESS TO GENOMICS DATA
If you provide researchers with the right set of tools, they push the envelope. They sequence 1,000 people, and you get a massive load of 800TB in a short few months. StorNext tiered storage helps us take data in fast, quickly move it to archive, and keep it ready so bioinformaticians can continue their work.
As SIB’s data moves from high-performance disk to archive storage, data movement is invisible to the researchers that use the data. Ioannis and Roberto explained that they try to make it as easy as possible for scientists to locate the files they depend on. Regardless of where the file resides in the data center, it still appears where the researchers expect it to be in the file system they see at their desktops. When researchers need the file, they simply double-click it—no IT support ticket necessary.
6. HOW CAN BIOINFORMATICIANS PROTECT VALUABLE GENOMICS DATA FOR DECADES?
We are dealing with some of the most valuable data sets on earth. StorNext gives us a multi-petabyte archive capability, long-term data protection, and the ability to easily roll back file versions—it’s a critically important part of that strategy.
In talking with Ioannis and Roberto, it’s clear how seriously they take their role as data stewards. Sequencing data is archived immediately, with an extra copy vaulted for an extra layer of protection. When the team designed the storage system to support SIB’s applied bioinformatics research, they made sure to choose a solution that had data integrity features built in. By doing so, they ensure researchers get data when they need it without risking read errors that could compromise research.
7. BUILD IN FLEXIBILITY TO ADAPT TO FUTURE TECHNOLOGY INNOVATIONS
The data that our researchers capture and analyze provides important answers today, but it also has the potential to be useful months or years later when new analytic applications can extract information from the same raw sequences.
The team at SIB knows that bioinformatics research continues to advance, and that each new development can result in old data generating new insights. Storage infrastructure continues to evolve as well, so it’s important to Ioannis and Roberto that the tools they choose be flexible enough to serve data up to scientists while incorporating new storage technologies. With object storage being called the future of scientific data and more organizations taking advantage of cloud, SIB needs to be ready to adapt to incorporate the latest technologies.
SWISS INSTITUTE OF BIOINFORMATICS FINDS INTELLIGENT DATA MANAGEMENT WITH QUANTUM
While every genomics organization is different, they all face the same pressure when it comes to managing vast amounts of data. How are you scaling storage alongside growth? Can scientists continue to access genomics data without manual work on behalf of IT? Is your storage infrastructure ready to incorporate new technologies? SIB may have more insight for you—read the full case study, then visit www.quantum.com/lifesciences to learn more about how Quantum can help you better manage your genomics and bioinformatics data.