September 2003


DNA research requires high transmission & storage capacity

Biomedical research is one of the leading-edge areas of research and advanced education that increasingly require the ability to send, receive and share massive amounts of data at very high speeds.

For example, genome research creates so much data that much of it is discarded rather than stored, because researchers don't have adequate storage or transmission capacity.

ORION's recent partnership with the SHARCNET project will not only move more data faster; it will also increase SHARCNET's computational speed and storage capacity.

"ORION gives SHARCNET the underlying communication capability to create large shared storage across its member institutions," explains Mike Bauer, Chair of the Department of Computer Science at the University of Western Ontario. "Once in place, it will provide users much greater flexibility in how data is stored and shared."

Researchers will be able to choose to work with gigabytes of information from a remote location, or choose to move that information to a more convenient location if it better suits the type of research they're doing.

"Without the communication infrastructure that ORION provides, this type of storage development would not be possible," adds Bauer.

ORION will also greatly enhance SHARCNET's already superior computational speed. For example, comparing genome information, such as nucleotide sequences, involves very complex computational analysis and comparison of vast amounts of data. Understanding how genes work or how they interact at an almost atomic level is the kind of research that leads to pharmaceutical discoveries or medical breakthroughs.

"Before ORION, the process could take days," says Bauer. "With ORION's speed, the same computations will now take mere hours."

DNA research inherently requires massive amounts of data to be created, transferred and accessed beyond the confines of SHARCNET.

"ORION's high-bandwidth networking will provide the fast, efficient access to and transfer of the information needed for the collaborative nature of the research I do," says Brian Golding, Professor of Biology at McMaster University.

Sharing microarray data is one example of research undertaken at McMaster that requires high bandwidth to perform at peak efficiency.

Microarray research involves using tiny pins to spot a bit of DNA onto a slide. RNA with a fluorescent marker is then hybridized to the DNA on the slide. RNA from healthy tissue can be added with one fluorescent marker, and RNA from cancerous tissue can be added with a different fluorescent marker. Differences in the intensities of the two markers show which genes are expressed more in one tissue source than in the other.
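The comparison of the two marker intensities can be illustrated with a short Python sketch. It is a simplified illustration only, not the method used at McMaster: the gene names, the intensity values and the two-fold threshold are all hypothetical, and a real analysis would also include steps such as background correction and normalization.

```python
import math


def log2_ratio(cancerous, healthy):
    """Log2 of the cancerous-to-healthy intensity ratio for one spot."""
    return math.log2(cancerous / healthy)


def classify_spots(spots, threshold=1.0):
    """Label each spot by which tissue shows the stronger marker signal.

    `spots` maps a spot name to (healthy_intensity, cancerous_intensity).
    `threshold` is in log2 units; 1.0 means a two-fold difference
    (an assumed cutoff chosen for illustration).
    """
    results = {}
    for name, (healthy, cancerous) in spots.items():
        ratio = log2_ratio(cancerous, healthy)
        if ratio >= threshold:
            label = "higher in cancerous tissue"
        elif ratio <= -threshold:
            label = "higher in healthy tissue"
        else:
            label = "similar in both"
        results[name] = (ratio, label)
    return results


# Hypothetical intensities: (healthy marker, cancerous marker)
spots = {
    "gene_a": (200.0, 800.0),
    "gene_b": (900.0, 300.0),
    "gene_c": (500.0, 520.0),
}

results = classify_spots(spots)
for name, (ratio, label) in results.items():
    print(f"{name}: log2 ratio {ratio:+.2f}, {label}")
```

A logarithmic ratio is used here so that a gene expressed four times more in one tissue and a gene expressed four times less sit at the same distance from zero, on opposite sides.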

"Thousands of these spots are made on just one slide and over half a million data points can be accumulated in a single afternoon," explains Golding. "The immediate result of this research is a massive amount of data on gene expression that describes a snapshot of the activity inside the cells."

Because identifying and mapping the genome structure of all life is such a gargantuan task, a high level of cooperation is required among life scientists.

This cooperation requires that research data be made available to researchers and students around the world. One example is the National Center for Biotechnology Information in Bethesda, Maryland, which hosts a database of all DNA sequences that are currently known. It stores and provides everyone with access to over 30 billion nucleotides. The data can be searched or downloaded at any time by anyone.

Some DNA research, however, involves highly sensitive data that must be kept private.

"ORION provides a secure method of transferring hundreds of gigabytes of very sensitive data and applications very quickly between collaborating researchers," says University of Guelph's Prof. Deb Stacey, Department of Computing and Information Science. "Previous methods of transferring data, including the use of couriers to take CDs from one institution to another, were risky."

Digitized medical and biological data other than DNA research also involves terabytes of information, each terabyte being equivalent to twenty thousand four-drawer filing cabinets of information.

"Some institutions have excellent libraries of information, but the only way researchers from other institutions are able to use the information is to physically go to the host institution," observes Stacey.

"For example, a research project in mammography images (especially one that studies the time-based images or images taken over a period of time) may require one to two terabytes of data," explains Stacey. "The images have to be of a high enough resolution to be useful and to retain confidence.

"With ORION, data has become dramatically more accessible."

