Customized Computational Environment for Investigations and Compression of Genomic Data
|
|
Author:
|
RAVEENDRA GUDODAGI, MOHAMMED RIYAZ AHMED, R. VENKATA SIVA REDDY
|
Abstract:
|
Human genome sequence data files are large (from 30 GB to 200 GB each), and storage cost is one of the major problems faced by genomics laboratories. As a result, genomic data compression has gained considerable momentum. This situation calls for a new data compression technique that not only reduces storage requirements but also makes the process more efficient. A few attempts have been made to solve this problem, approaching it from the hardware and software domains independently. In this review we advocate a tailor-made hardware and software ecosystem that exploits the current stand-alone solutions to the fullest. Only when sophisticated software runs on state-of-the-art hardware can the unavoidable problem of large-scale storage be solved. The three major steps of genomic data compression are extraction, storage, and retrieval of the data. Hence, we propose a novel scheme based on computational optimization techniques that is designed to be efficient in all three stages.
|
Keywords:
|
genomic data compression, large-scale DNA sequencing, FASTA, FASTQ, BAM, gene identification, gene analysis
|
DOI:
|
https://doi.org/10.31838/ijpr/2020.SP2.423