
Storage and Compression of DNA Sequence Alignment Results

Published: 2018-04-08 17:00

  Topic: DNA sequence alignment results  Focus: storage  Source: Fudan University, 2012 master's thesis


[Abstract]: As research in bioinformatics, molecular biology, and related disciplines has deepened, and with the completion of the Human Genome Project, a growing number of genes from humans and other model organisms have been sequenced. Sequence alignment is a method for processing sequencing results; it reveals structural, functional, and evolutionary relationships among biological sequences and is a foundation of bioinformatics.

As these sequencing projects proceed, massive amounts of DNA sequence data are generated every day, and aligning those data produces alignment result data in turn. The rapid development of storage devices has eased this data growth to some extent. As alignment research deepens, however, simply adding hardware can no longer keep pace with the rapid growth of DNA alignment result data, and the cost of storing and using these data will eventually become unaffordable.

Next-generation sequencing (NGS) platforms have greatly reduced the cost of sequencing, making gene sequence analysis feasible in practical medical settings. From the standpoint of both storage and application, therefore, compressing sequence alignment results plays an important role in the storage, management, and transmission of DNA data. Compression of DNA sequence data has attracted wide attention from researchers in China and abroad, but few have studied how to compress alignment results in real medical scenarios. Storing gene alignment results still faces major challenges.

In this thesis, we start from the needs of medical applications, design a storage structure that meets them, and on that basis design two different compression strategies to reduce storage cost. Experimental data show that, as coverage increases, our compression schemes slightly outperform standard RAR and ZIP compression. Based on these methods, we built a "DNA Sequence Alignment Result Storage and Compression System", which stores massive DNA alignment results and provides a graphical interface.
[Degree-granting institution]: Fudan University
[Degree level]: Master's
[Year degree awarded]: 2012
[Classification number]: TP333


Article ID: 1722517


Article link: http://www.sikaile.net/kejilunwen/jisuanjikexuelunwen/1722517.html

