Data Mining Analysis of Genetic Disease Mutations
[Abstract]: As technology has advanced and costs have fallen, genome sequencing has been applied to Mendelian genetic diseases, complex diseases, and cancer gene detection, producing a large amount of sequencing data. These data are important for studying disease pathogenesis, clinical diagnosis, and individualized treatment. The molecular pathogenesis of more than 4000 human genetic diseases remains unclear. Studies have shown that the mechanism of genetic diseases is closely related to alternative splicing, and splice sites are among the important regulatory elements of the alternative splicing mechanism. Studying the pathogenesis of genetic diseases at the splice-site level is therefore important. To address this problem, a sequence pattern mining model is used to study splice-site mutations in genetic diseases.
Cancer is among the greatest threats to human health. Identifying potential proto-oncogenes and tumor suppressor genes can improve our understanding of tumorigenesis and cancer progression and can also contribute to the development of personalized cancer therapy. Genome sequencing studies over the past few years have produced a large amount of data on somatic mutations in cancer, but interpreting this sequence information remains a major challenge. Many methods have been developed to identify driver genes according to the functional impact of the mutations the genes carry. Although some computational tools are available to predict the functional impact of mutations, their usefulness is limited. Mutations shared between genetic diseases and cancer somatic mutations have established molecular mechanisms that affect protein function. We hypothesize that genes carrying these shared mutations are cancer driver genes, and we use the overlap between genetic disease mutations and cancer somatic mutations to identify potential new cancer driver mutations.
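The overlap idea described above can be illustrated with a minimal sketch. It assumes each mutation is represented as a `(gene, protein_change)` pair and that two mutations "overlap" when the pairs are identical; the abstract does not specify the matching rule, so this is an assumption. The gene names and mutation records are made up for illustration, and the function names are hypothetical.

```python
from collections import Counter

def overlapping_mutations(disease_mutations, somatic_mutations):
    """Return mutations found in both lists.

    Assumption: a mutation is a (gene, protein_change) tuple and an
    exact match of the tuple counts as an overlap.
    """
    return set(disease_mutations) & set(somatic_mutations)

def rank_genes_by_overlap(overlap):
    """Count overlapping mutations per gene, highest count first."""
    counts = Counter(gene for gene, _ in overlap)
    return counts.most_common()

# Toy records, invented for illustration only (not data from the thesis).
disease = [("GENE_A", "p.R100W"), ("GENE_A", "p.G200S"), ("GENE_B", "p.L50P")]
somatic = [("GENE_A", "p.R100W"), ("GENE_A", "p.G200S"),
           ("GENE_B", "p.L50P"), ("GENE_C", "p.V10M")]

shared = overlapping_mutations(disease, somatic)
ranking = rank_genes_by_overlap(shared)
print(ranking)
```

Genes at the top of such a ranking would be candidates for closer inspection; a real analysis would also need to normalize for gene length and background mutation rate, which this sketch omits.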
The main work of this paper is as follows:
(1) A sequence pattern mining model is used to study mutations in the splice-site regions of genetic disease genes. The model fuses a frequent pattern mining algorithm with the PSSM (position-specific scoring matrix) algorithm. The experimental results show that the model classifies well when distinguishing genetic disease mutations from common variants. Variation in the splice-site region weakens the splice-site signal, which disrupts normal splicing and leads to disease.
(2) Cancer proto-oncogenes and tumor suppressor genes are identified using genetic disease mutations. In this study, we identified potential oncogenes and tumor suppressor genes using mutations that overlap between Mendelian disease mutations and cancer somatic mutations. Because these shared mutations have clear molecular mechanisms that affect protein function, we hypothesize that they are more likely to be cancer driver mutations. Our results show that mutations overlapping between cancer somatic mutations and pathogenic genetic disease mutations recur more frequently in cancer and are enriched in known cancer genes. We identify potential tumor suppressor genes according to the number of overlapping mutations they carry. The results suggest that genes related to ion channels, collagen, and Marfan syndrome may represent a new class of tumor suppressor genes. Then, within each specific cancer type, we identify potential proto-oncogenes based on overlapping mutations that recur at high rates and are mutually exclusive with known oncogene mutations.
In conclusion, our research suggests that new cancer genes can be found in large volumes of cancer genome sequencing data by using the overlap between genetic disease mutations and cancer somatic mutations.
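To make the PSSM component in item (1) concrete, the following is a minimal sketch of how a position-specific scoring matrix can quantify a weakened splice-site signal. It covers only the PSSM part; how the thesis fuses it with frequent pattern mining is not described in this abstract and is not reproduced here. The training sequences are illustrative donor-site-like 9-mers, and the uniform background of 0.25, the pseudocount of 1, and the function names are assumptions.

```python
import math

def build_pssm(sequences, alphabet="ACGT", pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM from equal-length aligned sequences.

    Assumptions: uniform background frequency and an additive pseudocount.
    """
    length = len(sequences[0])
    pssm = []
    for pos in range(length):
        column = [seq[pos] for seq in sequences]
        total = len(column) + pseudocount * len(alphabet)
        scores = {}
        for base in alphabet:
            freq = (column.count(base) + pseudocount) / total
            scores[base] = math.log2(freq / background)
        pssm.append(scores)
    return pssm

def score(pssm, seq):
    """Sum the per-position log-odds scores of a sequence."""
    return sum(col[base] for col, base in zip(pssm, seq))

# Illustrative aligned donor-site-like sequences (not data from the thesis).
sites = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "TAGGTAAGT"]
pssm = build_pssm(sites)

reference = "CAGGTAAGT"
mutant = "CAGATAAGT"  # G>A at the fourth position, conserved in all examples

delta = score(pssm, reference) - score(pssm, mutant)
print(f"score drop caused by the mutation: {delta:.2f}")
```

A positive score drop indicates that the mutant sequence matches the learned site pattern less well than the reference, which is one simple way to express the "weakened signal" described above.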
[Degree-granting institution]: Anhui University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: R596; TP311.13
Related master's theses (top 1)
1 Wang Changchang (王暢暢); Data Mining Analysis of Genetic Disease Mutations [D]; Anhui University; 2017
Article ID: 2231678
Article link: http://www.sikaile.net/yixuelunwen/nfm/2231678.html