High-Throughput Screening of lncRNAs That Mediate Chromatin Interactions Using Hi-C
Published: 2018-07-17 21:34
[Abstract]: The nucleus is unique to eukaryotes and is the largest organelle. Chromatin is the carrier of genetic and epigenetic information and is the largest biological macromolecule. About 2 m of chromatin is folded into a nucleus less than 10 μm in diameter. The development of chromosome conformation capture (3C) and its derivative techniques (4C, 5C, Hi-C) has advanced the study of nuclear architecture. These techniques have revealed that proteins such as CTCF and Cohesin play important roles in chromatin folding and interaction. Recently, some lncRNAs have also been found to take part in chromatin interactions. By interacting with DNA, proteins, and even RNA itself, lncRNAs such as XIST and Firre participate in chromatin interactions and regulate the formation of nuclear structure. Because most of the genome encodes ncRNAs, we hypothesized that many lncRNAs may be involved in regulating nuclear structure. To date, however, there has been no systematic report on lncRNA involvement in chromatin interactions.

To screen genome-wide for lncRNAs that may participate in chromatin interactions, we established a high-throughput screening method based on Hi-C, which works by comparing genome-wide chromatin interactions before and after RNase treatment. Building high-quality Hi-C libraries was the key to the whole project. Hi-C is a multi-step, time-consuming molecular biology technique that requires many reagents and instruments. The technique is not yet fully mature, and its reproducibility has so far been neither good nor stable. By optimizing the core steps of the Hi-C experiment, including crosslinking, restriction digestion, restriction enzyme inactivation, and in situ ligation, we established a mature and stable Hi-C library construction workflow. After amplification of the initial Hi-C libraries, two sets of biological-replicate Hi-C libraries from GM12878 cells, before and after RNase treatment, were subjected to high-throughput sequencing.

Preliminary bioinformatics analysis was used to check library quality and the reproducibility of the biological replicates. On average, the raw data showed a mapping rate of about 90% and a pairing rate of 72%. In addition, after removal of self-ligated fragments and dangling ends, more than 96% valid interaction pairs were obtained. Correlation analysis showed that both bin coverage and all bin pairs were very strongly correlated between the two biological replicates. Together, these results further demonstrate that the optimized Hi-C library construction workflow is reliable and stable.

With high-quality, highly reproducible libraries in hand, we compared the libraries from samples before and after RNase treatment. The comparison showed that many interactions were weakened or even lost after RNase treatment. During library construction we also found that, for the same number of cells, the untreated group yielded 1.58 times as much initial Hi-C library (interaction fragments) as the RNase-treated group. After subtracting the RNase-treated group from the untreated group across the two biological replicates, we selected the top 10,000 positive differential interaction pairs for further analysis. We found 4081 lncRNA-encoding genes near the chromatin interaction sites that were lost or weakened after RNase treatment. GO annotation showed that the genes near these 4081 lncRNA-encoding genes are mainly associated with biological structures or functions such as the cell membrane, the Pleckstrin homology-like domain, ammonium ion transport, and alternative splicing. The 4081 lncRNAs identified are candidates for involvement in chromatin interactions and provide a good basis for further study of their molecular mechanisms and functions.
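The differential step described in the abstract (subtracting the RNase-treated contacts from the untreated contacts and keeping the largest positive differences) can be illustrated with a minimal sketch. This is not the thesis's actual pipeline: the function names, the dictionary-of-bin-pairs representation, and the contacts-per-million depth normalization are all assumptions made for illustration, and the abstract does not say how the two biological replicates were combined.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a contact map as {(bin_i, bin_j): read count}.
BinPair = Tuple[int, int]


def normalize(contacts: Dict[BinPair, float]) -> Dict[BinPair, float]:
    """Scale counts to contacts per million (an assumed normalization,
    used here so libraries of different depth can be compared)."""
    total = sum(contacts.values())
    if total == 0:
        return {pair: 0.0 for pair in contacts}
    return {pair: count * 1e6 / total for pair, count in contacts.items()}


def top_lost_interactions(
    control: Dict[BinPair, float],
    rnase: Dict[BinPair, float],
    top_n: int = 10000,
) -> List[Tuple[BinPair, float]]:
    """Return the top_n bin pairs whose normalized signal is higher in the
    untreated (control) map than in the RNase-treated map."""
    ctrl = normalize(control)
    treated = normalize(rnase)
    positive = []
    for pair, value in ctrl.items():
        diff = value - treated.get(pair, 0.0)
        if diff > 0:  # keep only interactions weakened or lost after RNase
            positive.append((pair, diff))
    # Largest difference first; ties broken by bin pair for determinism.
    positive.sort(key=lambda item: (-item[1], item[0]))
    return positive[:top_n]


if __name__ == "__main__":
    # Toy data, invented purely to exercise the function.
    control_map = {(0, 1): 30, (0, 2): 10, (1, 2): 60}
    rnase_map = {(0, 1): 10, (0, 2): 20, (1, 2): 20}
    for pair, diff in top_lost_interactions(control_map, rnase_map, top_n=2):
        print(pair, diff)
```

In a real analysis the ranked bin pairs would then be intersected with lncRNA gene annotations to find lncRNA loci near the affected interaction sites; that step is omitted here because the abstract does not specify the distance cutoff or annotation source.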
[Degree-granting institution]: Liaocheng University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: Q78
Article ID: 2131005
Article link: http://www.sikaile.net/shoufeilunwen/benkebiyelunwen/2131005.html