Research on Migration Efficiency of Heterogeneous Seismic Data Processing Clusters

Published: 2019-02-24 17:21

[Abstract]: Wave-equation-based prestack depth migration produces high-quality migrated images of geologically complex areas and is an important tool in oil and gas exploration. However, its enormous data volume and computational demand have limited its practical use. CPU-GPU heterogeneous clusters offer large advantages in performance, power consumption, cost, and heat dissipation, and so create an opportunity to make prestack depth migration widely usable. A CPU-GPU heterogeneous system nonetheless differs greatly from the uniform, serial, and simple traditional CPU model in system composition, architecture, and programming model, so using heterogeneous computing resources efficiently raises many problems and challenges. This thesis first analyzes qualitatively, and measures quantitatively, the effects of non-uniform access and bus contention. The results show that poorly chosen data paths, together with bus contention and saturation, significantly degrade communication performance and can become the bottleneck of I/O-intensive migration processing. Several strategies for avoiding these bottlenecks are then discussed and tested with numerical methods commonly used in migration processing; the optimized applications improve in both performance and stability. To exploit the computing potential of the GPU fully, the thesis dissects the CUDA model and argues that viewing the GPU as a multithreaded SIMD processor is more helpful for understanding its nature and for developing efficient applications. For the Fermi architecture, microbenchmarks are used to probe some microarchitectural characteristics in support of deep performance optimization. Because the fast Fourier transform is widely used in migration processing, the thesis then analyzes an already optimized GPU fast Fourier transform routine in depth on the basis of the Fermi microarchitecture. Data prefetching and instruction adjustment increase instruction-level parallelism, and although the number of threads decreases, performance still improves by 12%.
Because SIMD branch divergence can degrade performance significantly, the thesis proposes two software-level optimization strategies, "aggregation" and "extraction". Tests show that, for suitable branches, "aggregation" raises the proportion of useful results in each SIMD execution step and "extraction" shortens the divergent SIMD section, improving performance. Finally, tests on actual migration processing show that sound data-path planning yields the most significant speedup, that in-depth optimization of hot GPU kernels also brings some improvement, and that SIMD branch optimization contributes comparatively little to migration speedup.
[Degree-granting institution]: China University of Petroleum (East China)
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: P631.44; TP332
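
The abstract does not say how the "aggregation" strategy is implemented. As a rough illustration of the underlying idea only, the toy model below counts SIMD passes for an if/else before and after items that take the same branch are grouped together. Everything in it is an assumption made for illustration: the function names `simd_passes`, `lane_utilization`, and `aggregate` are hypothetical, the group width of 32 is borrowed from an NVIDIA warp, and the cost model (one pass per distinct branch outcome in a group) ignores everything else a real GPU does.

```python
from typing import Callable, List, Sequence

WARP_SIZE = 32  # assumed SIMD group width; 32 matches an NVIDIA warp


def simd_passes(items: Sequence[int],
                takes_branch: Callable[[int], bool],
                warp_size: int = WARP_SIZE) -> int:
    """Count execution passes for an if/else over `items` in this toy model.

    Each run of `warp_size` consecutive items executes in lockstep. A group
    whose lanes all agree runs one path (1 pass); a group whose lanes
    disagree runs both paths with some lanes masked off (2 passes).
    """
    passes = 0
    for start in range(0, len(items), warp_size):
        group = items[start:start + warp_size]
        outcomes = {bool(takes_branch(x)) for x in group}
        passes += len(outcomes)
    return passes


def lane_utilization(items: Sequence[int],
                     takes_branch: Callable[[int], bool],
                     warp_size: int = WARP_SIZE) -> float:
    """Fraction of lane slots that produce a useful result."""
    passes = simd_passes(items, takes_branch, warp_size)
    if passes == 0:
        return 0.0
    return len(items) / (passes * warp_size)


def aggregate(items: Sequence[int],
              takes_branch: Callable[[int], bool]) -> List[int]:
    """Stable partition: items taking the branch first, the rest after."""
    taken = [x for x in items if takes_branch(x)]
    not_taken = [x for x in items if not takes_branch(x)]
    return taken + not_taken


if __name__ == "__main__":
    def is_odd(x: int) -> bool:
        return x % 2 == 1

    data = list(range(256))              # odd and even items interleaved
    grouped = aggregate(data, is_odd)    # same items, grouped by outcome
    print(simd_passes(data, is_odd), lane_utilization(data, is_odd))
    print(simd_passes(grouped, is_odd), lane_utilization(grouped, is_odd))
```

In this model the interleaved input makes every group execute both paths, while the grouped input lets each group execute only one, which is one way to read the claim that aggregation raises the share of useful results per SIMD step. On real hardware the reordering itself has a cost, which is consistent with the abstract limiting the benefit to suitable branches.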

Article ID: 2429761

Article link: http://www.sikaile.net/kejilunwen/jisuanjikexuelunwen/2429761.html
