Construction and Application of the LAMOST Scientific Computing Cloud Platform System

Published: 2019-01-23 20:04
[Abstract]: With advances in detectors and space technology, astronomical observation has expanded from the visible and radio bands to the whole electromagnetic spectrum, including infrared, ultraviolet, X-ray and γ-ray, giving rise to full-band astronomy. The field has now entered a new stage characterized by full-band coverage, large samples and enormous information volume. Astronomy has become a leader among the sciences in data volume: because astronomical data are both massive and fast-growing, the data produced by sky survey projects typically reach the TB or even PB scale. The Sloan Digital Sky Survey (SDSS), for example, took ten years to cover 8000 square degrees of sky and obtained about 40 TB of imaging and spectral data for roughly 10^8 stars, galaxies and quasars.

The LAMOST survey aims to observe the spectra of 10 million galaxies, 1 million quasars and 10 million stars, and will produce about ten times as much data as SDSS, so storing and processing this data is a major challenge. To meet the requirements of LAMOST, this thesis builds a scientific computing platform suited to storing and processing massive spectral data, and designs and implements a customizable cloud storage system.

The main work of this thesis is as follows:

1. A scientific computing platform based on the open-source Hadoop framework and suited to astronomical data processing was built on 24 servers at the LAMOST data processing center. It includes commonly used toolkits such as NumPy, SciPy and PyFITS. An automatic deployment package written in Python and Shell makes it quick to add or remove physical nodes and to configure load balancing.

2. Based on HDFS, a core Hadoop component, a multi-user cloud storage system was designed and implemented. It lets users create folders, upload files, download files and folders, delete files and folders, and use a recycle bin, a notepad and personal information management. The administrator role additionally has account management (creating, modifying, setting quotas for and deleting accounts), organization management and system information queries. Users can conveniently store their data and processing results on the platform.

3. The MapReduce programming model, a core component of the scientific computing platform, was studied. Building on an existing, relatively mature template matching algorithm, template matching was implemented according to the MapReduce programming specification. The improved algorithm was verified on test data using KNN and chi-square minimization, and its performance was compared between a single machine and the cluster.
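The third item can be illustrated with a minimal sketch of chi-square template matching written as a map step and a reduce step. This is not the thesis's implementation: the function names (`chi_square`, `map_spectrum`, `reduce_best`, `run_local`), the data layout and the least-squares scaling of each template are all assumptions made for illustration, and the "shuffle" is simulated in-process with pure Python rather than run on Hadoop.

```python
from collections import defaultdict


def chi_square(flux, template, sigma):
    """Chi-square between a spectrum and a template (hypothetical helper).

    The template is first multiplied by the scale factor that minimizes
    sum(((f - a * t) / s) ** 2), so only the spectral shape is compared.
    """
    num = sum(f * t / (s * s) for f, t, s in zip(flux, template, sigma))
    den = sum(t * t / (s * s) for t, s in zip(template, sigma))
    scale = num / den if den else 0.0
    return sum(((f - scale * t) / s) ** 2
               for f, t, s in zip(flux, template, sigma))


def map_spectrum(spec_id, flux, sigma, templates):
    """Map step: emit (spec_id, (chi2, label)) for every template."""
    for label, template in templates.items():
        yield spec_id, (chi_square(flux, template, sigma), label)


def reduce_best(spec_id, values):
    """Reduce step: keep the template with the smallest chi-square."""
    chi2, label = min(values)
    return spec_id, label, chi2


def run_local(spectra, templates):
    """Simulate map, shuffle (group by key) and reduce in one process."""
    grouped = defaultdict(list)
    for spec_id, (flux, sigma) in spectra.items():
        for key, value in map_spectrum(spec_id, flux, sigma, templates):
            grouped[key].append(value)
    return [reduce_best(key, values) for key, values in sorted(grouped.items())]


if __name__ == "__main__":
    # Toy data, invented for the example; real spectra have thousands of bins.
    templates = {"A": [1.0, 2.0, 3.0, 4.0], "K": [4.0, 3.0, 2.0, 1.0]}
    spectra = {
        "s1": ([2.0, 4.0, 6.0, 8.0], [1.0, 1.0, 1.0, 1.0]),
        "s2": ([4.1, 2.9, 2.0, 1.1], [0.1, 0.1, 0.1, 0.1]),
    }
    for spec_id, label, chi2 in run_local(spectra, templates):
        print(spec_id, label, round(chi2, 3))
```

Each spectrum is scored against every template independently, which is why the computation splits naturally across map tasks. On a real cluster the same two functions could be wrapped as mapper and reducer scripts, but that wiring is left out here.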
[Degree-granting institution]: Shandong University
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP333


Article ID: 2414149


Article link: http://www.sikaile.net/kejilunwen/jisuanjikexuelunwen/2414149.html

