

Research and Implementation of a Hadoop-Based Telecom Big Data Collection Scheme

Published: 2019-02-09 12:15
[Abstract]: ETL is a key step in building a data warehouse, and designing an ETL process that handles big data efficiently is of practical importance for improving the collection efficiency of an operations platform. The paper first briefly introduces the main data collected by a telecom operator's big data platform. To improve the efficiency of collecting massive data, it then proposes a hybrid architecture that combines Hadoop and Oracle. Next, it proposes a dynamically triggered ETL scheduling process and algorithm; compared with ETL scheduling that starts at fixed times, this approach effectively shortens the excessively long waiting times of some processes and avoids congestion caused by contention for resources. Finally, based on the system run logs of Hadoop and Oracle, it compares and analyzes the relationship between collection efficiency and data volume on the two platforms. Practice shows that the hybrid big data platform combines the complementary strengths of the two systems, effectively improves the timeliness of data collection, and achieves good results in application.
[Author affiliations]: China United Network Communications Co., Ltd., Shanghai Branch; School of Software Engineering, Tongji University
[Classification number]: TP311.13
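The abstract contrasts dynamically triggered ETL scheduling with scheduling at fixed start times but does not give the algorithm itself. The sketch below is only a generic illustration of that contrast, not the authors' method: each job starts as soon as its upstream jobs have finished (triggered), or no earlier than a preset clock time (timed). The job names, durations, arrival time, and fixed start times are all hypothetical, and the sketch does not model resource contention.

```python
from dataclasses import dataclass, field


@dataclass
class EtlJob:
    name: str
    duration: int                                   # processing time in minutes (hypothetical)
    upstream: list = field(default_factory=list)    # names of jobs that must finish first
    fixed_start: int = 0                            # clock-based start used by the timed baseline


def schedule(jobs, arrivals, timed=False):
    """Return {job name: (start, finish)} in minutes after midnight.

    `arrivals` gives the minute at which raw data lands for jobs without upstream jobs.
    With timed=False a job starts the moment its inputs are ready (triggered mode).
    With timed=True it additionally waits for its fixed clock time (timed baseline).
    The dependency graph is assumed to be acyclic; cycles are not detected.
    """
    by_name = {job.name: job for job in jobs}
    done = {}

    def resolve(name):
        if name not in done:
            job = by_name[name]
            # Inputs are ready when the slowest upstream job finishes,
            # or when the raw data arrives if there is no upstream job.
            ready = max((resolve(u)[1] for u in job.upstream),
                        default=arrivals.get(name, 0))
            start = max(ready, job.fixed_start) if timed else ready
            done[name] = (start, start + job.duration)
        return done[name]

    for job in jobs:
        resolve(job.name)
    return done


# Hypothetical three-step chain: extract, load into Hadoop, aggregate into Oracle.
jobs = [
    EtlJob("extract_cdr", duration=30, fixed_start=120),
    EtlJob("load_hdfs", duration=20, upstream=["extract_cdr"], fixed_start=240),
    EtlJob("aggregate_oracle", duration=15, upstream=["load_hdfs"], fixed_start=360),
]
arrivals = {"extract_cdr": 60}  # raw data lands at 01:00

triggered = schedule(jobs, arrivals)
timed_baseline = schedule(jobs, arrivals, timed=True)

for job in jobs:
    print(job.name, "triggered:", triggered[job.name], "timed:", timed_baseline[job.name])
```

In this toy chain the timed baseline leaves every job idle until its clock slot, while the triggered run finishes the whole chain as soon as the data allows. This is the kind of waiting time the abstract says the dynamic approach shortens.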

Article ID: 2418947


Article link: http://www.sikaile.net/kejilunwen/ruanjiangongchenglunwen/2418947.html

