Research on Instruction Set Mapping Techniques for Co-Designed X86 Emulation
Topic: emulation technology. Focus: co-design. Source: PLA Information Engineering University, master's thesis, 2012. Document type: degree thesis.
[Abstract]: Emulation technology can effectively ease the software compatibility problems caused by differences between processor architectures, and it matters greatly for the development of RISC processors, domestic CPUs in particular. Because X86 processors hold a large market share and have a rich software base, a domestic CPU that is to reach the market and be put to practical use must remain compatible with software written for the X86 platform. Co-designed X86 emulation combines the strengths of software and hardware, achieves good emulation efficiency, and has become a trend in the development of X86 emulation. Building on a broad survey of current X86 emulation research, this thesis analyzes the performance bottlenecks of current X86 emulation in depth, examines the key problems in instruction set mapping for co-designed X86 emulation, and designs an instruction translation unit and a translation cache unit.
To handle the variable length and varied formats of X86 instructions, the thesis proposes a two-stage decoding mechanism based on state splitting. In contrast to byte-by-byte decoding, it divides decoding into a length-decoding stage and an operand-decoding stage, which effectively reduces the clock-cycle overhead of X86 instruction decoding. Drawing on the microprogramming approach of the Pentium and the micro-operation design of QEMU, the thesis designs Trans_lib, an instruction set mapping table based on LUT (lookup table) technology, and completes translation through an addressing entry and a function entry, which effectively reduces the time and power overhead of instruction translation. It also proposes HTCM, a hardware translation cache management strategy that divides the translation cache proportionally into a hot code region and an ordinary code region, managed with FIFO and full-flush policies respectively. This effectively reduces cache fragmentation, keeps hot code resident in the cache for as long as possible, and raises the translation cache hit rate. Finally, the instruction translation unit and the translation cache unit are designed and implemented in Verilog HDL, and their main ports and functions are briefly described.
Verification and analysis show that the instruction translation unit and the translation cache unit can successfully map the X86 instruction set to the Alpha instruction set, and that several optimizations improve the performance of the mapping. The two-stage decoding mechanism improves performance by up to 15.79% over byte-by-byte decoding. The hit rate of the HTCM strategy improves by up to 17.43% over full flush and by up to 9.27% over FIFO.
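The partitioned cache management described in the abstract can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the thesis's Verilog implementation. The abstract states only that the cache is split proportionally into a hot code region managed by FIFO and an ordinary code region managed by full flush. The class name `PartitionedTranslationCache`, the entry-count capacities, the `hot_ratio` parameter, and the rule that promotes a block to the hot region after `hot_threshold` hits are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class PartitionedTranslationCache:
    """Toy model: hot region evicts FIFO, ordinary region is flushed whole.

    All names and the promotion rule are illustrative assumptions.
    """

    def __init__(self, capacity=8, hot_ratio=0.5, hot_threshold=2):
        if capacity < 2:
            raise ValueError("capacity must allow at least one entry per region")
        # Split the cache by ratio, keeping at least one entry in each region.
        self.hot_capacity = min(capacity - 1, max(1, int(capacity * hot_ratio)))
        self.normal_capacity = capacity - self.hot_capacity
        self.hot_threshold = hot_threshold  # assumed promotion rule
        self.hot = OrderedDict()  # insertion order doubles as FIFO order
        self.normal = {}          # pc -> [translated_code, hit_count]
        self.hits = 0
        self.misses = 0

    def lookup(self, pc, translate):
        """Return the translation for pc, translating on a miss."""
        if pc in self.hot:
            self.hits += 1
            return self.hot[pc]  # plain read: FIFO order is not changed
        if pc in self.normal:
            self.hits += 1
            entry = self.normal[pc]
            entry[1] += 1
            if entry[1] >= self.hot_threshold:
                self._promote(pc)
            return entry[0]
        self.misses += 1
        code = translate(pc)
        if len(self.normal) >= self.normal_capacity:
            self.normal.clear()  # full flush of the ordinary region
        self.normal[pc] = [code, 0]
        return code

    def _promote(self, pc):
        """Move a block into the hot region, evicting the oldest if full."""
        code, _ = self.normal.pop(pc)
        if len(self.hot) >= self.hot_capacity:
            self.hot.popitem(last=False)  # FIFO eviction
        self.hot[pc] = code


def fake_translate(pc):
    # Stand-in for the real X86-to-Alpha translation step.
    return f"alpha_block_{pc:#x}"


cache = PartitionedTranslationCache(capacity=4, hot_ratio=0.5, hot_threshold=2)
for pc in [0x10, 0x10, 0x10, 0x20, 0x30, 0x40, 0x10]:
    cache.lookup(pc, fake_translate)
print(cache.hits, cache.misses)  # 3 4
```

In this trace the block at `0x10` is promoted before the ordinary region is flushed, so its final access still hits. That is the behavior the abstract attributes to keeping hot code resident longer.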
[Degree-granting institution]: PLA Information Engineering University
[Degree level]: Master's
[Year conferred]: 2012
[Classification number]: TP332