

A Study of Corpus Structure and Its Application

Published: 2018-05-06 14:43

  Topic: corpus + structure; Source: Jiangnan University, master's thesis, 2012


[Abstract]: Grounded in authentic language data, corpus linguistics analyzes language from a macroscopic perspective by probabilistic means, and it is increasingly favored by language researchers. The corpus is the foundation of corpus linguistics, so building a comprehensive and representative corpus is extremely important for the validity of research results. Corpus construction must weigh many factors, such as the size of the corpus and the sources and types of its texts.
Whether a corpus is representative, that is, whether its texts fully represent the domain under study, reflects whether its structure is sound. Corpus structure involves two things: the criteria by which texts are stratified, and the proportion each stratum occupies in the corpus. This thesis surveys the structure of major Western corpora and, drawing on systemic functional linguistics, tries to answer what underlying regularities govern their structural arrangement. Halliday, the founder of systemic functional linguistics, gave a systematic account of language. He holds that language as a whole is a continuum with spoken and written language at its two ends. He points out in particular that registers in the middle of the continuum have features of both speech and writing, and that toward either end they shade into typical spoken or typical written language. Continuum theory rejects the claim that either writing or speech is primary, and describes language comprehensively and dialectically in terms of register. With the help of this theory, the author finds that the structures of the SEU, Brown, LOB, and ICE-GB corpora take register fully into account, the SEU corpus most notably. SEU uses three top-level divisions (written origin, scripted to be spoken, and spoken origin), so that register moves step by step from written to spoken. The scripted to be spoken stratum includes interviews, play scripts, and speech drafts, and it captures precisely the continuity between speech and writing. The Brown and LOB corpora contain no spoken material, and for that very reason their classification of written language serves as a model. Following a diagram of the continuum, the thesis maps the results of this analysis and the proportion of each main stratum onto its coordinates and obtains a fairly symmetrical figure, which indicates that these corpora are reasonably representative. Register, however, is not the only basis for stratification: corpora such as BNC, LLELC, and MCLC instead stratify by discipline, for example applied science, social science, and arts. Further investigation shows that the two kinds of criteria are not isolated from each other. In ICE-GB, the subcategories under the learned and the popular categories reuse social sciences and natural sciences, which confirms that this corpus applies both stratification schemes at once.
These two schemes are the more common strategies for arranging a corpus. Without confining itself to them, the study takes the structure of a self-built corpus of knowledge related to the English major as an example and examines, from a practical starting point, how that structure can be built. It first uses internship logs written by English majors to analyze the industries the students worked in and the purposes for which they used English, in order to characterize society's demand for knowledge related to the English major. The study draws on the internship logs of 102 graduates of the class of 2006. The tally shows that 34 of them did not work in English-related occupations. Judging by the main focus of each log, the internships of the remaining 68 students chiefly involved foreign trade English, English teaching, English translation, secretarial English, and mechanical English. The share of each industry is computed from the number of students who actually worked in it, and these shares give the weight of each stratum. Drawing on the discipline-based scheme and combining it with the industry statistics, the thesis proposes a preliminary reference stratification into foreign trade, machinery, computing, teaching, and so on. Within each stratum, taking foreign trade English as the example, it applies the findings of the continuum-based structural analysis to explore tentatively how texts can be further subdivided and collected.
With its focus on the structural analysis of major Western corpora, the thesis discusses corpus stratification through a worked example. Because time and resources were limited, much remains to be improved. The logs of only 102 students cannot adequately represent the full range of knowledge related to the English major. For example, it may be that none of the students did English work related to law, but that does not mean legal English falls outside the English major. Further research is therefore still needed. Even so, the main aim of the thesis is to open up a new line of thinking and to offer a constructive reference for self-built corpora, especially for their structural arrangement. As small corpora attract growing attention from language professionals, the author hopes the thesis will contribute to the theory of corpus construction.
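The proportional weighting described in the abstract (each industry's share of the English-related internship logs becomes the weight of its stratum) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's actual procedure. The abstract reports 102 logs with 34 excluded but gives no per-industry counts, so the counts in HYPOTHETICAL_COUNTS are invented placeholders that merely sum to the remaining 68. The target corpus size of 1,000,000 tokens and the largest-remainder rounding are likewise assumptions introduced here.

```python
# Hypothetical per-industry log counts. The abstract gives only the totals
# (102 logs, 34 not English-related), so these numbers are placeholders
# chosen to sum to the remaining 68 logs.
HYPOTHETICAL_COUNTS = {
    "foreign trade": 30,
    "teaching": 20,
    "translation": 10,
    "secretarial": 5,
    "mechanical": 3,
}


def stratum_proportions(counts):
    """Return each stratum's share of all counted logs."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def allocate_tokens(counts, corpus_size):
    """Split corpus_size across strata in proportion to counts.

    Uses integer arithmetic with the largest-remainder method so the
    allocations always sum exactly to corpus_size.
    """
    total = sum(counts.values())
    # Floor of each stratum's exact proportional share.
    alloc = {name: n * corpus_size // total for name, n in counts.items()}
    # What each stratum lost to flooring, in units of 1/total tokens.
    remainders = {name: n * corpus_size % total for name, n in counts.items()}
    leftover = corpus_size - sum(alloc.values())
    # Hand the leftover tokens to the strata with the largest remainders.
    ranked = sorted(remainders, key=remainders.get, reverse=True)
    for name in ranked[:leftover]:
        alloc[name] += 1
    return alloc


proportions = stratum_proportions(HYPOTHETICAL_COUNTS)
allocation = allocate_tokens(HYPOTHETICAL_COUNTS, 1_000_000)

for name in HYPOTHETICAL_COUNTS:
    print(f"{name:15s} {proportions[name]:6.1%} {allocation[name]:>9,d} tokens")
```

Working in integers avoids the rounding drift that can appear when floating-point shares are multiplied by a large corpus size, and the same functions apply unchanged once real per-industry counts are substituted for the placeholders.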

[Degree-granting institution]: Jiangnan University
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: H08
