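Research on Quantitative Stock Selection Strategies Based on Data Mining
Topics: quantitative investment + stock selection strategy; Source: Tianjin University of Commerce, 2017 master's thesis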
[Abstract]: In recent years, as the stock market has developed, quantitative investment techniques have attracted growing attention from investors, and China's quantitative investment system has gradually matured. As market rules have improved, the number of listed stocks and the volume of related data have kept increasing. These data are large and complex but contain much useful information, and conventional methods can no longer extract it. Data mining techniques developed in recent years can help: by analyzing and modeling the massive stock data, we can obtain the information we need. This paper discusses a quantitative stock selection model based on data mining. First, we screen the more than 3,000 A-share stocks on the Shanghai and Shenzhen markets over 2013-2015 using two conditions: (1) return on equity (ROE) is stable and no less than 10% for three consecutive years, with ST and similar stocks excluded; (2) the main business growth rate and the net profit growth rate are broadly consistent and both above 10%. After screening, 51 stocks with relatively good fundamentals remain. Second, we select 17 key financial indicators that reflect a company's profitability, solvency, growth, and related capabilities as the basis for the analysis. Because these factors overlap and are correlated, and because a model with too many explanatory variables tends to blur which ones matter most, we apply principal component analysis (PCA) to the indicators. PCA yields five mutually uncorrelated composite indicators that retain most of the information in the original data, achieving dimensionality reduction.
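The screening and dimensionality-reduction steps described in the abstract can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the thesis's own code. The function names, the `max_gap` tolerance standing in for "broadly consistent", and the random demo data are all assumptions; the abstract does not say how ROE "stability" was measured, so only the 10% floor is checked here.

```python
import numpy as np


def passes_screen(roe_by_year, revenue_growth, profit_growth, is_st,
                  floor=0.10, max_gap=0.05):
    """Return True if a stock meets both screening conditions.

    max_gap is a hypothetical tolerance for "broadly consistent" growth
    rates; the source gives no numeric definition.
    """
    if is_st:
        return False  # ST and similar stocks are excluded
    # Condition 1: ROE at or above the floor in every year supplied
    roe_ok = all(r >= floor for r in roe_by_year)
    # Condition 2: both growth rates at or above the floor and close together
    growth_ok = (revenue_growth >= floor
                 and profit_growth >= floor
                 and abs(revenue_growth - profit_growth) <= max_gap)
    return roe_ok and growth_ok


def pca_scores(X, n_components=5):
    """Standardize the columns of X and project onto the leading components.

    Returns (scores, explained), where explained[i] is the share of total
    variance carried by component i. Assumes no column of X is constant.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)         # indicator correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    explained = eigvals[order] / eigvals.sum()
    scores = Z @ eigvecs[:, order]              # uncorrelated composite indicators
    return scores, explained


# Demo on random placeholder data shaped like the abstract's setting:
# 51 screened stocks, 17 financial indicators (values are NOT real data).
rng = np.random.default_rng(0)
indicators = rng.normal(size=(51, 17))
scores, explained = pca_scores(indicators, n_components=5)
print(scores.shape, explained.round(3))
```

Running PCA on the correlation matrix rather than the covariance matrix keeps indicators measured on different scales (ratios, growth rates) from dominating the components; whether the thesis made the same choice is not stated.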
Among the many data mining algorithms, cluster analysis is particularly easy to understand and has been shown to be effective for stock selection, so this paper uses K-means clustering to study the stock selection strategy. We compare different values of K and use R to choose the best one, which turns the stock selection problem into a cluster selection problem. For our data, clustering works best at K = 5, and 7 stocks are chosen as the final selection. Historical K-line (candlestick) charts of these stocks, retrieved from the Wind platform, show that nearly all of them outperformed the broader market overall and appeared to be on an upward trend. These results suggest that the approach can serve as a reference for investors analyzing and selecting stocks.
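The comparison of K values can be illustrated with a small K-means implementation. This is a sketch in Python with NumPy, whereas the thesis used R. The abstract does not say which criterion was used to pick K, so the within-cluster sum of squares (WCSS) printed below is an assumed stand-in, and the input is random placeholder data in place of the five PCA scores.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with a single random initialization.

    Returns (labels, centers, wcss), where wcss is the within-cluster
    sum of squared distances to the assigned centers.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centers with k distinct data points
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center, shape (n, k)
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each center to the mean of its points; keep empty clusters fixed
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment and objective for the returned centers
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    wcss = d2[np.arange(len(X)), labels].sum()
    return labels, centers, wcss


# Placeholder for 51 stocks x 5 principal component scores (NOT real data)
rng = np.random.default_rng(0)
pc_scores = rng.normal(size=(51, 5))

# Compare candidate K values by their WCSS (an assumed criterion)
for k in range(2, 9):
    labels, _, wcss = kmeans(pc_scores, k)
    print(k, round(float(wcss), 2), np.bincount(labels, minlength=k))
```

Because a single random start can land in a poor local optimum, a fuller comparison would typically repeat each K with several seeds and keep the lowest WCSS before judging which K fits best.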
[Degree-granting institution]: Tianjin University of Commerce
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification numbers]: TP311.13; F832.51
Article ID: 1856257
Article link: http://www.sikaile.net/jingjilunwen/huobiyinxinglunwen/1856257.html