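Research on a Public Opinion Trend Prediction Method for Civil Aviation Events Based on Text Sentiment Orientation Analysis
Topic: online public opinion. Focus: spam comment identification. Source: Civil Aviation University of China (中國民航大學), master's thesis, 2017. Document type: degree thesis.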
[Abstract]: With the rapid development of China's civil aviation industry, public attention to the industry keeps growing. New media such as Weibo and online forums bring civil aviation public opinion events under intense scrutiny. Netizens use these platforms to comment on civil aviation events, but some of those comments are off-topic or even false spam comments, so spam comments must be handled before a civil aviation event is analyzed. In addition, the sentiment orientation of current comments influences netizens' future attitudes toward the same event. Accurate and objective sentiment analysis of the comments, together with a prediction of how that sentiment will develop, is therefore important for assessing the trend of public opinion on a civil aviation event and responding to it in advance.
For spam comment identification and filtering, this thesis defines six indicators as spam features, including whether a comment appears repeatedly and how many times government departments are mentioned in it. The information gain algorithm is used to compute feature weights, and a support vector machine optimized by particle swarm optimization (PSO-SVM) is used to identify and filter spam comments. Because obtaining a prediction indicator is a prerequisite for predicting the sentiment trend of online public opinion, the thesis proposes the time series of comment sentiment orientation values as the prediction indicator, in contrast to the purely popularity-based indicators used previously (for example, attention level, number of comment replies, and number of reposts). Because sentiment orientation values are nonlinear and random, a relevance vector machine (RVM) model is used for trend prediction to improve accuracy.
Experiments were designed to analyze and verify these results. For spam comment identification and filtering, the experiments examine how the number of features and the choice of features affect identification, and the results show the importance of choosing suitable features. For sentiment trend prediction, the predictions of the RVM model, an Elman neural network, and a BP neural network are compared, with accuracy evaluated by mean absolute error (MAE) and root mean square error (RMSE). The comparison shows that the RVM predicts better than the other two models and reflects netizens' sentiment trend toward public opinion events more accurately. The thesis concludes that its study of spam comment identification and sentiment trend prediction is meaningful for civil aviation public opinion analysis.
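The feature weighting and evaluation steps named in the abstract can be sketched in a few lines. This is a minimal illustration, not the thesis's implementation: the function names (`entropy`, `information_gain`, `mae`, `rmse`) and the toy data are assumptions introduced here, and the features are assumed to be discrete.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Information gain of one discrete feature with respect to the labels.

    Computed as H(labels) minus the weighted entropy of the labels
    within each group of samples sharing the same feature value.
    """
    total = len(labels)
    conditional = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        conditional += len(subset) / total * entropy(subset)
    return entropy(labels) - conditional


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical toy data: 1 = comment is a duplicate / is spam, 0 = otherwise.
is_duplicate = [1, 1, 0, 0]
is_spam = [1, 1, 0, 0]
weight = information_gain(is_duplicate, is_spam)

# Hypothetical sentiment orientation values and model predictions.
observed = [1.0, 2.0, 3.0]
forecast = [1.0, 2.0, 5.0]
errors = (mae(observed, forecast), rmse(observed, forecast))
```

A feature with higher information gain separates spam from non-spam comments more cleanly, so it would receive a larger weight. Lower MAE and RMSE indicate a forecast closer to the observed sentiment series, and RMSE penalizes large individual errors more heavily than MAE.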
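[Degree-granting institution]: Civil Aviation University of China (中國民航大學)
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.1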