Research and Implementation of Evaluation Object Phrase Recognition in Sentiment Analysis
[Abstract]: With the rapid growth of the mobile Internet in recent years, Weibo has risen quickly as a new social media platform, and its users generate a huge volume of social data every day. As a main carrier of mobile social networking, Weibo is rich in content and its data is highly valuable. Identifying Weibo data and analyzing its sentiment can provide an important reference for government public-opinion monitoring, enterprise advertising, user behavior prediction, and decision-making. Sentiment analysis of Weibo consists mainly of two elements: evaluation object phrase recognition and sentiment orientation analysis. Because Weibo content is scattered, identifying the evaluation object of a post has become both a focus and a difficulty in Weibo sentiment analysis. Research shows that the recognition of out-of-vocabulary (unregistered) words is one of the important factors behind the low recognition rate of Chinese evaluation object phrases. It is therefore meaningful to study a method for extracting Weibo evaluation object phrases that builds on out-of-vocabulary word recognition. This thesis designs the feature vectors of the out-of-vocabulary word recognition model from three aspects (feature selection, classifier selection, and feature template selection) to improve the recognition rate, and then applies the algorithm to evaluation object phrase recognition. The method's effectiveness is verified on a real Weibo corpus. The main work of this thesis is as follows:
1. First, statistical features based on the text's word sequence, cohesion, and left and right degrees of freedom are proposed as features for out-of-vocabulary word recognition. Five classification algorithms (naive Bayes, decision tree, logistic regression, support vector machine (SVM), and artificial neural network (ANN)) are then used to identify out-of-vocabulary words, and their recognition results are compared. The artificial neural network, which gave good recognition results for out-of-vocabulary words, is selected as the decision model.
2. Next, the three BIO tags are introduced, and conditional random fields (CRFs) are used to recast evaluation object phrase recognition as a sequence labeling problem. When identifying evaluation object phrases, an appropriate feature template is selected, and the out-of-vocabulary words produced by the trained artificial neural network are applied to the recognition process.
3. One day of Sina Weibo data is chosen as the data source. After manual annotation, evaluation object phrase recognition experiments are carried out. The results show that adding the automatically recognized out-of-vocabulary words from Weibo text to the CRF-based evaluation object phrase recognition algorithm significantly improves the accuracy and recall of evaluation object phrase extraction.
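The statistical features named in the abstract (cohesion and left and right degrees of freedom) can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so this sketch assumes two common choices from the new-word-discovery literature: cohesion as the minimum pointwise mutual information over all two-way splits of a candidate string, and the left and right degrees of freedom as the entropy of the characters adjacent to the candidate. The function names `neighbour_entropy` and `candidate_features` are hypothetical, and normalizing every count by the text length is a simplification.

```python
import math
from collections import Counter


def neighbour_entropy(neighbours):
    """Shannon entropy (natural log) of a Counter of adjacent characters."""
    total = sum(neighbours.values())
    if total == 0:
        return 0.0
    return sum(-(c / total) * math.log(c / total) for c in neighbours.values())


def count_occurrences(text, gram):
    """Count (possibly overlapping) occurrences of gram in text."""
    size = len(gram)
    return sum(1 for i in range(len(text) - size + 1) if text[i:i + size] == gram)


def candidate_features(text, candidate):
    """Return frequency, cohesion and left/right entropy for one candidate string.

    Assumed definitions (not taken from the thesis):
      cohesion      = min over splits of log(p(candidate) / (p(left) * p(right)))
      left/right    = entropy of the characters directly before/after the candidate
    """
    if len(candidate) < 2:
        raise ValueError("candidate must have at least two characters")
    n, size = len(text), len(candidate)
    left, right = Counter(), Counter()
    freq = 0
    for i in range(n - size + 1):
        if text[i:i + size] == candidate:
            freq += 1
            if i > 0:
                left[text[i - 1]] += 1
            if i + size < n:
                right[text[i + size]] += 1
    if freq == 0:
        raise ValueError("candidate does not occur in text")

    def prob(gram):
        # Simplification: every n-gram count is normalized by the text length.
        return count_occurrences(text, gram) / n

    cohesion = min(
        math.log(prob(candidate) / (prob(candidate[:k]) * prob(candidate[k:])))
        for k in range(1, size)
    )
    return {
        "frequency": freq,
        "cohesion": cohesion,
        "left_entropy": neighbour_entropy(left),
        "right_entropy": neighbour_entropy(right),
    }


features = candidate_features("abxabyabz", "ab")
```

A candidate whose parts occur together more often than chance (high cohesion) and which appears in varied contexts (high left and right entropy) is more likely to be a word. Feature dictionaries like this one could be turned into vectors and passed to any of the five classifiers the abstract compares.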
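The BIO scheme mentioned in the abstract turns phrase recognition into per-token labeling: B marks the first token of an evaluation object phrase, I marks a token inside it, and O marks everything else. The sketch below shows only this data transformation, not the CRF model or the thesis's feature templates. The helper names `bio_tags` and `decode_phrases` and the example sentence are made up for illustration.

```python
def bio_tags(tokens, spans):
    """Convert phrase spans (start, end-exclusive token indices) to BIO tags."""
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


def decode_phrases(tokens, tags):
    """Recover phrases from a BIO tag sequence (the inverse of bio_tags)."""
    phrases, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            if current:
                phrases.append("".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # "O", or a stray "I" with no open phrase, closes any open phrase.
            if current:
                phrases.append("".join(current))
            current = []
    if current:
        phrases.append("".join(current))
    return phrases


# Made-up example: "这款手机屏幕很清晰" with "手机屏幕" as the evaluation object.
tokens = ["这", "款", "手机", "屏幕", "很", "清晰"]
tags = bio_tags(tokens, [(2, 4)])
phrases = decode_phrases(tokens, tags)
```

In a CRF pipeline, the manually annotated spans would be converted with something like `bio_tags` to build training data, and the predicted tag sequence would be converted back with something like `decode_phrases` to obtain the extracted phrases.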
[Degree-granting institution]: Donghua University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1;TP18