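Research on Query and Visual Analysis Techniques for Tourism Data
Published: 2018-06-17 03:04
Topic: social media + tourism data; Reference: Southwest University of Science and Technology, master's thesis, 2016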
[Abstract]: In recent years, as online social media has developed and spread, more and more tourists have come to post travel information on social media anytime and anywhere, producing massive, multi-dimensional, unstructured tourism data. Research on such complex data has attracted wide attention from universities and industry. This thesis presents research on social-media tourism data from three aspects: first, the collection and preprocessing of tourism data; second, analysis based on tourism data, including a Top-k dominating query algorithm, text sentiment mining, and keyword extraction; and finally, visualization research based on tourism data.
1. For the collection and preprocessing of tourism data on social media, the thesis first describes the process of obtaining tourism data from tourism social networking sites, then compares packet capture and browser simulation as ways of obtaining Weibo data, then describes how to obtain Weibo data through the search function, and finally preprocesses the data from the perspectives of data cleaning and data integration.
2. For analysis based on tourism data, the thesis studies a Top-k dominating query algorithm to meet the need for subspace Top-k dominating queries. First, a B+-tree is used to build sorted lists. Next, a round-robin scheduling algorithm retrieves k groups of terminal tuples according to the query conditions. Then, from the generated candidate tuples and terminal tuples, a probability distribution model is used to compute the dominance scores of the terminal tuples. This process is iterated to refine the query result until the stopping condition is met. The thesis also uses an SVM to classify the sentiment of short texts, with features including punctuation, tags, and sentiment words. The experimental results indicate that the method has some practical value.
3. For online public opinion based on tourism data, the thesis proposes an object-oriented visual analysis Web framework that can effectively speed up collaborative team development. It designs and develops a visual analysis system for tourism-related online public opinion. The system supports visual display and interactive analysis of tourist location information, comment sentiment information, and social network information, which helps users understand tourists' public opinion from multiple angles and discover features, relationships, and trends implicit in the comments. Extensive experimental results show that the system not only analyzes tourists' regional preferences and sentiment changes effectively but also helps tourism management departments keep track of tourism-related online public opinion in a timely manner.
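As an illustration of what a subspace Top-k dominating query returns, the following is a minimal brute-force sketch in Python. It is not the thesis's algorithm: it uses no B+-tree sorted lists, no round-robin access, and no probabilistic score estimation. It only computes exact dominance scores by comparing every pair. The function names, the sample data, and the convention that larger attribute values are better are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float], dims: Sequence[int]) -> bool:
    """Return True if a dominates b on the chosen dimensions.

    Assumption: larger values are better. a dominates b when it is no worse
    on every chosen dimension and strictly better on at least one.
    """
    no_worse = all(a[d] >= b[d] for d in dims)
    strictly_better = any(a[d] > b[d] for d in dims)
    return no_worse and strictly_better


def top_k_dominating(
    points: Sequence[Sequence[float]], dims: Sequence[int], k: int
) -> List[Tuple[int, int]]:
    """Return the k points with the highest dominance score as (index, score).

    The dominance score of a point is the number of other points it dominates
    in the subspace given by dims. This is an O(n^2) baseline for illustration.
    """
    scored = []
    for i, p in enumerate(points):
        score = sum(
            1 for j, q in enumerate(points) if i != j and dominates(p, q, dims)
        )
        scored.append((i, score))
    # Highest score first; ties are broken by the lower index.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]


# Hypothetical attractions: (average rating, review count, share of positive comments)
attractions = [
    (4.5, 1200, 0.90),
    (4.0, 800, 0.85),
    (4.8, 300, 0.95),
    (3.5, 200, 0.60),
]

# Query the subspace made of rating (dimension 0) and positive share (dimension 2).
result = top_k_dominating(attractions, dims=(0, 2), k=2)
print(result)
```

Changing `dims` changes the subspace, so the same data can yield a different ranking for each query. That is the situation an index-based method such as the one outlined in the abstract aims to handle without comparing all pairs.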
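[Degree-granting institution]: Southwest University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP274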
Article ID: 2029360
Article link: http://www.lk138.cn/kejilunwen/zidonghuakongzhilunwen/2029360.html