Research and Implementation of a Collaborative-Filtering-Based News Recommendation System on Hadoop

Published: 2018-03-02 06:10

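Keywords: content aggregation platform; recommendation system; collaborative filtering; user-based recommendation; hybrid recommendation. Source: Zhengzhou University, master's thesis, 2017. Document type: degree thesis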

[Abstract]: With the arrival of the information age, the volume of information carried by the Internet keeps growing, and people browsing news portals find it hard to locate the information that interests them. Content aggregation platforms and recommendation systems emerged to address this information overload. A content aggregation platform crawls news from major news websites, stores it on a local system, and then pushes it to the platform's users through a recommendation system, giving each user personalized news recommendations.

Conventional recommendation systems are generally based on collaborative filtering, yet both hot-news recommendation and user-based recommendation built on it have shortcomings. Conventional hot-news recommendation decays each news item's popularity score by a fixed decay coefficient; testing and analysis in this thesis show that this approach cannot balance user traffic against popularity decay, which leads to a very low hot-news capture rate. Conventional user-based recommendation relies on collaborative filtering alone and recommends on the basis of a user's nearest-neighbor group, so its results become inaccurate when the user's interests change. Building on a content aggregation platform, this thesis studies and improves both algorithms. The main work and contributions are as follows:

1. A simple but complete content aggregation platform is built. It consists of several subsystems: a web system, a cache service, a crawler, a database service, and a Hadoop cluster. The subsystems communicate through their respective protocols, and the platform mainly serves the recommendation algorithms.
2. For hot-news recommendation, the thesis proposes hot-news recommendation with an adaptive time decay coefficient. The time decay coefficient of each news item is computed from both the traffic of that single item and the traffic of the whole system. Tests show that this algorithm effectively improves the hot-news capture rate. In addition, a latent hot-news mining algorithm is applied to newly ingested news in order to update the hot-news list.
3. For user-based recommendation, the thesis proposes user-based recommendation with a self-correcting user model. The nearest-neighbor group of each user is identified first and then combined with item-based recommendation to generate a separate recommendation list for each user. The user's browsing history is then analyzed, and a correction algorithm periodically revises the user model to track changes in the user's interests and provide better recommendations. Tests show that the correction algorithm completes the revision of most user models within three iterations, and that the improved user-based algorithm achieves better precision and recall than the conventional algorithm.

Finally, combined with hot-news recommendation, the thesis implements personalized recommendation to uncover users' latent interests.
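The abstract does not give the formula for the adaptive time decay coefficient, so the following Python sketch only illustrates the idea of deriving a per-item coefficient from single-item traffic and system-wide traffic. The function names, the parameters `base`, `floor` and `ceil`, and the mapping from traffic ratio to coefficient are all assumptions made for illustration; they are not taken from the thesis.

```python
def fixed_decay(heat: float, clicks: int, coeff: float = 0.8) -> float:
    """Baseline: every item keeps the same fraction of its heat each period."""
    return heat * coeff + clicks


def adaptive_decay_coefficient(item_clicks: int, system_clicks: int, n_items: int,
                               base: float = 0.8, floor: float = 0.5,
                               ceil: float = 0.98) -> float:
    """Hypothetical rule: scale the coefficient by how the item's traffic
    compares with the average traffic per item in the same period."""
    if system_clicks <= 0 or n_items <= 0:
        return base  # no traffic information, fall back to the fixed value
    avg_clicks = system_clicks / n_items
    ratio = max(item_clicks, 0) / avg_clicks   # > 1 means busier than average
    t = (ratio - 1.0) / (ratio + 1.0)          # maps [0, inf) onto [-1, 1)
    span = (ceil - base) if t >= 0 else (base - floor)
    return base + t * span


def update_heat(heat: float, item_clicks: int, system_clicks: int, n_items: int) -> float:
    """One period of decay with the adaptive coefficient, plus new clicks."""
    coeff = adaptive_decay_coefficient(item_clicks, system_clicks, n_items)
    return heat * coeff + item_clicks


if __name__ == "__main__":
    # Two items start with equal heat; one draws three times the average traffic.
    for clicks in (300, 100, 10):
        print(clicks, round(update_heat(500.0, clicks, 1000, 10), 2))
```

Under these assumptions an item with above-average traffic decays more slowly than under `fixed_decay`, while an item that receives no traffic drops toward `floor`, which is one way to keep a quiet period in overall system traffic from pushing genuinely popular items out of the hot list.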
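The user-based part of the abstract (find each user's nearest neighbors, build a recommendation list, then periodically correct the user model from browsing history) can be sketched in a similar way. The representation of a user model as a category-to-weight dictionary, the cosine similarity measure, and the blending rule in `correct_model` with its `rate` parameter are assumptions chosen for brevity; the thesis may use different structures and a different correction algorithm, and the item-based component it mentions is omitted here.

```python
import math
from collections import Counter


def cosine(u: dict, v: dict) -> float:
    """Cosine similarity between two sparse interest vectors."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def nearest_neighbors(target: str, models: dict, k: int = 2) -> list:
    """Return up to k users most similar to target, ignoring zero similarity."""
    scored = [(cosine(models[target], m), uid)
              for uid, m in models.items() if uid != target]
    scored.sort(reverse=True)
    return [uid for sim, uid in scored[:k] if sim > 0]


def recommend(target: str, models: dict, history: dict, k: int = 2, n: int = 5) -> list:
    """Recommend unseen items read by neighbors, weighted by similarity."""
    seen = history.get(target, set())
    votes = Counter()
    for uid in nearest_neighbors(target, models, k):
        sim = cosine(models[target], models[uid])
        for item in history.get(uid, set()) - seen:
            votes[item] += sim
    return [item for item, _ in votes.most_common(n)]


def correct_model(model: dict, recent_categories: list, rate: float = 0.5) -> dict:
    """Hypothetical correction step: blend the stored model toward the
    category distribution observed in recent browsing history."""
    if not recent_categories:
        return dict(model)
    counts = Counter(recent_categories)
    total = sum(counts.values())
    keys = set(model) | set(counts)
    return {key: (1 - rate) * model.get(key, 0.0) + rate * counts.get(key, 0) / total
            for key in keys}


if __name__ == "__main__":
    models = {"a": {"sports": 1.0},
              "b": {"sports": 0.9, "tech": 0.1},
              "c": {"finance": 1.0}}
    history = {"a": {"n1"}, "b": {"n1", "n2"}, "c": {"n3"}}
    print(recommend("a", models, history))
    print(correct_model(models["a"], ["tech", "tech"]))
```

Running the correction step on a schedule lets the neighbor search follow a user whose reading shifts from one category to another, instead of staying anchored to an outdated model.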
[Degree-granting institution]: Zhengzhou University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.3

Citation for this thesis: 讓家恒; Research and Implementation of a Collaborative-Filtering-Based News Recommendation System on Hadoop [D]; Zhengzhou University; 2017

