Research on Semi-Supervised Learning Classification Algorithms

Published: 2018-04-04 21:34

Topic: semi-supervised learning. Angle: data-driven. Source: Jiangsu University, master's thesis, 2017

[Abstract]: Machine learning has become an important way for computers to acquire knowledge and an important hallmark of artificial intelligence. Traditional machine learning techniques require a large number of labeled samples for training. In many practical applications, however, obtaining a large number of labeled samples is quite difficult, while obtaining a large number of unlabeled samples is comparatively easy. Semi-supervised learning methods, which need only a small number of labeled samples, have therefore attracted great attention in pattern recognition and machine learning. This thesis studies clustering and classification problems in semi-supervised learning. The main work is as follows.

Following the idea of co-training in semi-supervised learning theory, the thesis proposes a support vector machine (SVM) classification algorithm based on co-training. The algorithm uses two different SVM classifiers to extract information from the labeled samples, and each classifier then predicts the class labels of the unlabeled samples. A mutual verification method screens out the high-confidence results, which are used to expand the labeled set; the classifiers are then updated on the expanded labeled set to realize semi-supervised learning. The method simplifies the learning process while maintaining recognition accuracy. Experiments on UCI datasets, combined with the DAG-SVMs multi-class strategy, show that the algorithm achieves high classification accuracy when few labeled samples are available. Finally, the algorithm is applied to classifying pupylation sites in prokaryotic proteins, with good results.

To address the problem that semi-supervised learning cannot effectively correct the learner when the initial number of labeled samples is too small, the thesis proposes a self-training SVM classification algorithm based on cluster analysis. The algorithm first uses a semi-supervised fuzzy c-means clustering algorithm to mine information from the whole sample set, and then uses a self-training SVM to classify the samples; a two-stage screening step reduces the probability of misclassification. Considering the special properties of time series, and based on the principle of structural learning, the thesis also proposes a supervised reconstruction algorithm that performs dimensionality reduction and feature extraction on the original time series. Finally, experiments on UCR datasets demonstrate the effectiveness of the algorithm, and it is applied to detecting edge effects in chemical cytotoxicity assessment experiments, with good detection results.
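The co-training loop described in the abstract (two different SVMs, mutual verification, expansion of the labeled set, retraining) can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes binary labels in {+1, -1}, linear SVMs without a bias term trained by Pegasos-style subgradient descent, two classifiers that differ only in their regularization strength and shuffling seed, and "high confidence" taken to mean the points with the largest agreed decision values in each round. All function names and parameters are hypothetical.

```python
import random

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train_linear_svm(X, y, lam, epochs=50, seed=0):
    """Pegasos-style subgradient descent on the regularized hinge loss.
    Assumption: no bias term, so the data should be roughly centered."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                 # decaying step size
            violated = y[i] * dot(w, X[i]) < 1.0  # margin check on current w
            shrink = 1.0 - 1.0 / t                # equals 1 - eta * lam
            w = [shrink * wj for wj in w]
            if violated:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def train_pair(X, y):
    # Assumption: the two "different" SVMs differ in regularization and seed.
    return (train_linear_svm(X, y, lam=0.01, seed=1),
            train_linear_svm(X, y, lam=0.1, seed=2))

def co_train(X_lab, y_lab, X_unl, rounds=3, per_round=2):
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(rounds):
        if not pool:
            break
        w1, w2 = train_pair(X_lab, y_lab)
        # Mutual verification: keep only points on which both SVMs agree,
        # scored by the smaller of the two absolute decision values.
        agreed = []
        for idx, x in enumerate(pool):
            s1, s2 = dot(w1, x), dot(w2, x)
            if s1 * s2 > 0:
                agreed.append((min(abs(s1), abs(s2)), idx, 1 if s1 > 0 else -1))
        agreed.sort(reverse=True)
        chosen = agreed[:per_round]               # most confident agreements
        if not chosen:
            break
        for _, idx, label in chosen:              # expand the labeled set
            X_lab.append(pool[idx])
            y_lab.append(label)
        taken = {idx for _, idx, _ in chosen}
        pool = [x for i, x in enumerate(pool) if i not in taken]
    return train_pair(X_lab, y_lab), X_lab, y_lab

def predict(models, x):
    # Assumption: combine the two SVMs by summing their decision values.
    return 1 if sum(dot(w, x) for w in models) >= 0 else -1

# Toy data: two labeled points and six unlabeled ones.
X_lab = [(2.0, 1.5), (-1.8, -2.2)]
y_lab = [1, -1]
X_unl = [(1.0, 2.0), (2.5, 0.5), (0.4, 0.3),
         (-1.0, -1.5), (-2.5, -0.7), (-0.3, -0.6)]
models, X_all, y_all = co_train(X_lab, y_lab, X_unl)
print(len(X_all), predict(models, (3.0, 1.0)), predict(models, (-1.0, -3.0)))
```

Ranking by agreed confidence avoids choosing a fixed threshold whose right value depends on the scale of the decision function. A multi-class variant, such as the DAG-SVMs strategy mentioned in the abstract, would need pairwise classifiers and is not shown here.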
[Degree-granting institution]: Jiangsu University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP181
