Convolutional Neural Networks for Image Content Retrieval

Published: 2018-06-17 08:47

Topic: convolutional neural network + image classification; Source: Hangzhou Dianzi University, 2017 master's thesis

[Abstract]: Image classification and retrieval are classic problems in the imaging field. With the rapid development of the mobile Internet, the volume of image data has grown explosively, and classifying massive image collections has become a research hotspot. Traditional image classification methods rely on features designed by hand for specific kinds of images; they are not very robust and require extensive prior knowledge. Convolutional neural networks (CNNs) have achieved a major breakthrough in this field: they automatically learn the essential features of the original images from large image collections and use them for classification, giving a better recognition rate and practicality than traditional methods. A CNN imitates the human visual system by dividing feature extraction into multiple levels, from low to high, and obtains highly abstract features through network depth. It takes the image directly as network input and uses local receptive fields, weight sharing, and subsampling to reduce the number of network parameters. This avoids the overfitting caused by an excessive number of weights and gives the network a degree of invariance to translation, rotation, and distortion. CNNs are now widely used in image retrieval, where their recognition rate and practicality exceed those of traditional classification methods, so studying the application of CNNs to image content retrieval is of considerable importance. This thesis studies the topic from two angles, practical application and network improvement. Its main contributions are as follows.

(1) To address the question of how to choose parameters when designing a CNN model, comparative experiments vary the number and size of the convolution kernels, the arrangement of the sampling (pooling) layers, and the activation function. The experiments show that a CNN performs better on the MNIST and CIFAR-10 datasets when the number of kernels is increased, the kernel size is reduced, the ReLU activation function is used, and the first sampling layer uses max sampling.

(2) For classifying an antique image dataset, a data preprocessing method is proposed for images of varying size, which solves the problem of the target being deformed when images are converted to a uniform format. A method is also proposed that separates the target from the background before feeding the image to the CNN. Experiments on the antique dataset show that, compared with feeding images to a CNN directly, the CNN used with this method has a simpler structure and a higher recognition rate. Experiments further verify that a CNN still classifies well when an image contains multiple targets. To handle the imbalance in the number of samples per class across the whole antique dataset, a method combining a CNN with HOG+SVM is proposed, and experiments show that its recognition rate is higher than that of classifying with a CNN alone.

(3) Because each of the sampling methods commonly used in CNNs has its own strengths and weaknesses, a network model is proposed that performs max sampling and mean sampling separately in the sampling layer (the parallel sampling model). Experiments show that this model generalizes better than a traditional CNN. In addition, a method is proposed that pretrains the CNN so that noisy samples can be removed during training, which solves the problem that training a network directly fails to converge when the training samples contain noise.
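The idea of running max sampling and mean sampling side by side in one sampling layer can be illustrated with a small sketch. The abstract does not say how the thesis combines the two branches, so the version below simply returns them as two separate output channels; that choice, the 2x2 non-overlapping window, and the names `pool2d`, `mean` and `parallel_pool` are all assumptions made for illustration, not the author's implementation.

```python
from typing import Callable, List

Matrix = List[List[float]]


def pool2d(feature_map: Matrix, size: int,
           reducer: Callable[[List[float]], float]) -> Matrix:
    """Apply `reducer` to each non-overlapping size x size window."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    pooled = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Collect the values covered by the current window.
            window = [
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            ]
            out_row.append(reducer(window))
        pooled.append(out_row)
    return pooled


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def parallel_pool(feature_map: Matrix, size: int = 2) -> List[Matrix]:
    """Run max and mean sampling on the same input.

    Assumption: the two results are kept as separate channels;
    the thesis may merge them differently.
    """
    return [pool2d(feature_map, size, max), pool2d(feature_map, size, mean)]


feature_map = [
    [1.0, 3.0, 2.0, 0.0],
    [5.0, 7.0, 4.0, 2.0],
    [0.0, 0.0, 8.0, 8.0],
    [4.0, 0.0, 8.0, 8.0],
]
max_branch, mean_branch = parallel_pool(feature_map)
print(max_branch)   # strongest response in each window
print(mean_branch)  # average response in each window
```

The max branch keeps the strongest activation in each window, while the mean branch keeps the overall level of the window, so a later layer that sees both has access to information that either sampling method alone would discard.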
[Degree-granting institution]: Hangzhou Dianzi University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP391.41;TP183


Article ID: 2030444


Article link: http://lk138.cn/kejilunwen/zidonghuakongzhilunwen/2030444.html
