
Research on Parallel Optimization Methods for Bioinformatics Sequence Alignment in Multi-core Environments

Published: 2018-07-12 18:26

  Topics: multi-core + OpenMP. Source: Heilongjiang University, master's thesis, 2015


[Abstract]: With the arrival of the big data era, improving computing efficiency has become a central concern. As biological databases keep growing, the traditional serial computing model needs to change. Multi-core architectures, which raise clock frequency while displacing single-core designs, have become the mainstream of parallel computing. Unlike GPUs, which impose special hardware requirements, multi-core architectures have advantages in data transfer, portability, and development prospects. This thesis therefore uses OpenMP on a multi-core platform to parallelize the widely used BLASTN, and combines it with a Trie-based preprocessing mechanism and a scheduling and allocation algorithm to further reduce running time. First, a Trie-based preprocessing algorithm is proposed. Its main idea is to use a Trie to filter out identical words, which simplifies the database and reduces the number of matches performed by BLASTN. Preprocessing splits the original database into several smaller databases, divides the target sequences into a hash table of words of length W, and builds a Trie that stores identical words. Experiments show that Trie preprocessing is less efficient on small databases than on large ones, but that it does help optimize the parallel BLASTN algorithm. Second, the serial BLASTN program is studied and its suitability for parallelization is analyzed; Perf is used to profile hot functions in BLASTN, and BLASTN is then parallelized. The parallelization targets two stages: seeding and extension matching.
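The Trie-based preprocessing described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the names `TrieNode`, `build_word_trie`, `lookup`, and `count_distinct_words` are hypothetical, the database-splitting and hash-table steps are omitted, and sequences are assumed to be plain strings. Each word of length `w` is inserted once into the trie, and repeated occurrences only add a position to the existing leaf, so identical words are stored a single time.

```python
from typing import Dict, List, Tuple


class TrieNode:
    """One trie node; children are keyed by a single base (A, C, G, T)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # (sequence index, offset) of every occurrence of the word ending here.
        self.positions: List[Tuple[int, int]] = []


def build_word_trie(sequences: List[str], w: int) -> TrieNode:
    """Insert every length-w word of every sequence into one shared trie."""
    root = TrieNode()
    for seq_id, seq in enumerate(sequences):
        # Sequences shorter than w contribute no words (empty range).
        for offset in range(len(seq) - w + 1):
            node = root
            for base in seq[offset:offset + w]:
                node = node.children.setdefault(base, TrieNode())
            # Identical words share this leaf; only the position list grows.
            node.positions.append((seq_id, offset))
    return root


def lookup(root: TrieNode, word: str) -> List[Tuple[int, int]]:
    """Return all recorded positions of `word`, or an empty list if absent."""
    node = root
    for base in word:
        child = node.children.get(base)
        if child is None:
            return []
        node = child
    return node.positions


def count_distinct_words(node: TrieNode) -> int:
    """Count the distinct words stored in the trie rooted at `node`."""
    own = 1 if node.positions else 0
    return own + sum(count_distinct_words(c) for c in node.children.values())
```

With such a structure, a query word is compared against each distinct database word once, and every occurrence is recovered from the stored position list rather than by rescanning the database.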
In the seeding stage, splitting the query sequence into words and comparing query words against target-sequence words to obtain high-scoring words (HSPs) are parallelized together, with the workload spread across multiple cores. In the extension-matching stage, matches are extended to the left and right simultaneously, and consecutive adjacent words in the HSP position search tree are merged, which reduces repeated matching and speeds up the parallelized BLASTN. Experiments show that in the best case the parallel BLASTN takes close to half the time of the original, a speedup of about 2, and that the speedup curve keeps rising as the sequence database grows. Finally, a stack-based periodic scheduling and allocation algorithm, ZD, is proposed for distributing computing tasks across the cores of a processor, with sequence length in the database as the measure of task size. Experiments show that ZD generally balances the allocation and scheduling of the computing load; in the worst case it is as efficient as running without scheduling and does not affect normal operation.
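The abstract does not spell out how ZD works, so the sketch below is not ZD. It shows a plainly substituted technique, greedy longest-first assignment, that shares the two properties the abstract does state: tasks are held on a stack, and task size is measured by sequence length. The function name `assign_by_length` and its parameters are hypothetical, and the periodic aspect of ZD is not modeled.

```python
import heapq
from typing import List


def assign_by_length(lengths: List[int], cores: int) -> List[List[int]]:
    """Assign sequence indices to cores so that total lengths stay balanced.

    Assumption: the cost of aligning against a sequence is proportional to
    its length, the task-size measure named in the abstract.
    """
    # Stack of task indices sorted by ascending length, so pop() yields the
    # longest remaining sequence first.
    stack = sorted(range(len(lengths)), key=lambda i: lengths[i])
    # Min-heap of (current load, core id): the least-loaded core is on top.
    heap = [(0, core) for core in range(cores)]
    buckets: List[List[int]] = [[] for _ in range(cores)]
    while stack:
        task = stack.pop()
        load, core = heapq.heappop(heap)
        buckets[core].append(task)
        heapq.heappush(heap, (load + lengths[task], core))
    return buckets
```

For example, `assign_by_length([900, 100, 400, 500, 300, 200], 2)` gives each of the two cores a total length of 1200. When all sequences have the same length the assignment degenerates to round-robin, which matches the abstract's observation that the worst case is no slower than running without scheduling.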
[Degree-granting institution]: Heilongjiang University
[Degree]: Master's
[Year conferred]: 2015
[Classification]: Q811.4; TP311.13

Article ID: 2118087


Article link: http://www.lk138.cn/yixuelunwen/swyx/2118087.html

