Punctual Algorithm for Small Gene Prediction in DNA Sequences Using a Time-Frequency Approach Based on the Z-Curve

Hamidreza Saberkari, Mousa Shamsi, Mohammad Hossein Sedaaghi

Abstract


Identifying protein-coding regions in
deoxyribonucleic acid (DNA) sequences by means of their 3-base
periodicity has been a challenging problem in bioinformatics. Many
digital signal processing (DSP) techniques have been applied to
this task. They assign numerical values to the symbolic DNA
sequence and then apply spectral analysis tools such as the
short-time discrete Fourier transform (ST-DFT) to locate the
periodic components. In this paper, we locate exons in a DNA
strand using a variable-length window approach based on the
Z-curve. The Z-curve is a unique 3-D curve representation of a
DNA sequence that gives a complete description of the sequence's
biological behavior. The proposed algorithm achieves high
accuracy and resolution because it applies a Gaussian window
with an adjustable length to identify and estimate exonic
regions, while non-coding regions are completely
eliminated. To extract the period-3 component, we used a
narrow-band band-pass filter centered on the period-3 frequency. The
proposed algorithm was applied to several gene sequences from
the GenBank database, and its results were compared with those of
other existing methods at the nucleotide level. Simulation results
show that our algorithm improves the accuracy of exon detection
relative to these methods.
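
As a rough illustration of the ideas summarized above, the following Python sketch maps a DNA string to its Z-curve components and measures Gaussian-windowed spectral power at the period-3 frequency (2π/3 radians per base). It is a sketch under stated assumptions, not the authors' implementation: the abstract does not specify how the window length is adapted or how the band-pass filter is designed, so the code uses a fixed window length and a direct projection onto a complex exponential instead. The function names `z_curve` and `period3_power` and the parameters `win_len` and `sigma_frac` are hypothetical.

```python
import numpy as np

def z_curve(seq):
    """Return the cumulative Z-curve components (x, y, z) of a DNA string."""
    s = np.array(list(seq.upper()))
    # Cumulative counts of each base up to and including every position.
    a, c, g, t = (np.cumsum(s == b) for b in "ACGT")
    x = (a + g) - (c + t)  # purines vs. pyrimidines
    y = (a + c) - (g + t)  # amino vs. keto bases
    z = (a + t) - (c + g)  # weak vs. strong hydrogen bonding
    return x, y, z

def period3_power(seq, win_len=99, sigma_frac=0.25):
    """Gaussian-windowed power at frequency 2*pi/3 for every position.

    Assumption: fixed window length and direct spectral projection; the
    paper's adjustable-length window and band-pass filter are not
    reproduced here.
    """
    if len(seq) < win_len:
        raise ValueError("sequence must be at least win_len bases long")
    n = np.arange(win_len)
    sigma = sigma_frac * win_len
    window = np.exp(-0.5 * ((n - (win_len - 1) / 2) / sigma) ** 2)
    # Windowed complex exponential tuned to a period of three bases.
    kernel = window * np.exp(-2j * np.pi * n / 3)
    power = np.zeros(len(seq))
    for comp in z_curve(seq):
        # Per-base steps (+1 or -1) of the cumulative curve.
        steps = np.diff(comp, prepend=0).astype(float)
        power += np.abs(np.convolve(steps, kernel, mode="same")) ** 2
    return power
```

Calling `period3_power(seq)` on a sequence returns one power value per base. Positions inside a region with strong 3-base periodicity should stand out against the background, and thresholding that profile is one simple way to flag candidate exonic regions.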


Keywords


sequence; Protein coding region; Signal processing; Exon; S-Transform; Multistage filter.
