Among the transcribed regions of a gene,
introns are the most polymorphic; therefore,
they are ideally suited for developing molecular
markers. Identifying intron regions, however,
is not straightforward: it requires aligning
expressed sequence tags (ESTs) or cDNAs with
their genomic counterparts. In cotton, this
difficulty is compounded by the scarcity of
cotton genomic DNA sequences in GenBank.
In this study, the possibility of utilizing
genomic sequences from Arabidopsis, whose
genome has been completely sequenced, to locate
intron regions in cotton was evaluated. Cotton
ESTs were compared by BLAST against the Arabidopsis
database to identify orthologous genes.
Cotton introns were identified with a 92% success
rate, based on the alignment of cotton ESTs with
Arabidopsis genomic DNA, demonstrating
that this approach is both feasible and practical
for predicting the locations of introns in cotton
ESTs. A majority of cotton introns had the canonical
GT-AG splice site junctions, facilitating their
identification in the sequence alignment process.
Comparison of sequences between Gossypium arboreum L.
and G. raimondii Ulbr. indicated that introns had
almost four-fold greater nucleotide variation
than exons. A majority of the differences
were due to runs of repeated thymine (T) or to the
number of simple sequence repeat motifs.
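
The canonical GT-AG rule mentioned above can be illustrated with a minimal sketch. The function names and the example sequences below are hypothetical and are not taken from the study; the code only checks whether a candidate intron begins with GT and ends with AG.

```python
def has_canonical_splice_sites(intron: str) -> bool:
    """Return True if the candidate intron starts with GT and ends with AG."""
    seq = intron.upper()
    # An intron needs at least the two dinucleotides to be checked.
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def canonical_fraction(introns: list[str]) -> float:
    """Fraction of candidate introns that have GT-AG boundaries."""
    if not introns:
        return 0.0
    return sum(has_canonical_splice_sites(s) for s in introns) / len(introns)


# Made-up sequences for illustration only (assumption, not study data).
examples = ["GTAAGTTTTCAG", "gtatgcttttag", "GCAAGTTTTCAG"]
print(canonical_fraction(examples))
```

In practice such a check would be applied to the gap regions that appear when an EST is aligned to genomic DNA, as a way to confirm a predicted intron boundary.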
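
One simple way to compare nucleotide variation between introns and exons is to count the aligned columns at which two sequences differ. The sketch below assumes pairwise-aligned sequences of equal length with `-` marking gaps, and counts gaps as differences; the abstract does not state how variation was measured, so this metric and the toy sequences are assumptions for illustration only.

```python
def variation_rate(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned columns at which two sequences differ.

    Gap characters ('-') are treated as differences (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if not seq_a:
        return 0.0
    diffs = sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return diffs / len(seq_a)


# Made-up aligned fragments (not study data).
# Exon pair: a single substitution.
exon_a = "ATGGCTGAAGT" + "T" + "CCAGGA"
exon_b = "ATGGCTGAAGT" + "C" + "CCAGGA"
# Intron pair: a T run that differs in length, plus one substitution.
intron_a = "GTAAG" + "TTTTTTT" + "CATCAG"
intron_b = "GTAAG" + "TTTTT--" + "CATTAG"

print("exon variation:  ", variation_rate(exon_a, exon_b))
print("intron variation:", variation_rate(intron_a, intron_b))
```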