A Tomato Sequence-tagged Connector (STC) Database
Lee, S., Mao, L., Main, D., Wood, T., Wing, R.A.
Clemson University Genomics Institute, 100 Jordan Hall, Clemson, SC 29631 USA

In an effort to develop a tomato STC database for genome sequencing, we are sequencing the ends of BAC clones from a 15x genome equivalent L. esculentum BAC library (Budiman, 2000). To date, we have generated 4,990 tomato STCs, with 4,310 of them (86.4%) having an average sequence length of 372.4 high-quality bases.

All STCs were searched against SwissProt using FASTX (Pearson, 1988) and against all plant sequences downloaded from GenBank using FASTA (Pearson, 1988). With a cutoff expectation (E) value of <10^-5, 1,756 sequences (35.19%) were found to show homology with known sequences.
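
The cutoff step described above can be illustrated with a minimal sketch. The record layout (pairs of STC identifier and E value), the function names, and the toy hits are all hypothetical and are not taken from the actual FASTA or FASTX output; only the cutoff of 10^-5 and the counts 1,756 and 4,990 come from the text.

```python
E_VALUE_CUTOFF = 1e-5  # cutoff stated in the text: E < 10^-5


def stcs_with_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Return the set of STC ids with at least one hit below the cutoff.

    `hits` is an iterable of (stc_id, e_value) pairs. This layout is an
    assumption for illustration, not the real search output format.
    """
    return {stc_id for stc_id, e_value in hits if e_value < cutoff}


def percent(part, total):
    """Percentage of `part` in `total`, rounded to two decimals."""
    return round(100.0 * part / total, 2)


# Toy, made-up hits; a hit exactly at the cutoff is excluded (strict <).
toy_hits = [("stc_a", 3e-12), ("stc_a", 0.2), ("stc_b", 1e-5), ("stc_c", 4e-7)]
print(sorted(stcs_with_hits(toy_hits)))  # ['stc_a', 'stc_c']

# Counts reported in the text: 1,756 of 4,990 STCs passed the cutoff.
print(percent(1756, 4990))  # 35.19
```

An STC is counted once no matter how many of its hits pass the cutoff, which is why the result is a set of identifiers rather than a list of hits.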

As shown in Fig. 1, about 40% of the 1,756 STCs share sequence similarity with defined gene-related sequences. Various retrotransposons account for another 40% of the STCs having a match in GenBank, suggesting that retrotransposons are a major component of the tomato genome. STCs homologous to non-LTR retrotransposons were also found; according to our GenBank search results, this is the first report of such elements in the tomato genome. STCs similar to repetitive elements constitute 13% of these sequences. The remaining STCs (6%), which we labeled miscellaneous DNA, were homologous to GenBank sequences that are poorly annotated or constitute non-genomic DNA, such as chloroplast and mitochondrial DNA.

Retrotransposon polyproteins, i.e. T17459 (GenBank acc. no.; gypsy-like, tomato), Lere1 (copia-like, tomato; Mao et al., unpublished) and CAA73798.1 (GenBank acc. no.; non-LTR, Beta vulgaris), were used as queries to search all the tomato STC sequences using FASTA and TFASTA (Pearson, 1988). A total of 304 STCs were obtained, of which 195 were homologous to gypsy-like retrotransposons, while 92 and 17 were homologous to copia-like and non-LTR retrotransposons, respectively. Interestingly, the ratio of tomato STCs homologous to each type of retrotransposon is similar to that shown in rice (Fig. 2), i.e. gypsy-like retrotransposons make up more than half of the total STCs homologous to retrotransposons (Mao, 2000).
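
As a quick check of the "more than half" statement, the shares of each retrotransposon class can be worked out from the counts reported above. This is only an arithmetic sketch; the variable names are illustrative, and the counts (195, 92, 17) are the ones given in the text.

```python
# STC counts per retrotransposon class, as reported in the text.
counts = {"gypsy-like": 195, "copia-like": 92, "non-LTR": 17}

total = sum(counts.values())

# Share of each class among retrotransposon-homologous STCs, in percent.
shares = {name: round(100.0 * n / total, 1) for name, n in counts.items()}

print(total)   # 304
print(shares)  # {'gypsy-like': 64.1, 'copia-like': 30.3, 'non-LTR': 5.6}
```

The three counts sum to the reported total of 304, and the gypsy-like class accounts for roughly 64% of it.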

As sequencing has progressed, the proportion of STCs that have no homology to GenBank sequences has decreased from 70% in our previous study of 1,205 STCs (Budiman, 2000) to 64%. We expect that this proportion will continue to decrease, although slowly, because of the expected large number of retrotransposon sequences in the tomato genome. The 4,990 tomato BAC ends and the results of the FASTX and FASTA searches are accessible