Gene Pair Detective: A Computational Tool to Guide Discovery of Genomic Sequence from Poorly Represented Genomes

 

In our early studies of strawberry (only a few years ago) there were only 58 strawberry sequences in GenBank, and none of them were particularly useful.  The slim set represented only a few common, abundant transcripts, or sequences that didn't have any matches.

We were interested in genomic sequence information, especially the information residing in the intergenic regions.  The cultivated strawberry is octoploid, so mapping efforts in this polyploid might benefit more from the polymorphism-rich regions between genes than from strict coding sequence. But how do you get genome sequence without cloning and sequencing?

Expressed sequence tag (EST) information is relatively inexpensive to harvest, even with Sanger technology.  We hypothesized that if we compared the ESTs in our tiny dataset against a completely sequenced genome, we'd find instances where two ESTs corresponded to adjacent genes. Primer prediction software could then suggest primers in those two ESTs for amplifying the region between them.

By recruiting the talents of computational scientists on campus, we devised Gene Pair Detective, a simple program that compares your ESTs against sequenced genomes and tests them for collinear relationships.
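The adjacency test described above can be illustrated with a minimal Python sketch. This is not the Gene Pair Detective code itself, whose internals are documented in the download; it only shows the general idea under stated assumptions. It assumes each EST has already been assigned a best-matching reference gene (for example, from a similarity search), and that the order of genes along each reference chromosome is known. The function name `find_adjacent_est_pairs`, the `max_gap` parameter, and all gene and EST identifiers are hypothetical.

```python
from collections import defaultdict
from itertools import product


def find_adjacent_est_pairs(est_hits, gene_order, max_gap=1):
    """Return (est_a, est_b, gene_a, gene_b) tuples for ESTs whose
    reference-genome hits lie within max_gap gene positions of each
    other on the same chromosome.

    est_hits   -- dict mapping EST name -> best-matching reference gene
                  (assumed to be computed beforehand)
    gene_order -- dict mapping chromosome name -> list of gene names
                  in their order along that chromosome
    max_gap    -- 1 means immediately neighbouring genes only
    """
    # Group the ESTs by the reference gene they matched.
    ests_by_gene = defaultdict(list)
    for est, gene in est_hits.items():
        ests_by_gene[gene].append(est)

    pairs = []
    for genes in gene_order.values():
        for idx, gene_a in enumerate(genes):
            if gene_a not in ests_by_gene:
                continue
            # Look only downstream so that each pair is reported once.
            for gene_b in genes[idx + 1: idx + 1 + max_gap]:
                for est_a, est_b in product(ests_by_gene[gene_a],
                                            ests_by_gene.get(gene_b, [])):
                    pairs.append((est_a, est_b, gene_a, gene_b))
    return pairs


if __name__ == "__main__":
    # Hypothetical toy data: five reference genes, four ESTs.
    gene_order = {"chr1": ["gene1", "gene2", "gene3", "gene4", "gene5"]}
    est_hits = {"est1": "gene2", "est2": "gene3",
                "est3": "gene5", "est4": "unplaced"}
    for pair in find_adjacent_est_pairs(est_hits, gene_order):
        print(pair)
```

Each reported pair is a candidate for primer design: one primer in each EST, oriented toward the other, would be expected to span the intergenic region if gene order is conserved between the reference genome and the species of interest. Raising `max_gap` relaxes the test to tolerate an intervening gene.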

A manuscript detailing the process and its use is currently in review, and details will be provided on acceptance.

For now, the files are available from the following URL, and include a file called 'Documentation' that describes how to convert user data into a usable form, run the program, and change variables.

http://rapidshare.com/files/228477075/Code-GPD.rar.html

This is a large file because it is supplied with an Arabidopsis dataset.