NAME
G - G-language core module in Perl (Prelude)
SYNOPSIS
use G; # Imports G module
$gb = new G("ecoli.gbk"); # Creates a G instance in $gb and,
# at the same time, reads in ecoli.gbk:
# both the annotation and the sequence
# information.
# See DESCRIPTION for details.
$gb->seq_info(); # Prints the basic sequence information.
MT::find_ori_ter(\$gb->{SEQ}); # Passes the sequence by reference to
# MT package functions
DESCRIPTION
The Prelude Core of G-language fully supports the GenBank database.
*supported annotation information
LOCUS
$gb->{LOCUS}->{id} -accession number
$gb->{LOCUS}->{length} -length of sequence
$gb->{LOCUS}->{nucleotide} -type of sequence ex. DNA, RNA
$gb->{LOCUS}->{circular} -whether or not the genome is
circular.
ex. 1 or 0
$gb->{LOCUS}->{type} -type of species ex. BCT, CON
$gb->{LOCUS}->{date} -date of accession
HEADER
$gb->{HEADER}
COMMENT
$gb->{COMMENT}
FEATURE
Each FEATURE is numbered (FEATURE1 .. FEATURE1172), and is a
hash structure that contains all the keys of GenBank.
In other words, in most cases, FEATURE$i's hash contains at
least the information listed below:
$gb->{FEATURE$i}->{start}
$gb->{FEATURE$i}->{end}
$gb->{FEATURE$i}->{direction}
$gb->{FEATURE$i}->{join}
$gb->{FEATURE$i}->{note}
$gb->{FEATURE$i}->{type} -CDS,gene,RNA,etc.
To analyze each FEATURE, write:
$i = 1;
while(exists $gb->{"FEATURE$i"}){
# analyze $gb->{"FEATURE$i"} here
$i ++;
}
Each CDS is stored in a similar manner, with the following
keys:
$gb->{CDS$i}->{start}
$gb->{CDS$i}->{end}
$gb->{CDS$i}->{direction}
$gb->{CDS$i}->{join}
$gb->{CDS$i}->{feature} -number $n for $gb->{FEATURE$n}
where "CDS$i" = "FEATURE$n"
In the same manner, to analyze all CDS, write:
$i = 1;
while(exists $gb->{"CDS$i"}){
# analyze $gb->{"CDS$i"} here
$i ++;
}
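As a minimal sketch of the two loops above, the following walks a mock structure shaped like the FEATURE and CDS hashes described here. The mock data, the 'direct' and 'complement' direction values, and the variable names are assumptions made for illustration; a real G instance would be filled from a GenBank file.

```perl
use strict;
use warnings;

# Hypothetical mock of the documented layout (not real GenBank data).
# The direction values are assumed; the module may store them differently.
my $gb = {
    FEATURE1 => { start => 1,   end => 90,  direction => 'direct',     type => 'gene' },
    FEATURE2 => { start => 1,   end => 90,  direction => 'direct',     type => 'CDS'  },
    FEATURE3 => { start => 200, end => 260, direction => 'complement', type => 'tRNA' },
    CDS1     => { start => 1,   end => 90,  direction => 'direct',     feature => 2   },
};

# Count features by type; exists() tests the key without creating it.
my %count;
for (my $i = 1; exists $gb->{"FEATURE$i"}; $i++) {
    $count{ $gb->{"FEATURE$i"}{type} }++;
}

# Follow each CDS back to its FEATURE through the {feature} number.
my @cds_types;
for (my $i = 1; exists $gb->{"CDS$i"}; $i++) {
    my $n = $gb->{"CDS$i"}{feature};
    push @cds_types, $gb->{"FEATURE$n"}{type};
}
```

Using exists() rather than defined() on a dereferenced hash avoids creating empty FEATURE entries as a side effect of the test.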
BASE COUNT
$gb->{BASE_COUNT}
SEQ
$gb->{SEQ} -sequence data following "ORIGIN"
*supported methods
new()
Creates a G instance.
The first argument is the filename of the GenBank file.
The second argument specifies detailed actions.
The 'without annotation' option skips the annotation.
The 'long sequence' option uses a filehandle pointer
to read the genome sequence. See the
next_seq() method below for details.
complement()
Given a sequence, returns its reverse complement.
eg. complement('atgc'); returns 'gcat'
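The example above ('atgc' giving 'gcat') corresponds to reversing the sequence and swapping each base for its partner. A minimal sketch of that operation follows; revcomp_sketch is a hypothetical name, and this is not the module's own implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: reverse the string, then swap a<->t and g<->c.
# Characters other than a, t, g, c (either case) are left unchanged.
sub revcomp_sketch {
    my ($seq) = @_;
    my $rc = scalar reverse $seq;
    $rc =~ tr/atgcATGC/tacgTACG/;
    return $rc;
}

my $result = revcomp_sketch('atgc');
```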
translate()
Given a sequence, returns its translated sequence.
The standard codon table is used.
eg. translate('ctggtg'); returns 'LV'
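To make the codon lookup concrete, here is a minimal sketch of translation with the standard genetic code. The name translate_sketch, the '*' symbol for stop codons, and the 'X' symbol for unrecognized codons are assumptions; how the module's translate() treats stops or ambiguous bases is not stated here.

```perl
use strict;
use warnings;

# Build the standard codon table from the usual compact encoding:
# bases in the order t, c, a, g for each of the three positions.
my @bases = qw(t c a g);
my $amino = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon;
my $k = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon{"$b1$b2$b3"} = substr($amino, $k++, 1);
        }
    }
}

# Hypothetical helper: translate complete codons from position 0;
# a trailing partial codon is ignored, unknown codons become 'X'.
sub translate_sketch {
    my ($seq) = @_;
    $seq = lc $seq;
    my $protein = '';
    for (my $i = 0; $i + 3 <= length($seq); $i += 3) {
        $protein .= $codon{ substr($seq, $i, 3) } // 'X';
    }
    return $protein;
}

my $protein = translate_sketch('ctggtg');
```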
$gb->seq_info()
Prints the basic information of the genome to STDOUT.
$gb->DESTROY()
Destroys the G instance
$gb->del_key()
Given an object key, deletes that object from the G instance structure
eg. $gb->del_key('FEATURE1'); deletes 'FEATURE1' hash
$gb->getseq()
Given the start and end positions (starting from 0 as in Perl),
returns the sequence specified.
eg. $gb->getseq(1,3); returns the 2nd, 3rd, and 4th nucleotides.
$gb->get_gbkseq()
Given the start and end positions (starting from 1 as in
Genbank), returns the sequence specified.
eg. $gb->get_gbkseq(1,3); returns the 1st, 2nd, and 3rd
nucleotides.
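The difference between the two coordinate systems can be sketched with substr(). The helper names below are hypothetical, and both ranges are assumed to be inclusive, as the examples above suggest.

```perl
use strict;
use warnings;

# Hypothetical helper: 0-based, inclusive start and end (Perl style).
sub getseq_sketch {
    my ($seq, $start, $end) = @_;
    return substr($seq, $start, $end - $start + 1);
}

# Hypothetical helper: 1-based, inclusive start and end (GenBank style).
# Shifts both positions down by one and reuses the 0-based helper.
sub get_gbkseq_sketch {
    my ($seq, $start, $end) = @_;
    return getseq_sketch($seq, $start - 1, $end - 1);
}

my $seq       = 'acgtacgt';
my $perl_like = getseq_sketch($seq, 1, 3);      # 2nd, 3rd, and 4th nucleotides
my $gbk_like  = get_gbkseq_sketch($seq, 1, 3);  # 1st, 2nd, and 3rd nucleotides
```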
$gb->get_cdsseq()
Given a CDS ID, returns the CDS sequence.
'complement' is properly parsed.
eg. $gb->get_cdsseq('CDS1'); returns the 'CDS1' sequence.
$gb->get_geneseq()
Given a CDS ID, returns the CDS sequence, or the exon sequence
if introns are present.
'complement' is properly parsed, and introns are spliced out.
eg. $gb->get_geneseq('CDS1'); returns the 'CDS1' sequence or
exon.
$gb->get_intron()
Given a CDS ID, returns the intron sequences as an array of
sequences.
eg. $gb->get_intron('CDS1');
returns ($1st_intron, $2nd_intron,..)
$gb->get_exon()
Given a CDS ID, returns the exon sequence.
'complement' is properly parsed, and introns are spliced out.
eg. $gb->get_exon('CDS1'); returns the 'CDS1' exon.
$gb->next_locus()
Reads the next locus.
The G instance is then updated.
do{
# analyze the current locus here
}while($gb->next_locus());
Enables analysis of multiple loci.
$gb->next_seq()
If the G instance is created with the 'long sequence' option,
the $gb->next_seq() method replaces $gb->{SEQ} with the next
chunk of sequence.
while($gb->next_seq(100000)){
print $gb->{SEQ};
}
Enables continuous analysis.
$gb->rewind_genome()
If the G instance is created with the 'long sequence' option,
the $gb->rewind_genome() method puts the filehandle pointer back
to the ORIGIN position.
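The general idea behind chunked reading and rewinding can be sketched with Perl's built-in read(), tell() and seek(). This is only an illustration under stated assumptions: the in-memory filehandle, the next_chunk helper, and the plain sequence data are hypothetical, and the module's internals may differ (a real GenBank ORIGIN block also contains position numbers and spaces).

```perl
use strict;
use warnings;

# Hypothetical stand-in for a sequence file: an in-memory filehandle.
my $data = "acgtacgtac";
open(my $fh, '<', \$data) or die "cannot open in-memory file: $!";

# Remember where the sequence starts so we can return to it later.
my $origin = tell($fh);

# Hypothetical helper: read up to $size characters from the handle.
# Returns the chunk, or an empty string at end of file.
sub next_chunk {
    my ($handle, $size) = @_;
    my $chunk = '';
    read($handle, $chunk, $size);
    return $chunk;
}

# Consume the sequence four characters at a time.
my @chunks;
while (length(my $chunk = next_chunk($fh, 4))) {
    push @chunks, $chunk;
}

# Rewind to the remembered position and read the first chunk again.
seek($fh, $origin, 0);
my $first_again = next_chunk($fh, 4);
```

Only one chunk is held in memory at a time, which is the point of reading a long genome this way.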