use G;                       # Imports the G-language GAE module
$gb = new G("ecoli.gbk");    # Creates a G instance as $gb
$gb = load("ecoli.gbk");     # Same as the line above.
                             # Either call also reads in ecoli.gbk,
                             # including the annotation and sequence
                             # information.
                             # See DESCRIPTION for details.
$gb->seq_info();             # Prints the basic sequence information.
find_ori_ter($gb);           # Pass $gb as the first argument to
                             # most of the analysis functions.
The G-language GAE fully supports most sequence databases.
Stored annotation information:
LOCUS
$gb->{LOCUS}->{id} -accession number
$gb->{LOCUS}->{length} -length of sequence
$gb->{LOCUS}->{nucleotide} -type of sequence ex. DNA, RNA
$gb->{LOCUS}->{circular} - 1 when the genome is circular, otherwise 0
$gb->{LOCUS}->{type} -type of species ex. BCT, CON
$gb->{LOCUS}->{date} -date of accession
HEADER
$gb->{HEADER}
$gb->{DEFINITION}
$gb->{ACCESSION}
$gb->{SOURCE}
$gb->{ORGANISM}
$gb->{TAXONOMY}->{all} -same as $gb->{TAXONOMY}->{1}
$gb->{TAXONOMY}->{domain} -same as $gb->{TAXONOMY}->{2}
$gb->{TAXONOMY}->{phylum} -same as $gb->{TAXONOMY}->{3}
$gb->{TAXONOMY}->{class} -same as $gb->{TAXONOMY}->{4}
$gb->{TAXONOMY}->{order} -same as $gb->{TAXONOMY}->{5}
$gb->{TAXONOMY}->{family} -same as $gb->{TAXONOMY}->{6}
$gb->{TAXONOMY}->{genus}
$gb->{TAXONOMY}->{species}
COMMENT
$gb->{COMMENT}
FEATURE
Each FEATURE is numbered (FEATURE1 .. FEATURE1172) and is a
hash structure that contains all the keys of GenBank.
In most cases, the FEATURE$i hash contains at least
the information listed below:
$gb->{FEATURE$i}->{start}
$gb->{FEATURE$i}->{end}
$gb->{FEATURE$i}->{direction}
$gb->{FEATURE$i}->{join}
$gb->{FEATURE$i}->{note}
$gb->{FEATURE$i}->{type} -CDS,gene,RNA,etc.
$gb->{FEATURE$i}->{feature} -same as $i
To analyze each FEATURE, write:
foreach my $feature ($gb->feature()){
    print $gb->{$feature}->{type}, "\n";
}
In the same manner, to analyze all CDS, write:
foreach my $cds ($gb->cds()){
    print $gb->{$cds}->{gene}, "\n";
}
Feature or gene information can also be accessed with CDS numbers:
$gb->{CDS$i}->{start}
or with locus_tags or gene names (for CDS, tRNA, and rRNA)
$gb->{thrL}->{start}
$gb->{b0001}->{start}
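The FEATURE layout described above can be pictured with an ordinary Perl hash. The following is a minimal sketch that does not use G at all: %mock is a hand-built stand-in for a G instance, its coordinates, gene names, and direction strings are made up, and features_of_type() is a hypothetical helper written only for this illustration.

```perl
use strict;
use warnings;

# Hand-built stand-in for a G instance (NOT created by G; all values are made up).
my %mock = (
    FEATURE1 => { type => 'gene', start => 10,  end => 90,  direction => 'direct',     feature => 1 },
    FEATURE2 => { type => 'CDS',  start => 10,  end => 90,  direction => 'direct',     feature => 2, gene => 'geneA' },
    FEATURE3 => { type => 'CDS',  start => 200, end => 350, direction => 'complement', feature => 3, gene => 'geneB' },
);

# Hypothetical helper: return the FEATURE$i keys of a given type, in numeric order.
sub features_of_type {
    my ($gb, $type) = @_;
    my @keys = grep { /^FEATURE\d+$/ && $gb->{$_}{type} eq $type } keys %$gb;
    return sort { $gb->{$a}{feature} <=> $gb->{$b}{feature} } @keys;
}

# Print one line per CDS, reading the same keys the text lists.
for my $key (features_of_type(\%mock, 'CDS')) {
    my $f = $mock{$key};
    printf "%s %s %d..%d (%s)\n",
        $key, $f->{gene}, $f->{start}, $f->{end}, $f->{direction};
}
```

With a real G instance, $gb->feature() and $gb->cds() already provide this kind of iteration, so a helper like this would only be needed for filters that G does not offer directly.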
BASE COUNT
$gb->{BASE_COUNT}
SEQ
$gb->{SEQ} -sequence data following "ORIGIN"
or
$gb->seq()
Name: load - load genome databases
This function loads a genome database into memory.
The first argument is the filename of the database. The default format is
GenBank. The database format is guessed from the file extension
(eg. .gbk => GenBank, .fasta => FASTA, .embl => EMBL).
Most of the major sequence formats are supported, including
FASTA, FASTQ, GenBank, EMBL, Swiss-Prot, GCG, and PIR.
Flat files can be gzipped. If the filename ends with ".gz",
load() automatically handles it as a compressed file.
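The extension-based guessing described above can be sketched in a few lines of plain Perl. This is not the code load() actually uses: guess_format() is a hypothetical helper, and its table contains only the three extensions given as examples in the text, falling back to GenBank as the stated default.

```perl
use strict;
use warnings;

# Extension-to-format table, limited to the examples given in the text.
my %FORMAT_OF = (gbk => 'GenBank', fasta => 'FASTA', embl => 'EMBL');

# Hypothetical helper: returns (format, is_gzipped) for a filename.
sub guess_format {
    my ($filename) = @_;
    my $gzipped = ($filename =~ s/\.gz$//) ? 1 : 0;    # strip a trailing ".gz"
    my ($ext) = $filename =~ /\.([^.\/]+)$/;            # last extension, if any
    my $format = (defined $ext && $FORMAT_OF{lc $ext}) || 'GenBank';  # default format
    return ($format, $gzipped);
}

# Show the guess for a few illustrative filenames.
for my $file (qw(ecoli.gbk reads.fasta.gz genome.embl unknown)) {
    my ($format, $gz) = guess_format($file);
    printf "%-15s %-8s %s\n", $file, $format, $gz ? 'gzipped' : 'plain';
}
```

Stripping ".gz" before looking at the extension lets a name such as reads.fasta.gz resolve to FASTA rather than to an unknown "gz" format.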
There are also several sample bacterial genomes included in the system.
$eco = load("ecoli"); # Escherichia coli K12 MG1655 - NC_000913
$bsub = load("bsub"); # Bacillus subtilis - NC_000964
$mgen = load("mgen"); # Mycoplasma genitalium - NC_000908
$cyano = load("cyano"); # Synechococcus sp. - NC_005070
$pyro = load("pyro"); # Pyrococcus furiosus - NC_003413
$bbur = load("bbur"); # Borrelia burgdorferi B31 - NC_001318
$plasF = load("plasmidf"); # Plasmid F - NC_002483
Data can be automatically downloaded from public databases using
Uniform Sequence Address (USA) keys.
http://emboss.sourceforge.net/docs/themes/UniformSequenceAddress.html
Name: method_list - get the list of available G-language GAE functions
Description:
Returns an array of available method names.
When 1 is supplied as an argument, returns an array of API-related
method names.
eg.
@methods = method_list();       # contains more than 100 analysis functions
@APImethods = method_list(1);   # contains around 50 API-related methods
REST:
http://rest.g-language.org/method_list
Code: load
sub load { return new G(@_); }
Code: method_list
my $opt = shift;
my %system;
for my $name (qw/
    p puts say readFile writeFile
    opt_as_gb opt_default opt_get opt_list opt_val
    msg_ask_interface msg_error msg_send msg_gimv msg_interface msg_percent msg_progress msg_set_gimv msg_system_console msg_term_console
    sdb_exists sdb_load sdb_save _sdb_path _set_sdb_path
    db_dbi db_exists db_load db_path db_save db_set_path
    pass_send pass_get
/){
    $system{$name}++;
}
Code: opt_list
sub opt_list {
    my $sub = shift;
    SubOpt::opt_default();
    SubOpt::set_opt_list(1);
    eval("&{$sub}");
    SubOpt::set_opt_list(0);
    return opt_val();
}
Supported methods of G-language Genome Analysis Environment
Name: $gb = new G("genome file") - create a G instance
See "help load" for more information.
Name: $gb->next_locus() - read the next locus and update the G instance
Description:
Reads the next locus and updates the G instance.
Load the G instance with the "no cache" option to use this feature.
eg.
do{
    # analyze the current locus held in $gb here
}while($gb->next_locus());
# Enables analysis of multiple loci.
REST:
http://rest.g-language.org/NC_000913/next_locus