G-language Genome Analysis Environment (G-language GAE) is a set of Perl libraries for genome sequence analysis that is compatible with BioPerl, equipped with several software interfaces (interactive Perl/UNIX shell with persistent data, AJAX Web GUI, Perl API). The software package contains more than 100 original analysis programs especially focusing on bacterial genome analysis, including those for the identification of binding sites with information theory, analysis of nucleotide composition bias, analysis of the distribution of characteristic oligonucleotides, analysis of codons and prediction of expression levels, and visualization of genomic information. Taking advantage of the BioHackathon 2009, we have recently developed REST/SOAP web service APIs for this software system, in order to provide higher interoperability with other programming languages and bioinformatics software tools.
The REST interface provides RESTful, URL-based access to all functions of G-language GAE, so that they can be readily called from other online resources. Every analysis resource can be accessed through an HTTP GET or POST request to a unique URI. For example, the graphical result of the GC skew analysis of the Escherichia coli K12 genome is given by http://rest.g-language.org/NC_000913/gcskew, and the cumulative GC skew analysis is given by http://rest.g-language.org/NC_000913/gcskew/cumulative=1/. Biological web sites can therefore embed these analyses simply by linking to these URLs. The URLs return graphical results quickly, even though the analysis is performed on the server on the fly and the results are generated dynamically. This speed is a good example of the high performance of G-language System.
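As a minimal sketch of the URL-based access described above, the following Python snippet builds the two GC skew URLs and shows how a client might fetch them with the standard library. The helper names `gcskew_url` and `fetch` are hypothetical and not part of G-language GAE. The sketch makes no assumption about the response format (an image or a page that links to one) and simply returns the raw bytes.

```python
from urllib.request import urlopen

BASE_URL = "http://rest.g-language.org"  # service root taken from the examples above


def gcskew_url(genome, cumulative=False):
    """Build the GC skew URL for a genome identifier such as NC_000913."""
    url = f"{BASE_URL}/{genome}/gcskew"
    if cumulative:
        # Options are appended as extra path segments of the form name=value.
        url += "/cumulative=1"
    return url


def fetch(url, timeout=30):
    """Issue an HTTP GET and return the raw response body (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Only the URLs are printed here; call fetch(url) to retrieve the result.
    print(gcskew_url("NC_000913"))
    print(gcskew_url("NC_000913", cumulative=True))
```

Because each analysis is addressed by a plain URL, the same string can be used in a link or fetched by a script without any client library.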
The web service API is already used in several software tools. These include a lightweight version of G-language System, available at CPAN, that functions as a wrapper around the REST services, with a minimal number of external modules for easy installation and minimal computational resource requirements. A web service for generating interactive and zoomable Chaos Game Representation images, built on the REST service, is also available.
- http://rest.g-language.org/method_list/ (analysis methods)
- The list of functions and their available options can be viewed in the AJAX Document Center, which is more readable for humans. Double-click a row to view detailed documentation.
- Documentation can alternatively be viewed through the REST service. For example, documentation for the gcskew function can be viewed at http://rest.g-language.org/help/gcskew
- "G-language genome analysis environment with REST and SOAP web service interfaces", Arakawa K, Kido N, Oshita K, Tomita M, Nucleic Acids Res., 2010, 38 Suppl:W700-705 (PubMed).
- http://rest.g-language.org/mgen/metX/gene (gene name)
- http://rest.g-language.org/mgen/metX/translation (amino acid sequence)
- http://rest.g-language.org/mgen/*/translation (amino acid sequence of all genes)
- http://rest.g-language.org/mgen/product=glucose/product (show function of genes containing "glucose" in the product feature tag)
- POST a file to fileform=file
- File types are automatically interpreted by the system. Supported formats are: ABI, ACE, ALF, BSML, CTF, EMBL, Entrez Gene, Exp, FastA, FastQ, GCG, GenBank, Phd, PIR, PLN, raw, SCF, SWISS.
- Returns a unique reference ID for the uploaded file. You can use this ID for the rest of the analysis so that the large file is not transferred over the network.
- for example, if you received an ID of "B619CD",
http://rest.g-language.org/[genome]/[method]/[required input (if any)]/[option1=value]/[option2=value]...
- http://rest.g-language.org/ecoli/recA/before_startcodon (5' upstream sequence of recA gene, for 100bp)
- http://rest.g-language.org/ecoli/recA/before_startcodon/200 (same as above, for 200 bp)
- http://rest.g-language.org/ecoli/before_startcodon/recA (another way for 5' upstream sequence of recA gene)
- http://rest.g-language.org/ecoli/recA/get_geneseq (nucleotide sequence of recA gene)
- http://rest.g-language.org/ecoli/get_geneseq/recA (another way for nucleotide sequence of recA gene)
- http://rest.g-language.org/ecoli/cds (get a list of all feature ids)
- http://rest.g-language.org/ecoli/*/get_geneseq/ (nucleotide sequence of all genes)
- http://rest.g-language.org/ecoli/get_geneseq/* (another way for nucleotide sequence of all genes)
- http://rest.g-language.org/ecoli/codon_usage (shows the codon table - produces an image)
- http://rest.g-language.org/ecoli/view_cds (composition around start/stop codons - produces a graph)
- http://rest.g-language.org/ecoli/genomicskew (GC skew analysis - produces a graph)
- http://rest.g-language.org/ecoli/gcskew (GC skew analysis - produces a graph)
- http://rest.g-language.org/ecoli/gcskew/cumulative=1 (cumulative GC skew)
- http://rest.g-language.org/ecoli/gcskew/at=1/cumulative=1/output=f/ (cumulative AT skew - as csv data)
- http://rest.g-language.org/ecoli/gcsi (GC Skew Index analysis - returns a single floating-point value)
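The genome-based URL syntax above can be assembled programmatically. The following is a sketch only: `genome_url` is a hypothetical helper, and the choice to percent-encode everything except `*`, `=` and `,` is an assumption based on how those characters appear unescaped in the examples.

```python
from urllib.parse import quote

BASE_URL = "http://rest.g-language.org"  # service root taken from the examples above


def genome_url(genome, method, *inputs, **options):
    """Assemble /[genome]/[method]/[input]/[option=value]/... into one URL."""
    segments = [genome, method]
    segments += [str(value) for value in inputs]
    # Options become name=value path segments, in the order they were passed.
    segments += [f"{name}={value}" for name, value in options.items()]
    # Assumption: '*', '=' and ',' stay literal, as in the examples above.
    return BASE_URL + "/" + "/".join(quote(s, safe="*=,") for s in segments)


if __name__ == "__main__":
    print(genome_url("ecoli", "get_geneseq", "recA"))
    print(genome_url("ecoli", "gcskew", at=1, cumulative=1, output="f"))
```

Keyword arguments keep their insertion order in Python 3.7 and later, so the options appear in the URL in the order given.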
http://rest.g-language.org/[method]/[required input (if any)]/[option1=value]/[option2=value]...
- http://rest.g-language.org/togoWS/C00001 (use togoWS to retrieve data)
- http://rest.g-language.org/max/1,2,3,4,5 (maximum of given vector)
- http://rest.g-language.org/help/gcskew (manual of gcskew analysis method)
- http://rest.g-language.org/help/-g (list of available manual documentations)
- http://rest.g-language.org/pubmed/G-language (search pubmed with keyword: "G-language")
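For the genome-independent methods listed above, the required input is sometimes a vector, as in the `max` example. The sketch below shows one way to build such URLs, joining list inputs with commas. `method_url` is a hypothetical helper, and percent-encoding of other characters (for example spaces in a search keyword) is an assumption rather than documented service behavior.

```python
from urllib.parse import quote

BASE_URL = "http://rest.g-language.org"  # service root taken from the examples above


def method_url(method, *inputs, **options):
    """Assemble /[method]/[input]/[option=value]/... into one URL."""
    segments = [method]
    for value in inputs:
        if isinstance(value, (list, tuple)):
            # Vectors are written as comma-separated values, e.g. 1,2,3,4,5.
            segments.append(",".join(str(item) for item in value))
        else:
            segments.append(str(value))
    segments += [f"{name}={value}" for name, value in options.items()]
    # Assumption: '*', '=' and ',' stay literal; other reserved characters are encoded.
    return BASE_URL + "/" + "/".join(quote(s, safe="*=,") for s in segments)


if __name__ == "__main__":
    print(method_url("max", [1, 2, 3, 4, 5]))
    print(method_url("help", "gcskew"))
    print(method_url("pubmed", "G-language"))
```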