User Tools

Site Tools


gcskewanalysisenglish

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

gcskewanalysisenglish [2010/10/01 12:08]
ike
gcskewanalysisenglish [2014/01/18 07:44]
Line 1: Line 1:
-====== Introduction ====== 
- 
-As a beginning of this series of tutorial (‘GCskew Basics’), we analyzed GC skew by using only existing G-language GAE methods. In this tutorial as a next step, we are going to analyze actual GC skew by taking advantage of G-language GAE and Perl scripts. 
-====== STEP 1 - Loading G and subroutines ====== 
- 
-Let’s start with writing Perl scripts. First create new G instance as shown in previous tutorial. 
-<code perl> 
-  use G; 
-  my $gb = load("​ecoli"​);​ 
-</​code>​ 
- 
-There will be no need to suffer from loading genome files, parsing it, or searching for which functions to use when G-language GAE is selected. The above two-line script is it. Therefore users can focus on what they want not programming itself. 
- 
-Here we are going to make script with GC skew subroutine to avoid re-build same program. The subroutine returns value of GC skew as output. Setting secondary argument as window size makes script much more resourcefull. 
- 
-<code perl> 
-  sub gcskew{ ​ 
-     my $gb = shift; ​ 
-     my $window = shift; ​ 
-     my @gcskew = ();  
-      ​ 
-     ​return @gcskew; ​ 
-  }  
-</​code>​ 
-====== STEP 2 - Calculation and output of GC skews ====== 
- 
-Calculation of GC skew is very simple. Divide genome into the numbers of window and for each window calculate (C-G) / (C+G) then plot each value in the window. Let’s save the output by CSV so that graphs can be drawn with other applications. 
- 
-All genome sequences are stored in $gb->​{SEQ} in G instance. Other genomic information are stored in same way such as $gb->​{LOCUS},​ $gb->​{BASE_COUNT}. Utilizing for statement is suitable for executing some processes in each window. length ($gb→{SEQ}) count up whole length of genome and $a = $sequence =~ tr/a/a/ count up a number of whole adenine in the genome. 
- 
-The script should look like this: 
- 
-<code perl> 
-  my @location = ();  # Store window position 
-  my $i = 0;  # initialize counter 
-  for ($i = 0; $i * $window < length($gb->​{SEQ});​ $i ++){ # Execute while Window overflow genome size 
-        my $sequence = substr($gb->​{SEQ},​ $i * $window, $window); ​ # Cut out sequences within the window 
-        my $c = $sequence =~ tr/​c/​c/; ​  # count up Cs 
-        my $g = $sequence =~ tr/​g/​g/; ​  # count up Gs 
-        my $skew = ($c-$g)/​($c+$g); ​    # Calculate GC skew 
-        push (@location, $i * $window); # Store window position 
-        push (@gcskew, $skew); ​         # Store GC skew 
-  }  
-</​code>​ 
  
gcskewanalysisenglish.txt · Last modified: 2014/01/18 07:44 (external edit)