4. Transmembrane proteins

In this practical you will write a simple Perl script that performs a hydrophobicity analysis of membrane proteins. The program should read a sequence and slide overlapping windows along it (the window size should be adjustable, from 5 to 20 residues). For each window, the mean hydrophobicity is calculated and, for simplicity, assigned to the middle residue (i.e. a window of length 5 centered on residue i covers residues i-2, i-1, i, i+1 and i+2). Thus, for a window of odd length w = 2k+1, the final "average" hydrophobicity assigned to residue i will be:

\[ H(i) = \frac{1}{2k+1} \sum_{j=i-k}^{i+k} h(j) \]

where h(j) is the hydrophobicity of residue j given by any of the standard scales.

The Kyte-Doolittle hydrophobicity scale:
%hyd =('A' => 0.100,
      'C' => -1.420,
      'D' => 0.780,
      'E' => 0.830,
      'F' => -2.120,
      'G' => 0.330,
      'H' => -0.500,
      'I' => -1.130,
      'K' => 1.400,
      'L' => -1.180,
      'M' => -1.590,
      'N' => 0.480,
      'P' => 0.730,
      'Q' => 0.950,
      'R' => 1.910,
      'S' => 0.520,
      'T' => 0.070,
      'V' => -1.270,
      'W' => -0.510,
      'Y' => -0.210
      );
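
The windowing step described above can be sketched roughly as follows. This is only one possible layout, not a reference solution: the subroutine name window_means, the example sequence and the choice to stop on residues missing from the scale are all assumptions made for illustration. For even window sizes there is no true middle residue, so the sketch simply assigns the mean to the residue at offset int(w/2) within the window. Note that with the values listed above the hydrophobic residues (F, I, L, V, ...) are negative, so candidate membrane-spanning stretches show up as minima in the profile.

```perl
use strict;
use warnings;

# Same values as the scale listed above, repeated so the sketch runs on its own.
my %hyd = (
    A =>  0.100, C => -1.420, D =>  0.780, E =>  0.830, F => -2.120,
    G =>  0.330, H => -0.500, I => -1.130, K =>  1.400, L => -1.180,
    M => -1.590, N =>  0.480, P =>  0.730, Q =>  0.950, R =>  1.910,
    S =>  0.520, T =>  0.070, V => -1.270, W => -0.510, Y => -0.210,
);

# Hypothetical helper: returns one [position, residue, mean] triple per window.
# Positions are 1-based; residues closer to either end than half a window
# get no value, because no full window can be centered on them.
sub window_means {
    my ($seq, $w, $scale) = @_;
    my @res = split //, uc $seq;
    my $n   = @res;
    my @profile;
    return @profile if $w < 1 || $w > $n;

    my $half = int($w / 2);    # offset of the "middle" residue (assumption for even w)
    for my $start (0 .. $n - $w) {
        my $sum = 0;
        for my $j ($start .. $start + $w - 1) {
            my $aa = $res[$j];
            # Assumption: stop on residues the scale does not cover (e.g. X, B, Z).
            die "No hydrophobicity value for residue '$aa'\n"
                unless exists $scale->{$aa};
            $sum += $scale->{$aa};
        }
        push @profile, [ $start + $half + 1, $res[ $start + $half ], $sum / $w ];
    }
    return @profile;
}

# Usage with a made-up sequence and a window of 5 residues.
my $seq = 'MKTLLVAGIFLLAWRKDE';
for my $row (window_means($seq, 5, \%hyd)) {
    printf "%d\t%s\t%.3f\n", @$row;
}
```

The tab-separated output (position, residue, mean) can be pasted directly into a spreadsheet or a plotting program. Recomputing every window from scratch is wasteful for long sequences, but it keeps the code close to the formula; a running sum would be the obvious optimization.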

Hints:
  • You can also try any other hydrophobicity scale.
  • You can also extend the program to read the observed transmembrane segments from the SwissProt file (as described in the previous practical).
  • Compare your results with the corresponding tool provided by the Swiss Institute of Bioinformatics (ProtScale) and with the Membrane Protein Explorer tool (MPEx).
  • For plotting the results you can use your favorite statistical analysis program, or alternatively you can use the Google Charts API directly from your browser (it is more convenient to use the chart editor).
  • A file containing transmembrane proteins is available to use as input.
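
For the hint about reading the observed transmembrane segments, a minimal sketch is given below. It assumes the annotation appears on FT lines with the TRANSMEM key, either in the older column layout (start and end separated by spaces) or in the newer "start..end" layout; the subroutine name transmem_segments and the example lines are made up for illustration, and uncertain positions (marked with "<", ">" or "?") are simply skipped. The approach from the previous practical may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the lines of one SwissProt entry and returns
# a list of [start, end] pairs, one per annotated transmembrane segment.
sub transmem_segments {
    my (@lines) = @_;
    my @segments;
    for my $line (@lines) {
        # Older layout: "FT   TRANSMEM     12     34       Helical."
        # Newer layout: "FT   TRANSMEM        12..34"
        if ($line =~ /^FT\s+TRANSMEM\s+(\d+)(?:\.\.|\s+)(\d+)/) {
            push @segments, [ $1, $2 ];
        }
    }
    return @segments;
}

# Made-up feature lines, not taken from a real entry.
my @entry = (
    'FT   TOPO_DOM      1     11       Cytoplasmic.',
    'FT   TRANSMEM     12     34       Helical.',
    'FT   TRANSMEM        50..72',
);
printf "%d-%d\n", @$_ for transmem_segments(@entry);
# prints:
# 12-34
# 50-72
```

The returned positions are 1-based, so they can be compared directly with the residue positions of the hydrophobicity profile, for example by marking the annotated segments on the same plot.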