Coding with DNA

This post was published 8 years, 1 month ago. Due to the rapidly evolving world of technology, some concepts may no longer be applicable.

Alright, this one isn’t quite as exciting as the title suggests. I needed a quick script for a bit of biology, specifically genetics. Pretty simple stuff really, but I thought I would post it anyway.

Generating random sequences of DNA

function randdna($len){
    // Build a random DNA string of $len bases, each drawn uniformly from A, C, G and T.
    $length=intval($len);
    $bases=array('A','C','G','T');
    $dna="";
    for ($i=0; $i<$length; $i++){
        $dna .= $bases[mt_rand(0,3)];
    }
    return $dna;
}

Generating an mRNA sequence from a DNA sequence (transcribing)

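function transcribe($dna){
    // Normalise the input: upper-case it and drop anything that is not a DNA base.
    $dna=strtoupper($dna);
    $dna=preg_replace("/[^ACGT]/", "", $dna);
    // Pair each DNA base with its complementary RNA base (A pairs with U, not T).
    $rna=strtr($dna,array('A'=>"U",'C'=>"G",'G'=>"C",'T'=>"A"));
    return $rna;
}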
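As a quick check of the mapping, here is a minimal, self-contained sketch that applies the same base-for-base substitution to a short made-up sequence. The helper name `complement_to_rna` and the sample string are illustrative only, and the sketch assumes, as `transcribe` does, that the input is the template strand written in the order it is read.

```php
<?php
// Illustrative helper (not part of the original script): same substitution as transcribe().
function complement_to_rna(string $dna): string {
    // strtr() with an array replaces each base exactly once,
    // so 'C' => 'G' is not undone afterwards by 'G' => 'C'.
    return strtr(strtoupper($dna), array('A' => 'U', 'C' => 'G', 'G' => 'C', 'T' => 'A'));
}

echo complement_to_rna("tacggcatt"), "\n"; // prints AUGCCGUAA
```

The result reads as the codons AUG, CCG and UAA, that is a start codon, one proline codon and a stop codon.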

Generating a protein sequence from an mRNA sequence (translation)

function translate($rna){
    // Codon table: three-letter amino acid codes, each followed by a space.
    // The stop codons UAA, UAG and UGA carry the labels "chr", "mbe" and "pal"
    // (apparently short for ochre, amber and opal).
    $trans=array(
        "UUU"=>"phe ","UUC"=>"phe ","UUA"=>"leu ","UUG"=>"leu ","CUU"=>"leu ","CUC"=>"leu ","CUA"=>"leu ","CUG"=>"leu ",
        "AUU"=>"ile ","AUC"=>"ile ","AUA"=>"ile ","AUG"=>"met ","GUU"=>"val ","GUC"=>"val ","GUA"=>"val ","GUG"=>"val ",
        "UCU"=>"ser ","UCC"=>"ser ","UCA"=>"ser ","UCG"=>"ser ","CCU"=>"pro ","CCC"=>"pro ","CCA"=>"pro ","CCG"=>"pro ",
        "ACU"=>"thr ","ACC"=>"thr ","ACA"=>"thr ","ACG"=>"thr ","GCU"=>"ala ","GCC"=>"ala ","GCA"=>"ala ","GCG"=>"ala ",
        "UAU"=>"tyr ","UAC"=>"tyr ","UAA"=>"chr ","UAG"=>"mbe ","CAU"=>"his ","CAC"=>"his ","CAA"=>"gln ","CAG"=>"gln ",
        "AAU"=>"asn ","AAC"=>"asn ","AAA"=>"lys ","AAG"=>"lys ","GAU"=>"asp ","GAC"=>"asp ","GAA"=>"glu ","GAG"=>"glu ",
        "UGU"=>"cys ","UGC"=>"cys ","UGA"=>"pal ","UGG"=>"trp ","CGU"=>"arg ","CGC"=>"arg ","CGA"=>"arg ","CGG"=>"arg ",
        "AGU"=>"ser ","AGC"=>"ser ","AGA"=>"arg ","AGG"=>"arg ","GGU"=>"gly ","GGC"=>"gly ","GGA"=>"gly ","GGG"=>"gly "
    );

    $rna=strtoupper($rna);
    $rna=preg_replace("/[^ACGU]/", "", $rna);

    // Find the start codon; without one there is nothing to translate.
    $start=strpos($rna,"AUG");
    if ($start===false){return "";}

    // Find the nearest stop codon that lies in the same reading frame as the start codon.
    $end=strlen($rna);
    foreach(array("UAA", "UGA", "UAG") as $bp){
        $pos=strpos($rna,$bp,$start);
        while ($pos!==false && ($pos-$start)%3!=0){
            $pos=strpos($rna,$bp,$pos+1);
        }
        if ($pos!==false && $pos<$end){$end=$pos;}
    }

    // Keep whole codons only, in case no stop codon was found.
    $gene=substr($rna,$start,intval(($end-$start)/3)*3);
    echo "Coding Sequence: $gene";

    $prot=strtr($gene,$trans);
    return $prot;
}

Not quite the most elegant code, but it should be fairly accurate. The last function (translate) finds the start codon, determines the reading frame, and locates the first stop codon in that frame.
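Those three steps can also be sketched as a plain codon-by-codon walk, which makes the reading frame explicit. This is a minimal sketch rather than the function above: the name `find_coding_sequence` is hypothetical, and it assumes the input is already clean, upper-case mRNA.

```php
<?php
// Hypothetical helper: returns the codons from the first AUG up to
// (but not including) the first in-frame stop codon.
function find_coding_sequence(string $rna): string {
    $start = strpos($rna, "AUG");
    if ($start === false) {
        return ""; // no start codon, so no coding sequence
    }
    $stops = array("UAA", "UAG", "UGA");
    $gene = "";
    // Stepping by 3 from the start codon fixes the reading frame.
    for ($i = $start; $i + 3 <= strlen($rna); $i += 3) {
        $codon = substr($rna, $i, 3);
        if (in_array($codon, $stops, true)) {
            break; // the first stop codon in frame ends the gene
        }
        $gene .= $codon;
    }
    return $gene;
}

// The UGA that overlaps the start codon is out of frame and is skipped.
echo find_coding_sequence("GGAUGAAACCGUAGCC"), "\n"; // prints AUGAAACCG
```

Walking in steps of three trades the repeated `strpos` searches for a single pass, and any trailing bases that do not form a whole codon are simply ignored.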
