<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Find some sequence in DNA in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Find-some-sequence-in-DNA/m-p/331935#M74673</link>
    <description>YES</description>
    <pubDate>Sun, 12 Feb 2017 08:03:33 GMT</pubDate>
    <dc:creator>LauChiFung</dc:creator>
    <dc:date>2017-02-12T08:03:33Z</dc:date>
    <item>
      <title>Find some sequence in DNA</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Find-some-sequence-in-DNA/m-p/331779#M74609</link>
      <description>&lt;P&gt;If I have 10000 observations of DNA sequence data, e.g.:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1.G&lt;/P&gt;&lt;P&gt;2.C&lt;/P&gt;&lt;P&gt;3.A&lt;/P&gt;&lt;P&gt;4.C&lt;/P&gt;&lt;P&gt;....&lt;/P&gt;&lt;P&gt;....&lt;/P&gt;&lt;P&gt;How can I find the number of GCC patterns in this dataset?&lt;/P&gt;</description>
      <pubDate>Sat, 11 Feb 2017 05:57:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Find-some-sequence-in-DNA/m-p/331779#M74609</guid>
      <dc:creator>LauChiFung</dc:creator>
      <dc:date>2017-02-11T05:57:35Z</dc:date>
    </item>
    <item>
      <title>Re: Find some sequence in DNA</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Find-some-sequence-in-DNA/m-p/331791#M74617</link>
      <description>&lt;P&gt;Just to clarify, your data looks something like this right?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data DNA;
   input ID seq $;
   datalines;
   1 G
   2 C
   3 A 
   4 C
   ;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;... and so on. Then you want the observations where a seq value of G is followed by a C and then another C, right? &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 11 Feb 2017 09:38:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Find-some-sequence-in-DNA/m-p/331791#M74617</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2017-02-11T09:38:41Z</dc:date>
    </item>
    <item>
      <title>Re: Find some sequence in DNA</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Find-some-sequence-in-DNA/m-p/331795#M74620</link>
      <description>&lt;P&gt;The way to deal with your query depends on your data file type and format.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1) Assuming your data is a flat file with one code per record, in column 1:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;filename DNA '...path and filename';
data want;
      infile DNA truncover;
      length c1-c3 $1 ;
      retain c1-c3 ' ' ;
      input   na $1.;   /* one code per record, read from column 1 */
      link check;
      keep pos c1-c3;
return;
check:
    /* slide the 3-letter window so that it ends at the current record */
    c1=c2;
    c2=c3;
    c3=na;
    pos = _N_-2 ;  /* position of 1st NA = G */
    if c1||c2||c3 = 'GCC'
      then output;
return;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;2) &amp;nbsp;Similarly, if the data is a SAS dataset, then the code could look like this,&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;assuming that NA is the variable holding the nucleic acid code:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
      set have;
      length c1-c3 $1 ;
      retain c1-c3 ' ' ;
      link check;
      keep pos c1-c3;
return;
check:
    /* slide the 3-letter window so that it ends at the current observation */
    c1=c2; c2=c3; c3=na;
    pos = _N_-2;  /* position of 1st NA = G */
    if c1||c2||c3 = 'GCC' then output;
return;
run;&lt;/CODE&gt;&lt;/PRE&gt;
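&lt;P&gt;If only the number of matches is needed and the whole sequence fits into one character variable (at most 32767 codes), a different technique from the window logic above might be to glue the codes into a single string and let the COUNT function do the search. This is only a sketch: the test dataset HAVE and the variable names DNA and N_GCC are made up for illustration. Blank codes would be dropped by CATS, which would join codes that were not adjacent in the data.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* hypothetical test data: one code per observation in variable NA */
data have;
      input na $1.;
      datalines;
G
C
C
A
G
C
C
T
;

data _null_;
      set have end=eof;
      length dna $32767;        /* maximum length of a character variable */
      retain dna ' ';
      dna = cats(dna, na);      /* append the current code to the string */
      if eof then do;
         n_gcc = count(dna, 'GCC');   /* number of times GCC occurs */
         put n_gcc=;            /* writes n_gcc=2 to the log for this test data */
      end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;COUNT does not count overlapping matches, which is harmless here because GCC cannot overlap with itself. For a pattern such as CCC the window logic would be the safer choice.&lt;/P&gt;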
</description>
      <pubDate>Sat, 11 Feb 2017 10:48:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Find-some-sequence-in-DNA/m-p/331795#M74620</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2017-02-11T10:48:34Z</dc:date>
    </item>
    <item>
      <title>Re: Find some sequence in DNA</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Find-some-sequence-in-DNA/m-p/331806#M74624</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  retain n_gcc 0;
  set dna end=eod;

  if lag2(seq)='G' and lag(seq)='C' and seq='C' then n_gcc+1;
  if eod then put n_gcc=;
run;&lt;/CODE&gt;&lt;/PRE&gt;
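&lt;P&gt;To try the idea on a small case, a sketch like the one below may help. The seven test observations and the helper variables PREV1 and PREV2 are made up for illustration. In this variant LAG and LAG2 are called in plain assignment statements, so their queues are updated on every observation no matter how the IF condition is evaluated.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* hypothetical test data containing GCC twice (IDs 1-3 and 5-7) */
data dna;
  input ID seq $;
  datalines;
1 G
2 C
3 C
4 A
5 G
6 C
7 C
;

data _null_;
  set dna end=eod;
  prev1 = lag(seq);    /* value of seq one observation back  */
  prev2 = lag2(seq);   /* value of seq two observations back */
  if prev2='G' and prev1='C' and seq='C' then n_gcc+1;
  if eod then put n_gcc=;   /* writes n_gcc=2 to the log for this test data */
run;&lt;/CODE&gt;&lt;/PRE&gt;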
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 11 Feb 2017 13:29:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Find-some-sequence-in-DNA/m-p/331806#M74624</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2017-02-11T13:29:30Z</dc:date>
    </item>
    <item>
      <title>Re: Find some sequence in DNA</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Find-some-sequence-in-DNA/m-p/331887#M74662</link>
      <description>Thank you all so much.</description>
      <pubDate>Sat, 11 Feb 2017 23:35:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Find-some-sequence-in-DNA/m-p/331887#M74662</guid>
      <dc:creator>LauChiFung</dc:creator>
      <dc:date>2017-02-11T23:35:16Z</dc:date>
    </item>
    <item>
      <title>Re: Find some sequence in DNA</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Find-some-sequence-in-DNA/m-p/331935#M74673</link>
      <description>YES</description>
      <pubDate>Sun, 12 Feb 2017 08:03:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Find-some-sequence-in-DNA/m-p/331935#M74673</guid>
      <dc:creator>LauChiFung</dc:creator>
      <dc:date>2017-02-12T08:03:33Z</dc:date>
    </item>
  </channel>
</rss>

