<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Avoiding duplication when creating new variables in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Avoiding-duplication-when-creating-new-variables/m-p/866207#M342076</link>
    <description>To clarify, if an observation had the value n3 under multiple variables, would my code count that twice?</description>
    <pubDate>Fri, 24 Mar 2023 18:04:53 GMT</pubDate>
    <dc:creator>Beanpot</dc:creator>
    <dc:date>2023-03-24T18:04:53Z</dc:date>
    <item>
      <title>Avoiding duplication when creating new variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Avoiding-duplication-when-creating-new-variables/m-p/866206#M342075</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have a dataset with multiple variables, and the same value can appear under different variables. For example:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;TABLE border="1"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;Obs&lt;/TD&gt;&lt;TD&gt;Var1&lt;/TD&gt;&lt;TD&gt;Var2&lt;/TD&gt;&lt;TD&gt;Var3&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;n1&lt;/TD&gt;&lt;TD&gt;n2&lt;/TD&gt;&lt;TD&gt;n3&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;n3&lt;/TD&gt;&lt;TD&gt;n4&lt;/TD&gt;&lt;TD&gt;.&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;n5&lt;/TD&gt;&lt;TD&gt;n1&lt;/TD&gt;&lt;TD&gt;.&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;To determine the frequency of each value, I'm creating a new variable for each value using the following code:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;data want;
set have;
if whichc('n3',of var_1-var_20) then n3=1;
else n3=0;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The problem I'm encountering is that when I check the frequencies of the variables I create against the QC table, my counts are off by a small percentage. For example, the QC table might say the total number of observations with value n3 is 10,000, and when I run PROC FREQ on the new variable I get 10,020.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Could the code be double counting? If so, I'm not sure how to fix it. Are there other reasons the counts could be off?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Fri, 24 Mar 2023 18:00:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Avoiding-duplication-when-creating-new-variables/m-p/866206#M342075</guid>
      <dc:creator>Beanpot</dc:creator>
      <dc:date>2023-03-24T18:00:33Z</dc:date>
    </item>
    <item>
      <title>Re: Avoiding duplication when creating new variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Avoiding-duplication-when-creating-new-variables/m-p/866207#M342076</link>
      <description>To clarify, if an observation had the value n3 under multiple variables, would my code count that twice?</description>
      <pubDate>Fri, 24 Mar 2023 18:04:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Avoiding-duplication-when-creating-new-variables/m-p/866207#M342076</guid>
      <dc:creator>Beanpot</dc:creator>
      <dc:date>2023-03-24T18:04:53Z</dc:date>
    </item>
    <item>
      <title>Re: Avoiding duplication when creating new variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Avoiding-duplication-when-creating-new-variables/m-p/866212#M342078</link>
      <description>&lt;P&gt;Your code using WHICHC would count it once per record, not once per variable.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A careful examination of the results from your code would confirm this.&lt;/P&gt;</description>
      <pubDate>Fri, 24 Mar 2023 18:29:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Avoiding-duplication-when-creating-new-variables/m-p/866212#M342078</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2023-03-24T18:29:20Z</dc:date>
    </item>
    <item>
      <title>Re: Avoiding duplication when creating new variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Avoiding-duplication-when-creating-new-variables/m-p/866219#M342084</link>
      <description>&lt;P&gt;Your posted code is checking if ANY of the 20 variables is exactly equal to the string 'n3'.&amp;nbsp; It will produce the same result when all 20 of them have n3 as it will when only one of them does.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Do any of the values have leading spaces, or contain other invisible characters such as TAB, LF, CR, FF, or a non-breaking space ('090A0D0CA0'x)?&amp;nbsp; For WHICHC to find a match, the values must match exactly.&lt;/P&gt;
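&lt;P&gt;A minimal sketch of one way to test for that, assuming the variables are character, are named var_1-var_20 as in your posted code, and the session uses a single-byte encoding (so that 'A0'x is a non-breaking space). It removes those characters before comparing. The names want_clean, v and i are only illustrative.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want_clean;
  set have;
  /* Assumption: var_1-var_20 already exist in HAVE as character variables */
  array v {*} var_1-var_20;
  n3 = 0;
  do i = 1 to dim(v);
    /* COMPRESS removes TAB, LF, CR, FF and non-breaking space; */
    /* STRIP then removes leading and trailing blanks            */
    if strip(compress(v{i}, '090A0D0CA0'x)) = 'n3' then n3 = 1;
  end;
  drop i;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If the n3 counts from this step differ from the counts your WHICHC step produces, stray characters are affecting the match.&lt;/P&gt;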
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 24 Mar 2023 19:24:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Avoiding-duplication-when-creating-new-variables/m-p/866219#M342084</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2023-03-24T19:24:35Z</dc:date>
    </item>
    <item>
      <title>Re: Avoiding duplication when creating new variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Avoiding-duplication-when-creating-new-variables/m-p/866225#M342085</link>
      <description>&lt;P&gt;As an alternative approach, you could transpose your data into a vertical format, de-duplicate it to avoid double-counting, and then run PROC FREQ.&amp;nbsp; Something like:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have ;
  input id Var1 : $2. Var2 : $2. Var3 : $2. ;
  cards ;
1 n1 n2 n3
2 n3 n4 .
3 n5 n1 .
4 n1 n2 n2
;
run ;

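/* Transpose to one row per id and variable; the values land in COL1 */
proc transpose data=have out=vert ;
  var _character_ ;
  by id ;
run ;

/* Keep one row per id and value, so a value repeated within an id counts once */
proc sort nodupkey data=vert out=vert2 ;
  by id col1 ;
run ;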

proc freq data=vert2 ;
  tables col1 ;
run ;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 24 Mar 2023 19:50:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Avoiding-duplication-when-creating-new-variables/m-p/866225#M342085</guid>
      <dc:creator>Quentin</dc:creator>
      <dc:date>2023-03-24T19:50:26Z</dc:date>
    </item>
  </channel>
</rss>

