<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: counting distinct values over a range of variables in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327833#M73162</link>
    <description>&lt;P&gt;&amp;lt;nods enthusiastically&amp;gt;&lt;/P&gt;</description>
    <pubDate>Thu, 26 Jan 2017 19:55:31 GMT</pubDate>
    <dc:creator>LaurieF</dc:creator>
    <dc:date>2017-01-26T19:55:31Z</dc:date>
    <item>
      <title>counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327817#M73154</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;suppsoe I have the following data:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;TABLE width="320"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="64"&gt;var1&lt;/TD&gt;
&lt;TD width="64"&gt;var2&lt;/TD&gt;
&lt;TD width="64"&gt;var3&lt;/TD&gt;
&lt;TD width="64"&gt;var4&lt;/TD&gt;
&lt;TD width="64"&gt;var5&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;abc&lt;/TD&gt;
&lt;TD&gt;d45g&lt;/TD&gt;
&lt;TD&gt;abc&lt;/TD&gt;
&lt;TD&gt;1&lt;/TD&gt;
&lt;TD&gt;2&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;a&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;c&lt;/TD&gt;
&lt;TD&gt;rrrrrr&lt;/TD&gt;
&lt;TD&gt;g9&lt;/TD&gt;
&lt;TD&gt;g9&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;var1 - var3 is one range, and var4 - var5 is another.&lt;/P&gt;
&lt;P&gt;For each row, I would like to count the number of distinct values for each range of variables. So for the above case, here is the count:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;TABLE style="border-collapse: collapse; width: 96pt;" border="0" width="128" cellspacing="0" cellpadding="0"&gt;&lt;COLGROUP&gt;&lt;COL style="width: 48pt;" span="2" width="64" /&gt; &lt;/COLGROUP&gt;
&lt;TBODY&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD width="64" height="20" style="height: 15.0pt; width: 48pt;"&gt;var1-var3&lt;/TD&gt;
&lt;TD width="64" style="width: 48pt;"&gt;var4-var5&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" align="right" style="height: 15.0pt;"&gt;2&lt;/TD&gt;
&lt;TD align="right"&gt;2&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" align="right" style="height: 15.0pt;"&gt;3&lt;/TD&gt;
&lt;TD align="right"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 19:04:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327817#M73154</guid>
      <dc:creator>ilikesas</dc:creator>
      <dc:date>2017-01-26T19:04:02Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327829#M73158</link>
      <description>&lt;P&gt;Here is a brute force approach:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data want;
  set have;
  array v1(*) $ var1-var3;
  array v2(*) $ var4-var5;
  call sortc(of v1(*));
  do i=1 to dim(v1);
    if not missing(v1(i)) then do;
      if i lt dim(v1) then do;
        if v1(i)ne v1(i+1) then var1_var3=sum(var1_var3,1);
      end;
      else var1_var3=sum(var1_var3,1);
    end;
  end;
  call sortc(of v2(*));
  do i=1 to dim(v2);
    if not missing(v2(i)) then do;
      if i lt dim(v2) then do;
        if v2(i)ne v2(i+1) then var4_var5=sum(var4_var5,1);
      end;
      else var4_var5=sum(var4_var5,1);
    end;
  end;
run;
&lt;/PRE&gt;
&lt;P&gt;HTH,&lt;/P&gt;
&lt;P&gt;Art, CEO, AnalystFinder.com&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 19:45:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327829#M73158</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2017-01-26T19:45:59Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327831#M73160</link>
      <description>&lt;P&gt;Can you please provide more data? My initial generic response (using&amp;nbsp;&lt;EM&gt;proc transpose&lt;/EM&gt;) may or may not be viable, depending on the number of observations.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If the source data are small enough,&amp;nbsp;&lt;EM&gt;transpose&lt;/EM&gt; would be quite good. If it's over 32k observations, it won't!&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 19:49:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327831#M73160</guid>
      <dc:creator>LaurieF</dc:creator>
      <dc:date>2017-01-26T19:49:44Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327832#M73161</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/17429"&gt;@LaurieF&lt;/a&gt; wrote:&lt;BR /&gt;
&lt;P&gt;Can you please provide more data? My initial generic response (using&amp;nbsp;&lt;EM&gt;proc transpose&lt;/EM&gt;) may or may not be viable, depending on the number of observations.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If the source data are small enough,&amp;nbsp;&lt;EM&gt;transpose&lt;/EM&gt; would be quite good. If it's over 32k observations, it won't!&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/17429"&gt;@LaurieF&lt;/a&gt;&amp;nbsp;this is the second or third time &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12982"&gt;@ilikesas&lt;/a&gt;&amp;nbsp;has asked a similar question.&amp;nbsp; If &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12982"&gt;@ilikesas&lt;/a&gt;&amp;nbsp;would normalized their data they could easily deal with this type of problem and more, without resorting to ARRAYs and other clunky data step code.&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 19:55:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327832#M73161</guid>
      <dc:creator>data_null__</dc:creator>
      <dc:date>2017-01-26T19:55:17Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327833#M73162</link>
      <description>&lt;P&gt;&amp;lt;nods enthusiastically&amp;gt;&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 19:55:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327833#M73162</guid>
      <dc:creator>LaurieF</dc:creator>
      <dc:date>2017-01-26T19:55:31Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327836#M73163</link>
      <description>&lt;P&gt;Hi data_null_,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;could you please elaborate more about what you mean by "normalizing" the data? It is indeed true that arrays and do loops tend to be messy and complicated (at least for me)...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks!&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 20:22:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327836#M73163</guid>
      <dc:creator>ilikesas</dc:creator>
      <dc:date>2017-01-26T20:22:37Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327839#M73164</link>
      <description>&lt;P&gt;The long answer is:&amp;nbsp;&lt;A href="https://en.wikipedia.org/wiki/Database_normalization" target="_self"&gt;Wikipedia Data Normalisation&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The short answer: SAS works inherently better when it's processing very long, very narrow tables. One of the primary functions of&amp;nbsp;&lt;EM&gt;proc transpose&lt;/EM&gt;, for example, is to rotate data so that columns become rows and&amp;nbsp;&lt;EM&gt;vice versa&lt;/EM&gt;. That way you can do queries over columns, which&amp;nbsp;&lt;EM&gt;proc sql&lt;/EM&gt; naturally supports.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can, of course, use data steps to walk along arrays, and that can work very well. But if you can manage to pre-process your data so that you can avoid that, the long term solution will tend to be more effective.&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 20:42:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327839#M73164</guid>
      <dc:creator>LaurieF</dc:creator>
      <dc:date>2017-01-26T20:42:54Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327840#M73165</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12982"&gt;@ilikesas&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Without a little more feel for your data its hard to say what normalization you need to do.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you had survey data, for example with each survey question a variable (column) and each survey responder an observation (row) like:&lt;/P&gt;
&lt;TABLE width="314"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="90"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD width="96"&gt;ques1&lt;/TD&gt;
&lt;TD width="64"&gt;ques2&lt;/TD&gt;
&lt;TD width="64"&gt;ques3&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;responder1&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;responser2&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;responder3&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;that would be a pretty typical format in my experience.&amp;nbsp; For me (and I think for many) this sort of layout lends itself to differences between responders over a particular question easier than to differences in questions for a particular responder, although both are possible.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I think the suggestion is that perhaps a look at the data/database design/scheme and what you want to do with it could save you some work in the long run.&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 20:43:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327840#M73165</guid>
      <dc:creator>HB</dc:creator>
      <dc:date>2017-01-26T20:43:05Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327842#M73167</link>
      <description>&lt;P&gt;Hi LaurieF,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;here is some data that I prepared:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input var1 $4. var2 $4. var3 $4. var4 $4. var5 $4.;
datalines;
abc abc def g55 g56
abc def ttt hhh hhh
ddd ddd ddd ddd ddd
111 111 112 111 111
a   a   a   a   a  
ab  ab  ac  ac  ac 
;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;I guess that in order to transpose more datalines are needed than variables. Here I also made al variables to be of character type and length 4, but I guess that this specification is not very important when data is imported from an external source such as Excel.&lt;/P&gt;
&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 20:47:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327842#M73167</guid>
      <dc:creator>ilikesas</dc:creator>
      <dc:date>2017-01-26T20:47:20Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327843#M73168</link>
      <description>&lt;P&gt;Again, that's all very well, but it&amp;nbsp;still doesn't define the problem very well: it's all a bit theoretical. Do you have some real-world data to play with?&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 20:49:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327843#M73168</guid>
      <dc:creator>LaurieF</dc:creator>
      <dc:date>2017-01-26T20:49:26Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327844#M73169</link>
      <description>&lt;P&gt;Hi HB,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;thanks for the insight. In my case my data is code names (such as abc, gf4). Could this be fine?&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 20:50:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327844#M73169</guid>
      <dc:creator>ilikesas</dc:creator>
      <dc:date>2017-01-26T20:50:21Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327845#M73170</link>
      <description>&lt;P&gt;Others pointed out already that you probably should structure your data differently.&lt;/P&gt;
&lt;P&gt;It's a bit of an "overkill" but should you have to do this for many variable combinations with your current data structure then the following would be an option.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
  input (var1 var2 var3) ($) var4 var5;
  datalines;
abc d45g abc 1 2
a c 9 9 9
;
run;

%macro count_dist_vals(invars,outvar);
  %local _num;
  %let _num=%sysfunc(ceil(%sysevalf(%sysfunc(ranuni(0))*10000000000)),z10.);
  if _n_=1 then
    do;
      length _tmp_var_12345 $32767;
      dcl hash h_&amp;amp;_num.(multidata:'n', hashexp:2);
      h_&amp;amp;_num..defineKey('_tmp_var_12345');
      h_&amp;amp;_num..defineDone();
    end;
    %do i= 1 %to %sysfunc(countw(&amp;amp;invars));
      _tmp_var_12345=catt(%scan(&amp;amp;invars,&amp;amp;i));
      h_&amp;amp;_num..ref();
    %end;
    &amp;amp;outvar=h_&amp;amp;_num..num_items;
    h_&amp;amp;_num..clear();
    drop _tmp_var_12345 ;
%mend;

data want;
  set have;
  %count_dist_vals(var1 var2 var3, n_dist_vals_1)
  %count_dist_vals(var3 var4 var5, n_dist_vals_2)
run;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 26 Jan 2017 20:51:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327845#M73170</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2017-01-26T20:51:58Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327847#M73172</link>
      <description>&lt;P&gt;actually I don't have real data, but I am rather thinking of building a data which will consist of names or name codes (such as abc, gf5 etc.)&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 20:53:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327847#M73172</guid>
      <dc:creator>ilikesas</dc:creator>
      <dc:date>2017-01-26T20:53:09Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327849#M73173</link>
      <description>&lt;P&gt;I'd recommend, then, constructing your data so that you can avoid using arrays, and store the values in observations instead. You're making a rod for your own back otherwise.&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 20:54:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327849#M73173</guid>
      <dc:creator>LaurieF</dc:creator>
      <dc:date>2017-01-26T20:54:50Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327850#M73174</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12982"&gt;@ilikesas&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;It's a bit hard to give you advice without any context and without knowing how you plan to use this data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;On a generic level: You could go for a key/value pair construct with one column holding the name of your variables and a second column holding the value. With such a structure it's then quite simple to implement distinct counts. You could even use such data to construct SAS Formats (via Proc Format, cntlin).&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 21:04:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327850#M73174</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2017-01-26T21:04:52Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327854#M73175</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12982"&gt;@ilikesas&lt;/a&gt; wrote:&lt;BR /&gt;
&lt;P&gt;Hi LaurieF,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;here is some data that I prepared:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input var1 $4. var2 $4. var3 $4. var4 $4. var5 $4.;
datalines;
abc abc def g55 g56
abc def ttt hhh hhh
ddd ddd ddd ddd ddd
111 111 112 111 111
a   a   a   a   a  
ab  ab  ac  ac  ac 
;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;I guess that in order to transpose more datalines are needed than variables. Here I also made al variables to be of character type and length 4, but I guess that this specification is not very important when data is imported from an external source such as Excel.&lt;/P&gt;
&lt;P&gt;Thanks!&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For your first example if your data were normalized this is the code you could use to count distinct values per group.&amp;nbsp; It will do any number of groups.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;ods select none;
ods output nlevels=nlevels;
proc freq nlevels;
   by id group;
   table value / noprint;
   run;
ods select all;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;IMG title="Capture.PNG" alt="Capture.PNG" src="https://communities.sas.com/t5/image/serverpage/image-id/6916i8048DD997E6F5982/image-size/original?v=1.0&amp;amp;px=-1" border="0" /&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 21:19:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327854#M73175</guid>
      <dc:creator>data_null__</dc:creator>
      <dc:date>2017-01-26T21:19:37Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327857#M73177</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12982"&gt;@ilikesas&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It's hard to know what abc and gf5 represent or what the desired outcome is or anything at all actually so it's hard to give advice.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Let's assume we have a number of restaurants and a number of restaurant health examinations (not the same exam conducted on multiple dates but different exam types) and further that each restaurant undergoes one of the examtypes on each given testdate. One could structure the data to look like&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;TABLE width="366"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="117"&gt;restaurantid&lt;/TD&gt;
&lt;TD width="83"&gt;testdate1&lt;/TD&gt;
&lt;TD width="83"&gt;testdate2&lt;/TD&gt;
&lt;TD width="83"&gt;testdate3&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;restaurant1&lt;/TD&gt;
&lt;TD&gt;examtype2&lt;/TD&gt;
&lt;TD&gt;examtype2&lt;/TD&gt;
&lt;TD&gt;examtype1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;restaurant2&lt;/TD&gt;
&lt;TD&gt;examtype3&lt;/TD&gt;
&lt;TD&gt;examtype1&lt;/TD&gt;
&lt;TD&gt;examtype2&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;restaurant3&lt;/TD&gt;
&lt;TD&gt;examtype2&lt;/TD&gt;
&lt;TD&gt;examtype1&lt;/TD&gt;
&lt;TD&gt;examtype1&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;From this you could tell what restaurants underwent what kind of exam on testdate2, and whether restaurant2 has over the three exam dates completed all exam types, but it is a little cumbersome. Plus you have to add a new column for every new test date.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If the data looked like&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;TABLE width="283"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="117"&gt;restaurantid&lt;/TD&gt;
&lt;TD width="83"&gt;testdates&lt;/TD&gt;
&lt;TD width="83"&gt;examtype&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;restaurant1&lt;/TD&gt;
&lt;TD&gt;testdate1&lt;/TD&gt;
&lt;TD&gt;2&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;restaurant1&lt;/TD&gt;
&lt;TD&gt;testdate2&lt;/TD&gt;
&lt;TD&gt;2&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;restaurant1&lt;/TD&gt;
&lt;TD&gt;testdate3&lt;/TD&gt;
&lt;TD&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;restaurant2&lt;/TD&gt;
&lt;TD&gt;testdate1&lt;/TD&gt;
&lt;TD&gt;3&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;restaurant2&lt;/TD&gt;
&lt;TD&gt;testdate2&lt;/TD&gt;
&lt;TD&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;restaurant2&lt;/TD&gt;
&lt;TD&gt;testdate3&lt;/TD&gt;
&lt;TD&gt;2&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;restaurant3&lt;/TD&gt;
&lt;TD&gt;testdate1&lt;/TD&gt;
&lt;TD&gt;2&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;restaurant3&lt;/TD&gt;
&lt;TD&gt;testdate2&lt;/TD&gt;
&lt;TD&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;restaurant3&lt;/TD&gt;
&lt;TD&gt;testdate3&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;1&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can also tell what exams were given on any given date and you can tell whether a particular restaurant has completed all the required exams and its easier and you don't have to change the table structure for an additional test date.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/17429"&gt;@LaurieF&lt;/a&gt; says, "it's all a bit theoretical" at this point.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 21:24:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327857#M73177</guid>
      <dc:creator>HB</dc:creator>
      <dc:date>2017-01-26T21:24:50Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327858#M73178</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13711"&gt;@art297&lt;/a&gt; wrote:&lt;BR /&gt;
&lt;P&gt;Here is a brute force approach:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data want;
  set have;
  array v1(*) $ var1-var3;
  array v2(*) $ var4-var5;
  call sortc(of v1(*));
  do i=1 to dim(v1);
    if not missing(v1(i)) then do;
      if i lt dim(v1) then do;
        if v1(i)ne v1(i+1) then var1_var3=sum(var1_var3,1);
      end;
      else var1_var3=sum(var1_var3,1);
    end;
  end;
  call sortc(of v2(*));
  do i=1 to dim(v2);
    if not missing(v2(i)) then do;
      if i lt dim(v2) then do;
        if v2(i)ne v2(i+1) then var4_var5=sum(var4_var5,1);
      end;
      else var4_var5=sum(var4_var5,1);
    end;
  end;
run;
&lt;/PRE&gt;
&lt;P&gt;HTH,&lt;/P&gt;
&lt;P&gt;Art, CEO, AnalystFinder.com&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13711"&gt;@art297&lt;/a&gt;&amp;nbsp;I thought you might enjoy this Rube Goldberg machine with implicit and explicit arrays, arrays of implicit arrays and&amp;nbsp;temporary arrays.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data t;
   array a $8 var1-var3;
   array b $8 var4-var5;
   array d[2] _temporary_;
   d[1]=dim(a); d[2]=dim(b);
   array c(j) a b;
   input (var:)(:);
   array t[10] $8 _temporary_;
   array count[2];
   do over c;
      call missing(k,of t[*]);
      do _i_ = 1 to d[j];
         if c in t then continue;
         k + 1;
         t[k]=c;
         end;
      count[j] = k;
      end;
   drop j k;
   cards;
abc d45g abc 1 2
a c rrrrrr g9 g9
;;;;
   run;
proc print;
   run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;IMG title="Capture.PNG" alt="Capture.PNG" src="https://communities.sas.com/t5/image/serverpage/image-id/6917i272CC98DE9BA4C45/image-size/original?v=1.0&amp;amp;px=-1" border="0" /&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 26 Jan 2017 22:05:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327858#M73178</guid>
      <dc:creator>data_null__</dc:creator>
      <dc:date>2017-01-26T22:05:53Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327896#M73184</link>
      <description>&lt;P&gt;I happen to disagree with my colleagues on this point. Normalizing one's data doesn't always provide the expected benefits .. especially with large data sets.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/15410"&gt;@data_null__&lt;/a&gt;: John, neither one of us would write such Goldberg-type code unless we were challenging each other to see who could come up with the most obfuscatious solution (job security maybe?).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And, I'm definitely not saying that I'm proud of the code I offered. However, like your Goldberg contraption, it does what the OP wanted (p.s., mine ran faster than yours).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is the code I ran, the log follows:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data have (drop=i);
  input (var1-var5) ($);
  do i=1 to 10000;output;end;
  cards;
abc d45g abc 1 2
a c rrrrrr g9 g9
abc abc def g55 g56
abc def ttt hhh hhh
ddd ddd ddd ddd ddd
111 111 112 111 111
a   a   a   a   a  
ab  ab  ac  ac  ac 
;

data want1 (keep=var1_var3 var4_var5);
  set have;
  array v1(*) $ var1-var3;
  array v2(*) $ var4-var5;
  call sortc(of v1(*));
  do i=1 to dim(v1);
    if not missing(v1(i)) then do;
      if i lt dim(v1) then do;
        if v1(i)ne v1(i+1) then var1_var3=sum(var1_var3,1);
      end;
      else var1_var3=sum(var1_var3,1);
    end;
  end;
  call sortc(of v2(*));
  do i=1 to dim(v2);
    if not missing(v2(i)) then do;
      if i lt dim(v2) then do;
        if v2(i)ne v2(i+1) then var4_var5=sum(var4_var5,1);
      end;
      else var4_var5=sum(var4_var5,1);
    end;
  end;
run;
data have;
  set have;
  id=_n_;
run;

proc transpose data=have out=need;
  var var1-var5;
  by id;
run;

data need (keep=id group col1);
  set need;
  if _name_ in ('var1', 'var2', 'var3') then group=1;
  else group=2;
run;

ods select none;
ods output nlevels=nlevels;
proc freq data=need nlevels;
   by id group;
   table col1 / noprint;
   run;
ods select all;

data want2 (keep=var1_var3 var4_var5);
  set nlevels (drop=TableVar);
  retain var1_var3;
  by id;
  if mod(_n_,2) eq 1 then var1_var3=nlevels;
  else do;
    var4_var5=nlevels;
    output;
  end;
run;

data want3 (keep=var1_var3 var4_var5);
   set have;
   array a $8 var1-var3;
   array b $8 var4-var5;
   array d[2] _temporary_;
   d[1]=dim(a); d[2]=dim(b);
   array c(j) a b;
   array t[10] $8 _temporary_;
   array count[2] var1_var3 var4_var5;
   do over c;
      call missing(k,of t[*]);
      do _i_ = 1 to d[j];
         if c in t then continue;
         k + 1;
         t[k]=c;
         end;
      count[j] = k;
      end;
run;
%macro count_dist_vals(invars,outvar);
  %local _num;
  %let _num=%sysfunc(ceil(%sysevalf(%sysfunc(ranuni(0))*10000000000)),z10.);
  if _n_=1 then
    do;
      length _tmp_var_12345 $32767;
      dcl hash h_&amp;amp;_num.(multidata:'n', hashexp:2);
      h_&amp;amp;_num..defineKey('_tmp_var_12345');
      h_&amp;amp;_num..defineDone();
    end;
    %do i= 1 %to %sysfunc(countw(&amp;amp;invars));
      _tmp_var_12345=catt(%scan(&amp;amp;invars,&amp;amp;i));
      h_&amp;amp;_num..ref();
    %end;
    &amp;amp;outvar=h_&amp;amp;_num..num_items;
    h_&amp;amp;_num..clear();
    drop _tmp_var_12345 ;
%mend;

data want4;
  set have;
  %count_dist_vals(var1 var2 var3, n_dist_vals_1)
  %count_dist_vals(var3 var4 var5, n_dist_vals_2)
run;
&lt;/PRE&gt;
&lt;P&gt;and now the log:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt; 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 NOTE: ODS statements in the SAS Studio environment may disable some output features.
 56         
 57         data have (drop=i);
 58           input (var1-var5) ($);
 59           do i=1 to 10000;output;end;
 60           cards;
 
 NOTE: The data set WORK.HAVE has 80000 observations and 5 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.01 seconds
       cpu time            0.00 seconds
       
 69         ;
 
 70         
 71         data want1 (keep=var1_var3 var4_var5);
 72           set have;
 73           array v1(*) $ var1-var3;
 74           array v2(*) $ var4-var5;
 75           call sortc(of v1(*));
 76           do i=1 to dim(v1);
 77             if not missing(v1(i)) then do;
 78               if i lt dim(v1) then do;
 79                 if v1(i)ne v1(i+1) then var1_var3=sum(var1_var3,1);
 80               end;
 81               else var1_var3=sum(var1_var3,1);
 82             end;
 83           end;
 84           call sortc(of v2(*));
 85           do i=1 to dim(v2);
 86             if not missing(v2(i)) then do;
 87               if i lt dim(v2) then do;
 88                 if v2(i)ne v2(i+1) then var4_var5=sum(var4_var5,1);
 89               end;
 90               else var4_var5=sum(var4_var5,1);
 91             end;
 92           end;
 93         run;
 
 NOTE: There were 80000 observations read from the data set WORK.HAVE.
 NOTE: The data set WORK.WANT1 has 80000 observations and 2 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.03 seconds
       cpu time            0.04 seconds
       
 
 94         data have;
 95           set have;
 96           id=_n_;
 97         run;
 
 NOTE: There were 80000 observations read from the data set WORK.HAVE.
 NOTE: The data set WORK.HAVE has 80000 observations and 6 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.02 seconds
       cpu time            0.02 seconds
       
 
 98         
 99         proc transpose data=have out=need;
 100          var var1-var5;
 101          by id;
 102        run;
 
 NOTE: There were 80000 observations read from the data set WORK.HAVE.
 NOTE: The data set WORK.NEED has 400000 observations and 3 variables.
 NOTE: PROCEDURE TRANSPOSE used (Total process time):
       real time           0.07 seconds
       cpu time            0.07 seconds
       
 
 103        
 104        data need (keep=id group col1);
 105          set need;
 106          if _name_ in ('var1', 'var2', 'var3') then group=1;
 107          else group=2;
 108        run;
 
 NOTE: There were 400000 observations read from the data set WORK.NEED.
 NOTE: The data set WORK.NEED has 400000 observations and 3 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.04 seconds
       cpu time            0.05 seconds
       
 
 109        
 110        ods select none;
 111        ods output nlevels=nlevels;
 112        proc freq data=need nlevels;
 113           by id group;
 114           table col1 / noprint;
 115           run;
 
 NOTE: The data set WORK.NLEVELS has 160000 observations and 4 variables.
 NOTE: There were 400000 observations read from the data set WORK.NEED.
 NOTE: PROCEDURE FREQ used (Total process time):
       real time           14.90 seconds
       cpu time            15.02 seconds
       
 
 116        ods select all;
 117        
 118        data want2 (keep=var1_var3 var4_var5);
 119          set nlevels (drop=TableVar);
 120          retain var1_var3;
 121          by id;
 122          if mod(_n_,2) eq 1 then var1_var3=nlevels;
 123          else do;
 124            var4_var5=nlevels;
 125            output;
 126          end;
 127        run;
 
 NOTE: There were 160000 observations read from the data set WORK.NLEVELS.
 NOTE: The data set WORK.WANT2 has 80000 observations and 2 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.02 seconds
       cpu time            0.02 seconds
       
 
 128        
 129        data want3 (keep=var1_var3 var4_var5);
 130           set have;
 131           array a $8 var1-var3;
 132           array b $8 var4-var5;
 133           array d[2] _temporary_;
 134           d[1]=dim(a); d[2]=dim(b);
 135           array c(j) a b;
 136           array t[10] $8 _temporary_;
 137           array count[2] var1_var3 var4_var5;
 138           do over c;
 139              call missing(k,of t[*]);
 140              do _i_ = 1 to d[j];
 141                 if c in t then continue;
 142                 k + 1;
 143                 t[k]=c;
 144                 end;
 145              count[j] = k;
 146              end;
 147        run;
 
 NOTE: There were 80000 observations read from the data set WORK.HAVE.
 NOTE: The data set WORK.WANT3 has 80000 observations and 2 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.06 seconds
       cpu time            0.06 seconds
       
 
 148        %macro count_dist_vals(invars,outvar);
 149          %local _num;
 150          %let _num=%sysfunc(ceil(%sysevalf(%sysfunc(ranuni(0))*10000000000)),z10.);
 151          if _n_=1 then
 152            do;
 153              length _tmp_var_12345 $32767;
 154              dcl hash h_&amp;amp;_num.(multidata:'n', hashexp:2);
 155              h_&amp;amp;_num..defineKey('_tmp_var_12345');
 156              h_&amp;amp;_num..defineDone();
 157            end;
 158            %do i= 1 %to %sysfunc(countw(&amp;amp;invars));
 159              _tmp_var_12345=catt(%scan(&amp;amp;invars,&amp;amp;i));
 160              h_&amp;amp;_num..ref();
 161            %end;
 162            &amp;amp;outvar=h_&amp;amp;_num..num_items;
 163            h_&amp;amp;_num..clear();
 164            drop _tmp_var_12345 ;
 165        %mend;
 166        
 167        data want4;
 168          set have;
 169          %count_dist_vals(var1 var2 var3, n_dist_vals_1)
 170          %count_dist_vals(var3 var4 var5, n_dist_vals_2)
 171        run;
 
 NOTE: There were 80000 observations read from the data set WORK.HAVE.
 NOTE: The data set WORK.WANT4 has 80000 observations and 8 variables.
 NOTE: DATA statement used (Total process time):
       real time           38.12 seconds
       cpu time            37.94 seconds
       
 
 172        
 173        
 174        OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 186    &lt;/PRE&gt;</description>
      <pubDate>Thu, 26 Jan 2017 23:28:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/327896#M73184</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2017-01-26T23:28:23Z</dc:date>
    </item>
    <item>
      <title>Re: counting distinct values over a range of variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/328115#M73241</link>
      <description>&lt;P&gt;I used the following code and got what you wanted. Its a simple code using PROC sql and transpose etc rather than more complex arrays. Not sure if it will be worth if there are hundreds of variables.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input var1 $	var2 $	var3 $	var4 $	var5 $;
cards;
abc	d45g	abc	1	2
a 	c	rrrrrr	g9	g9
;
run;


proc sql;
create table var1_3 as select var1, var2, var3 from have;
quit;

proc sort data = var1_3;
by var1 var2 var3;
run;
proc transpose data = var1_3 out=var1_3_ (DROP =_NAME_);
VAR var1 var2 var3;
run;

PROC SQL;
CREATE TABLE VAR1_VAR3 AS SELECT COUNT(DISTINCT COL1) AS VAR1, COUNT(DISTINCT COL2) AS VAR2 FROM var1_3_;
QUIT;

PROC TRANSPOSE DATA = VAR1_VAR3 OUT = VAR1_VAR3_ (DROP=_NAME_) PREFIX = VAR1_VAR3;
VAR VAR1 VAR2;
RUN;

proc sql;
create table var4_5 as select var4, var5 from have;
quit;

proc sort data = var4_5;
by var4 VAR5;
run;
proc transpose data = var4_5 out=var4_5_ (DROP =_NAME_);
VAR VAR4 VAR5;
run;

PROC SQL;
CREATE TABLE VAR4_VAR5 AS SELECT COUNT(DISTINCT COL1) AS VAR4, COUNT(DISTINCT COL2) AS VAR5 FROM var4_5_ ;
QUIT;

PROC TRANSPOSE DATA = VAR4_VAR5 OUT = VAR4_VAR5_ (DROP=_NAME_) PREFIX = VAR4_VAR5;
VAR VAR4 VAR5;
RUN;

DATA WANT;
MERGE VAR1_VAR3_ VAR4_VAR5_;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 27 Jan 2017 20:26:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/counting-distinct-values-over-a-range-of-variables/m-p/328115#M73241</guid>
      <dc:creator>devsas</dc:creator>
      <dc:date>2017-01-27T20:26:50Z</dc:date>
    </item>
  </channel>
</rss>

