<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Search for a value across a group of columns in SQL in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Search-for-a-value-across-a-group-of-columns-in-SQL/m-p/951965#M372099</link>
    <description>&lt;P&gt;I'm trying to select all obs that have a particular value across a multiple variables in PROC SQL.&amp;nbsp; &amp;nbsp;I know it can be done with a do loop in a data step, but in this case I'm using proc sql because I'm trying to avoid doing a data step and then a proc print.&amp;nbsp; &amp;nbsp;The sample code below works, but I'm wondering if there is a more efficient way.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/*create data set*/&lt;BR /&gt;data a;&lt;BR /&gt;input x 1. y 1. z 1.;&lt;BR /&gt;datalines;&lt;BR /&gt;000&lt;BR /&gt;100&lt;BR /&gt;213&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;/*list if any variable has a&lt;BR /&gt;value of 1*/&lt;BR /&gt;proc sql;&lt;BR /&gt;select *&lt;BR /&gt;from a&lt;BR /&gt;where x=1&lt;BR /&gt;or y=1&lt;BR /&gt;or z=1&lt;BR /&gt;;&lt;BR /&gt;quit;&lt;/P&gt;</description>
    <pubDate>Tue, 26 Nov 2024 18:05:32 GMT</pubDate>
    <dc:creator>Batman</dc:creator>
    <dc:date>2024-11-26T18:05:32Z</dc:date>
    <item>
      <title>Search for a value across a group of columns in SQL</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Search-for-a-value-across-a-group-of-columns-in-SQL/m-p/951965#M372099</link>
      <description>&lt;P&gt;I'm trying to select all obs that have a particular value across a multiple variables in PROC SQL.&amp;nbsp; &amp;nbsp;I know it can be done with a do loop in a data step, but in this case I'm using proc sql because I'm trying to avoid doing a data step and then a proc print.&amp;nbsp; &amp;nbsp;The sample code below works, but I'm wondering if there is a more efficient way.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/*create data set*/&lt;BR /&gt;data a;&lt;BR /&gt;input x 1. y 1. z 1.;&lt;BR /&gt;datalines;&lt;BR /&gt;000&lt;BR /&gt;100&lt;BR /&gt;213&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;/*list if any variable has a&lt;BR /&gt;value of 1*/&lt;BR /&gt;proc sql;&lt;BR /&gt;select *&lt;BR /&gt;from a&lt;BR /&gt;where x=1&lt;BR /&gt;or y=1&lt;BR /&gt;or z=1&lt;BR /&gt;;&lt;BR /&gt;quit;&lt;/P&gt;</description>
      <pubDate>Tue, 26 Nov 2024 18:05:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Search-for-a-value-across-a-group-of-columns-in-SQL/m-p/951965#M372099</guid>
      <dc:creator>Batman</dc:creator>
      <dc:date>2024-11-26T18:05:32Z</dc:date>
    </item>
    <item>
      <title>Re: Search for a value across a group of columns in SQL</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Search-for-a-value-across-a-group-of-columns-in-SQL/m-p/951968#M372100</link>
      <description>&lt;P&gt;A little less programming, you can do this with the &lt;A href="https://documentation.sas.com/doc/en/pgmmvacdc/9.4/lefunctionsref/p0zs0pv38mel2jn1in4lte2akx4d.htm" target="_self"&gt;WHICHN function&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
    select *
    from a
    where whichn(1,x,y,z);
quit;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Which fails to be less efficient if you have a lot of variables, and then you would probably want to go back to a DATA step.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;(What is so terrible about PROC PRINT?)&lt;/P&gt;</description>
      <pubDate>Tue, 26 Nov 2024 18:18:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Search-for-a-value-across-a-group-of-columns-in-SQL/m-p/951968#M372100</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2024-11-26T18:18:15Z</dc:date>
    </item>
    <item>
      <title>Re: Search for a value across a group of columns in SQL</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Search-for-a-value-across-a-group-of-columns-in-SQL/m-p/951976#M372103</link>
      <description>&lt;P&gt;Hey &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/9248"&gt;@Batman&lt;/a&gt;! This is one of those situations where efficiency is probably going to be minsiscule depending on which way you go. You can use whichn as&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;showed, or you can add them to gether and only return rows where the sum is &amp;gt; 0.&amp;nbsp; So, let's go ahead and test this. We'll create a decent-sized dataset and compare methods so we can gauge how it scales.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This dataset will have 500k rows with exactly 100 rows where all values of x are 1.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
    array x[100];
    do i = 1 to 500000;
        do j = 1 to dim(x); 
            if(50000 LE i LE 50100) then x[j]=1;
                else x[j] = 0;
        end;
        output;
    end;
    drop i j;
run;

/* Helper function so we don't need to type x1, x2, x3, ... etc. */
%macro delimx(d);
    %local i;
    x1
    %do i = 2 %to 100;
        &amp;amp;d x&amp;amp;i
    %end;
%mend;

%macro timeit;
    %do i = 1 %to 100;

        /* Test 1: Whichn */
        %let start=%sysfunc(datetime());
            proc sql noprint;
                select *
                from have
                where whichn(1, %delimx(%str(,)))
                ;
            quit;
        %let time1 = %sysevalf(%sysfunc(datetime())-&amp;amp;start);

        /* Test 2: Sum function */
        %let start=%sysfunc(datetime());
            proc sql noprint;
                select *
                from have
                where sum(%delimx(%str(,))) &amp;gt; 0
                ;
            quit;
        %let time2 = %sysevalf(%sysfunc(datetime())-&amp;amp;start);

        /* Test 3: Addition */
        %let start=%sysfunc(datetime());
           proc sql noprint;
                select *
                from have
                where %delimx(+) &amp;gt; 0
                ;
            quit;
        %let time3 = %sysevalf(%sysfunc(datetime())-&amp;amp;start);

        data time;
            time1 = &amp;amp;time1;
            time2 = &amp;amp;time2;
            time3 = &amp;amp;time3;
        run;

        proc append base=times data=time;
        run;
    %end;

    proc sql;
        select mean(time1) as avg_time1 label='Avg: Whichn'
             , std(time1)  as std_time1 label='Std: Whichn'
             , mean(time2) as avg_time2 label='Avg: Sum'
             , std(time2)  as std_time2 label='Std: Sum'
             , mean(time3) as avg_time3 label='Avg: Add'
             , std(time3)  as std_time3 label='Std: Add'
        from times;
    quit;

    proc datasets lib=work nolist;
        delete times;
    quit;

%mend;

%timeit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Stu_SAS_1-1732648733266.png" style="width: 999px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/102546iC0E54F0FB3995E49/image-size/large?v=v2&amp;amp;px=999" role="button" title="Stu_SAS_1-1732648733266.png" alt="Stu_SAS_1-1732648733266.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;On my machine, we can see that there is nearly no difference between using whichn and the sum function across 100 trials in a SAS dataset, but whichn had a higher standard deviation. Straight addition was the slowest. So, what's the most efficient way of these three when working with SAS datasets? Either whichn or the sum function - but depending on the size of your dataset, you may be looking at milliseconds of difference that you'll never notice.&lt;/P&gt;</description>
      <pubDate>Tue, 26 Nov 2024 19:29:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Search-for-a-value-across-a-group-of-columns-in-SQL/m-p/951976#M372103</guid>
      <dc:creator>Stu_SAS</dc:creator>
      <dc:date>2024-11-26T19:29:48Z</dc:date>
    </item>
  </channel>
</rss>

