<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: subset multivariable data set that values of variables be larger than 75th percentile in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/subset-multivariable-data-set-that-values-of-variables-be-larger/m-p/713658#M34489</link>
    <description></description>
    <pubDate>Sun, 24 Jan 2021 05:22:17 GMT</pubDate>
    <dc:creator>mkeintz</dc:creator>
    <dc:date>2021-01-24T05:22:17Z</dc:date>
    <item>
      <title>subset multivariable data set that values of variables be larger than 75th percentile</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/subset-multivariable-data-set-that-values-of-variables-be-larger/m-p/713650#M34487</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I have a multivariable data set, and I need to subset it so that the values of all variables are larger than the 75th percentile of their respective columns. I would appreciate any help.&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;***********;
/*Calculate 75 percentile*/
proc means data=mydata  noprint;
  var varb varc varg varh vark    ;
  output out=p75dataset  P75=  / autoname;
run;

proc print data=p75dataset;run;

/*Store 75 percentile in a macro variable*/
data _null_;
  set p75dataset;
  call symputx('p75Mvalue', autoname);
run;



/*Find the subset that values of variables are larger than 75 percentiles*/
data subset;
  set mydata ;
  array var{5} varb varc varg varh vark;
	do i = 1 to 5;
	where var(i)&amp;gt;=&amp;amp;p75Mvalue	;
	end;
  
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 23 Jan 2021 23:15:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/subset-multivariable-data-set-that-values-of-variables-be-larger/m-p/713650#M34487</guid>
      <dc:creator>fatemeh</dc:creator>
      <dc:date>2021-01-23T23:15:24Z</dc:date>
    </item>
    <item>
      <title>Re: subset multivariable data set that values of variables be larger than 75th percentile</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/subset-multivariable-data-set-that-values-of-variables-be-larger/m-p/713658#M34489</link>
      <description>&lt;P&gt;Let's work backwards:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Your last step has this code:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
/*Find the subset that values of variables are larger than 75 percentiles*/
data subset;
  set mydata ;
  array var{5} varb varc varg varh vark;
	do i = 1 to 5;
	where var(i)&amp;gt;=&amp;amp;p75Mvalue	;
	end;
  
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;You have a where statement with a component specifying an array element.&amp;nbsp; This has two problems:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;You are comparing each variable to the same macro value.&amp;nbsp; But the variables each probably have their own 75th percentile.&lt;/LI&gt;
&lt;LI&gt;Far more problematic: the WHERE statement is always "outsourced" by the data step to the data engine (that's why you can use a WHERE statement in a PROC as well as a DATA step).&amp;nbsp; But because it is outsourced, it is not informed of the array definition, so you can't pass expressions like "var{i}" to it.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;So, if you want to use macro variables, you probably want something like&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;where varb &amp;gt;= &amp;amp;varb_p75 and varc &amp;gt;= &amp;amp;varc_p75 and 
      varg &amp;gt;= &amp;amp;varg_p75 and varh &amp;gt;= &amp;amp;varh_p75 and 
      vark &amp;gt;= &amp;amp;vark_p75 ;&lt;/CODE&gt;&lt;/PRE&gt;
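&lt;P&gt;That, in turn, means you have to modify your middle step to create those 5 macrovars.&amp;nbsp; It currently creates only a single macrovar, p75Mvalue, and assigns it from a variable named autoname, which does not exist in p75dataset (AUTONAME is a PROC MEANS option that names the output variables varb_P75, varc_P75, and so on).&lt;/P&gt;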
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So take a look at the output of the proc means, and see how you can loop over the 5 values for the 75th percentiles, writing a single distinctly-named macrovar in each iteration.&lt;/P&gt;
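&lt;P&gt;A minimal sketch of that loop, assuming the AUTONAME option has named the percentile variables varb_P75 through vark_P75 (check the PROC PRINT output to confirm).&amp;nbsp; The array name p75 and the index i are only illustrative:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Assumes p75dataset holds one row with varb_P75 ... vark_P75 */
data _null_;
  set p75dataset;
  array p75{*} varb_P75 varc_P75 varg_P75 varh_P75 vark_P75;
  do i = 1 to dim(p75);
    /* VNAME returns the element's variable name, e.g. varb_P75, */
    /* which becomes the name of the macro variable              */
    call symputx(vname(p75{i}), p75{i});
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Macro variable names are not case sensitive, so a WHERE statement can then refer to them as &amp;amp;varb_p75, &amp;amp;varc_p75, and so on.&lt;/P&gt;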
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;BTW, you could avoid macrovars entirely by using an IF statement (instead of WHERE) in the last data step.&amp;nbsp; You would then need only the proc means and the DATA SUBSET step, with an additional "IF _N_=1 then SET P75DATASET;" statement.&lt;/P&gt;</description>
      <pubDate>Sun, 24 Jan 2021 05:22:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/subset-multivariable-data-set-that-values-of-variables-be-larger/m-p/713658#M34489</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2021-01-24T05:22:17Z</dc:date>
    </item>
  </channel>
</rss>

