data have;
input BAT$ CAT DAT FAT$ GAT HAT;
CARDS;
as 6 .5 ez 2 12
es 3 .7 az 7 5
os 11 .2 yz 9 3
us 13 .4 uz 11 8
is 2 .8 iz 1 10
ys 5 .6 oz 4 13
;
Hi everyone.
First, I thank everyone who takes time to consider this exercise and provide feedback on the solution. Ok, here goes my challenge.
1. I need to accomplish this in one "proc SQL" procedure in SAS.
2. The product of the procedure should be a macro variable storing a single value. Just that. No revised/final table.
3. The stored value in the macro variable should be a simple count of the column averages of a source table "have" like the one attached.
4. The column averages that qualify for being counted should be those above a pre-determined threshold value (in the case of this example, the threshold = 1).
5. (Note:) Not all of the table columns are numeric, and obviously the non-numeric columns should not be counted.
In the example image attached only 3 columns (CAT, GAT, HAT) averages qualify for counting, thus the macro variable takes the value: 3.
Here are the inputs for source dataset "have":
data have;
input BAT$ CAT DAT FAT$ GAT HAT;
CARDS;
as 6 .5 ez 2 12
es 3 .7 az 7 5
os 11 .2 yz 9 3
us 13 .4 uz 11 8
is 2 .8 iz 1 10
ys 5 .6 oz 4 13
;
Thanks once more. Regards.
The easiest way to code the stuff may be a small macro, e.g.:
%macro test_avg(var);
case when -0.8<avg(&var)<0.3 then 1 else 0 end
%mend;
proc sql noprint;
select cats('%test_avg(',name,')')
into :expression separated by '+'
from dictionary.columns
where libname='WORK' and memname='HAVE' and type='num';
select &expression into :count trimmed
from have;
quit;
%put &count;
Just put whatever criteria you need into the macro. And do NOT put a semicolon in the macro, is is an open statement.
This also has the advantage that the macro variable &expression does not become so long, meaning that you do not risk overflow (total text longer than maximum allowed macro variable length) even if you have several thousand numeric variables on your table.
Please supply example data in usable form. A picture of a handwritten note in a PDF is NOT USABLE.
Post your dataset as a data step with datalines, which we can copy into our SAS environment and submit.
Use the "little running man" button to post your code.
Hi @Kurt_Bremser ,
Sorry for any inconvenience. Following your advice, I updated the thread with the necessary input.
Most of us won't open attachments, or cannot open attachments, as they can be a security threat. You should include your SAS program in the post instead of as an attachment. Use the running man icon to include your SAS code.
I have a slightly different take, but the same direction:
data have;
input BAT$ CAT DAT FAT$ GAT HAT;
CARDS;
as 6 .5 ez 2 12
es 3 .7 az 7 5
os 11 .2 yz 9 3
us 13 .4 uz 11 8
is 2 .8 iz 1 10
ys 5 .6 oz 4 13
;
%let treshold=1;
proc sql noprint;
select "avg(" !! strip(name) !! ")>&treshold" into :averages separated by ","
from dictionary.columns
where libname = "WORK" and memname = "HAVE" and type = "num";
select
sum(&averages)
into :want
from have
;
quit;
%put &=want;
In SAS SQL, you can do it fast and dirty:
%let threshold=1;
proc sql noprint;
select sum(avg(cat)>&threshold,avg(dat)>&threshold,avg(gat)>&threshold,avg(hat)>&threshold) into :count trimmed
from have;
quit;
%put &count;
In standard SQL, you will have to use a CASE...WHEN instead, to convert the logical values to numerics:
%let Threshold=1;
proc sql noprint;
select
case when avg(cat)>&threshold then 1 else 0 end
+case when avg(dat)>&threshold then 1 else 0 end
+case when avg(gat)>&threshold then 1 else 0 end
+case when avg(hat)>&threshold then 1 else 0 end
into :count trimmed
from have;
quit;
%put &count;
If you need to generate the names of the numeric columns automatically, that can be done using DICTIONARY.COLUMNS:
%let Threshold=1;
proc sql noprint;
select cats("case when avg(",name,")>&threshold then 1 else 0 end")
into :expression separated by '+'
from dictionary.columns
where libname='WORK' and memname='HAVE' and type='num';
select &expression into :count trimmed
from have;
quit;
Hi @s_lassen ,
Thanks the prompt reply with your approach.
Can the threshold be coded as a real interval:
-0.8 < threshold < 0.3
...??
The easiest way to code the stuff may be a small macro, e.g.:
%macro test_avg(var);
case when -0.8<avg(&var)<0.3 then 1 else 0 end
%mend;
proc sql noprint;
select cats('%test_avg(',name,')')
into :expression separated by '+'
from dictionary.columns
where libname='WORK' and memname='HAVE' and type='num';
select &expression into :count trimmed
from have;
quit;
%put &count;
Just put whatever criteria you need into the macro. And do NOT put a semicolon in the macro, is is an open statement.
This also has the advantage that the macro variable &expression does not become so long, meaning that you do not risk overflow (total text longer than maximum allowed macro variable length) even if you have several thousand numeric variables on your table.
The above code did the trick for me, Thank you @s_lassen; short and elegant.
I thank all other thread participants as well.
The solution was needed to retreive the number of non-intercept coefficients that met a threshold criterion out of a "proc Varmax" estimators' output dataset, in order to use it to calculate the Adjusted R^2.
Hope this solution helps many others with their work.
All the best to everyone.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.