Hi,
I currently have dataset with 388 observations(rows) and over 8,000 variables of interest(i.e. columns). The data is in a matrix form. The row contains study numbers and each column is a gene. Each cell contains a ratio from 0 to 1. I want to tell SAS for each column pull study numbers that greater than 0.8. I've been manually doing this and I am not sure if there is a more efficient way of doing this. I am not familiar with sas macros or arrays. Below is a table of the data structure.
Study Number | A | B | C | D | E | F |
1 | 1 | 0.98 | 0.99 | 0.98 | .8 | 0.98 |
2 | .6 | 1 | 1 | .4 | 1 | 1 |
3 | 1 | .3 | 1 | 1 | 1 | 1 |
Please try this untested code
data want;
set have;
array vars(*) A B C D F ;
do i = 1 to dim(vars);
if vars(i) ne . and vars(i)>0.8 output;
end;
run;
We can use the proc sql to pull the variables into a macro variable as below ad then use this macro variable in place of the variables in array
proc sql;
select name into :vars separated by ' ' from dictionary.columns where libname='WORK' and memname='HAVE';
quit;
Hi @tadedir1
If your variables of interest are listed i.e occurs in sequence from left to right in succession, you could conveniently use variable lists with a double(--) .
For eg.
data want;
set have;
array vars(*) A --F ;
do i = 1 to dim(vars);
if vars(i)>0.8 do;
output;
leave;
end;
end;
run;
Exit the loop soon as a value>0.8 is found
When I run the arrays displayed above: I'm still pulling variables that are less than 0.8.
Oh you mean you wanna drop variables?or observations? You should prolly show your expected output
Yes I want to drop observations that don't meet the 0.8 criteria.
You should prolly show your expected output
Expected output
Option: Set those with cut off less than 0.8 as missing
Study Number | A | B | C | D | E | F |
1 | 1 | 0.98 | 0.99 | 0.98 | .8 | 0.98 |
2 | . | 1 | 1 | . | 1 | 1 |
3 | 1 | . | 1 | 1 | 1 | 1 |
or
Have 6 different tables only displaying the cutoffs greater than or equal to 0.8
Study Number | A | B | C | D | E | F |
1 | 1 | 0.98 | 0.99 | 0.98 | .8 | 0.98 |
2 | .6 | 1 | 1 | .4 | 1 | 1 |
3 | 1 | .3 | 1 | 1 | 1 | 1 |
Study Number a
1 1
3 1
study number b
1 0.98
2 1
study number c
1 0.99
2 1
3 1
Option: Set those with cut off less than 0.8 as missing - Easy
data want;
set have;
array vars(*) A --F ;
do i = 1 to dim(vars);
if vars(i)<0.8 then var(i)=.;
end;
drop i;
run;
@tadedir1 wrote:
Expected output
Option: Set those with cut off less than 0.8 as missing
Study Number A B C D E F 1 1 0.98 0.99 0.98 .8 0.98 2 . 1 1 . 1 1 3 1 . 1 1 1 1
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.