Hi,
I currently have dataset with 388 observations(rows) and over 8,000 variables of interest(i.e. columns). The data is in a matrix form. The row contains study numbers and each column is a gene. Each cell contains a ratio from 0 to 1. I want to tell SAS for each column pull study numbers that greater than 0.8. I've been manually doing this and I am not sure if there is a more efficient way of doing this. I am not familiar with sas macros or arrays. Below is a table of the data structure.
Study Number | A | B | C | D | E | F |
1 | 1 | 0.98 | 0.99 | 0.98 | .8 | 0.98 |
2 | .6 | 1 | 1 | .4 | 1 | 1 |
3 | 1 | .3 | 1 | 1 | 1 | 1 |
Please try this untested code
data want;
set have;
array vars(*) A B C D F ;
do i = 1 to dim(vars);
if vars(i) ne . and vars(i)>0.8 output;
end;
run;
We can use the proc sql to pull the variables into a macro variable as below ad then use this macro variable in place of the variables in array
proc sql;
select name into :vars separated by ' ' from dictionary.columns where libname='WORK' and memname='HAVE';
quit;
Hi @tadedir1
If your variables of interest are listed i.e occurs in sequence from left to right in succession, you could conveniently use variable lists with a double(--) .
For eg.
data want;
set have;
array vars(*) A --F ;
do i = 1 to dim(vars);
if vars(i)>0.8 do;
output;
leave;
end;
end;
run;
Exit the loop soon as a value>0.8 is found
When I run the arrays displayed above: I'm still pulling variables that are less than 0.8.
Oh you mean you wanna drop variables?or observations? You should prolly show your expected output
Yes I want to drop observations that don't meet the 0.8 criteria.
You should prolly show your expected output
Expected output
Option: Set those with cut off less than 0.8 as missing
Study Number | A | B | C | D | E | F |
1 | 1 | 0.98 | 0.99 | 0.98 | .8 | 0.98 |
2 | . | 1 | 1 | . | 1 | 1 |
3 | 1 | . | 1 | 1 | 1 | 1 |
or
Have 6 different tables only displaying the cutoffs greater than or equal to 0.8
Study Number | A | B | C | D | E | F |
1 | 1 | 0.98 | 0.99 | 0.98 | .8 | 0.98 |
2 | .6 | 1 | 1 | .4 | 1 | 1 |
3 | 1 | .3 | 1 | 1 | 1 | 1 |
Study Number a
1 1
3 1
study number b
1 0.98
2 1
study number c
1 0.99
2 1
3 1
Option: Set those with cut off less than 0.8 as missing - Easy
data want;
set have;
array vars(*) A --F ;
do i = 1 to dim(vars);
if vars(i)<0.8 then var(i)=.;
end;
drop i;
run;
@tadedir1 wrote:
Expected output
Option: Set those with cut off less than 0.8 as missing
Study Number A B C D E F 1 1 0.98 0.99 0.98 .8 0.98 2 . 1 1 . 1 1 3 1 . 1 1 1 1
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.