BookmarkSubscribeRSS Feed
tadedir1
Calcite | Level 5

Hi,

 

I currently have dataset with 388 observations(rows) and over 8,000 variables of interest(i.e. columns). The data is in a matrix form. The row contains study numbers and each column is a  gene. Each cell contains a ratio from 0 to 1. I want to tell SAS for each column pull study numbers that greater than 0.8. I've been manually doing this and I am not sure if there is a more efficient way of doing this. I am not familiar with sas macros or arrays. Below is a table of the data structure. 

 

Study NumberABCDEF
110.980.990.98.80.98
2.611.411
31.31111
10 REPLIES 10
Jagadishkatam
Amethyst | Level 16

Please try this untested code

 

data want;
set have;
array vars(*) A B C D F ;
do i = 1 to dim(vars);
if vars(i) ne . and vars(i)>0.8 output;
end;
run;
Thanks,
Jag
tadedir1
Calcite | Level 5
If you had 8,000 columns, do you have to write out each variable name (i.e. vars (*) A B C D F...8000) or is there a faster way to doing it this way?
Jagadishkatam
Amethyst | Level 16

We can use the proc sql to pull the variables into a macro variable as below ad then use this macro variable in place of the variables in array

 

proc sql;
select name into :vars separated by ' ' from dictionary.columns where libname='WORK' and memname='HAVE';
quit;

 

Thanks,
Jag
novinosrin
Tourmaline | Level 20

Hi @tadedir1 

 

If your variables of interest are listed i.e occurs in sequence from left to right in succession, you could conveniently use variable lists with a double(--) .

 

For eg.

 

data want;
set have;
array vars(*) A --F ;
do i = 1 to dim(vars);
if  vars(i)>0.8 do;
 output;
leave;
end;
end;
run;

Exit the loop soon as a value>0.8 is found

 

tadedir1
Calcite | Level 5

When I run the arrays displayed above: I'm still pulling variables that are less than 0.8.

novinosrin
Tourmaline | Level 20

Oh you mean you wanna drop variables?or observations? You should prolly show your expected output

tadedir1
Calcite | Level 5

Yes I want to drop observations that don't meet the 0.8 criteria.

 

 

novinosrin
Tourmaline | Level 20

 You should prolly show your expected output

tadedir1
Calcite | Level 5

Expected output 

Option: Set those with cut off less than 0.8 as missing 

 

Study NumberABCDEF
110.980.990.98.80.98
2.11.11
31.1111

 

or 

 

Have 6 different tables only displaying the cutoffs greater than or equal to 0.8

 

Study NumberABCDEF
110.980.990.98.80.98
2.611.411
31.31111

  

Study Number      a

1                           1

3                           1

 

study number       b

1                          0.98

2                           1

 

study number        c

1                           0.99

2                           1

3                           1

novinosrin
Tourmaline | Level 20

Option: Set those with cut off less than 0.8 as missing - Easy

 

data want;
set have;
array vars(*) A --F ;
do i = 1 to dim(vars);
if  vars(i)<0.8 then var(i)=.;
end;
drop i;
run;

 

 

 

 


@tadedir1 wrote:

Expected output 

Option: Set those with cut off less than 0.8 as missing 

 

Study Number A B C D E F
1 1 0.98 0.99 0.98 .8 0.98
2 . 1 1 . 1 1
3 1 . 1 1 1 1

 

 


 

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 1626 views
  • 0 likes
  • 3 in conversation