BookmarkSubscribeRSS Feed
linlin87
Quartz | Level 8

Hi SAS Community,

 

I have a numeric variable (var1) and I am trying to compute the mean of this value based on two variables - subjectID and siteID - but only for specific values of a third variable, test_run.

 

In particular, I only want to calculate the mean of var1 for test_run = 1-5. I want to do this for each subject and each site. I have a list of subjects (&sb) and sites (&site).

 

So far I have this:

 

proc sql;
create table dataset2 as
select *
from dataset1
where subjectid in (&sb) and site in (&site) and 1 le test_run le 5;
quit;

proc sort data=dataset2;
by subjectid site;
run; proc means data=dataset2; by subjectid site; var var1; output out=output1; run;

However these two steps take a very long time. I was wondering if there was a way of calculating this mean of var1 for 1<=testrun<=5 in the initial SQL call?

6 REPLIES 6
andreas_lds
Jade | Level 19

@linlin87 wrote:

....

However these two steps take a very long time. I was wondering if there was a way of calculating this mean of var1 for 1<=testrun<=5 in the initial SQL call?


"very long time" can mean anything, depending on the size of the data and the hardware, this could run some time. Please post the complete log of the steps shown with option fullstimer active.

linlin87
Quartz | Level 8
I would post the log, but 9/10 times SAS ends up crashing
PaigeMiller
Diamond | Level 26

How large is the data set?

--
Paige Miller
Tom
Super User Tom
Super User

Do you really need dataset2? Do you really need PROC MEANS to make a printout if you are already making a dataset?

Assuming that test_run only has integer values use IN operator for it also might be faster (depending on how DATASET1 is stored).

Try doing it in one step instead of three.

proc means data=dataset1 NOPRINT nway;
  where subjectid in (&sb)
    and site in (&site)
    and test_run in (1 2 3 4 5)
  ;
  class subjectid site;
  var var1;
  output out=output1;
run;

 

linlin87
Quartz | Level 8
Yes it does only have integer values. Thanks Tom I will try this and let you know how it goes
Kurt_Bremser
Super User

Try this:

proc means data=dataset1(where=(subjectid in (&sb) and site in (&site) and 1 le test_run le 5)) nway;
class subjectid site;
var var1;
output out=output1;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 1345 views
  • 2 likes
  • 5 in conversation