BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
HB
Barite | Level 11 HB
Barite | Level 11

*  basic options;

OPTIONS MISSING=' ' SORTPGM=BEST ERRORS=1 LINESIZE=132 CENTER

        PAGESIZE=42 NUMBER OBS=MAX;

data last_period;

  input id hours;

  cards;

1 1

2 1

3 1

4 2

5 3

6 3

;

data this_period;

  input id hours;

  cards;

1 1

2 2

3 3

5 5

6 5

;

* I want to find the average gain in hours for each id that shows up in both files;

* for this data, ID 1 has gained 0 hours from last period to this period, ID 2 has gained 1 hours, ID 3 has gained 2 hours;

* 4 is not in the second file and should be ignored;

* ID's 5 and 6 have both gained 2 hours;

* average gain should be ( 0 + 1 + 2 + 2 + 2 ) / 5 = 1.4

* using PROC SQL I can do;

proc sql;

create table average_gain as

select avg(b.hours-a.hours) as hours_gained

from last_period a inner join

this_period b on

a.id = b.id;

quit;

proc print data=average_gain noobs;

   var hours_gained;

   title1 'Average additonal hours gained by PROC SQL';

run;

* this works perfectly and, exactly as predicted, prints out 1.4;

* I want to do the same thing without PROC SQL;

* I think I should be able to do better than the following;

proc sort data=last_period;

    by id;

run;

proc sort data=this_period;

    by id;

run;

data this_period2;                          

   set this_period;

   keep id hours1;

   hours1 = hours;

run;

data average_gain2;

   merge last_period (in = x) this_period2 (in = y);

   by id;

   if x and y;

run;

data average_gain3;                          

   set average_gain2;

   hours_gained = hours1 - hours;

run;

proc means data= average_gain3 mean;

   var hours_gained;

   title1 'Average additonal hours gained by PROC MEANS';

run;

* prints out 1.4000000

* Am I doing this efficiently?  How would you do it? Any ideas for me?;

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

<after the sort which is needed for merge>

data combined;

     merge last_period (in=x) this_period(in=y rename=(hours=hours1));

     by id;

     if x and y;

     hours_gained=hours1-hours;

run;

then proc means;

View solution in original post

2 REPLIES 2
ballardw
Super User

<after the sort which is needed for merge>

data combined;

     merge last_period (in=x) this_period(in=y rename=(hours=hours1));

     by id;

     if x and y;

     hours_gained=hours1-hours;

run;

then proc means;

HB
Barite | Level 11 HB
Barite | Level 11

Okay.  That's a bunch better.

I woudl have never figured out this part:

this_period(in=y rename=(hours=hours1));

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1179 views
  • 0 likes
  • 2 in conversation