BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
HB
Barite | Level 11 HB
Barite | Level 11

*  basic options;

OPTIONS MISSING=' ' SORTPGM=BEST ERRORS=1 LINESIZE=132 CENTER

        PAGESIZE=42 NUMBER OBS=MAX;

data last_period;

  input id hours;

  cards;

1 1

2 1

3 1

4 2

5 3

6 3

;

data this_period;

  input id hours;

  cards;

1 1

2 2

3 3

5 5

6 5

;

* I want to find the average gain in hours for each id that shows up in both files;

* for this data, ID 1 has gained 0 hours from last period to this period, ID 2 has gained 1 hours, ID 3 has gained 2 hours;

* 4 is not in the second file and should be ignored;

* ID's 5 and 6 have both gained 2 hours;

* average gain should be ( 0 + 1 + 2 + 2 + 2 ) / 5 = 1.4

* using PROC SQL I can do;

proc sql;

create table average_gain as

select avg(b.hours-a.hours) as hours_gained

from last_period a inner join

this_period b on

a.id = b.id;

quit;

proc print data=average_gain noobs;

   var hours_gained;

   title1 'Average additonal hours gained by PROC SQL';

run;

* this works perfectly and, exactly as predicted, prints out 1.4;

* I want to do the same thing without PROC SQL;

* I think I should be able to do better than the following;

proc sort data=last_period;

    by id;

run;

proc sort data=this_period;

    by id;

run;

data this_period2;                          

   set this_period;

   keep id hours1;

   hours1 = hours;

run;

data average_gain2;

   merge last_period (in = x) this_period2 (in = y);

   by id;

   if x and y;

run;

data average_gain3;                          

   set average_gain2;

   hours_gained = hours1 - hours;

run;

proc means data= average_gain3 mean;

   var hours_gained;

   title1 'Average additonal hours gained by PROC MEANS';

run;

* prints out 1.4000000

* Am I doing this efficiently?  How would you do it? Any ideas for me?;

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

<after the sort which is needed for merge>

data combined;

     merge last_period (in=x) this_period(in=y rename=(hours=hours1));

     by id;

     if x and y;

     hours_gained=hours1-hours;

run;

then proc means;

View solution in original post

2 REPLIES 2
ballardw
Super User

<after the sort which is needed for merge>

data combined;

     merge last_period (in=x) this_period(in=y rename=(hours=hours1));

     by id;

     if x and y;

     hours_gained=hours1-hours;

run;

then proc means;

HB
Barite | Level 11 HB
Barite | Level 11

Okay.  That's a bunch better.

I woudl have never figured out this part:

this_period(in=y rename=(hours=hours1));

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 701 views
  • 0 likes
  • 2 in conversation