BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
luch25
Obsidian | Level 7

Hello all,

I need to calculate the annual and monthly average of values for each study subject. However, the twist is that they all have different start dates that start the window for the year or month of measurements. For example in the code below, I have 3 subjects with different earliest study start dates, so the lookout period for the annual measurement differs. I'm not sure how to create this annual arrays. Hoping to then modify the annual code to also calculate monthly averages.

 

data have;
	input id datetime hr resp;
	format datetime datetime22.3.;
	cards;
1 19MAR2009:13:15:00.000 90 16
1 19MAR2009:13:17:00.000 70 18
1 19MAR2009:13:18:00.000 80 17
1 19MAR2009:13.22:00.000 100 22
1 23MAR2009:10:36:00.000 83 21
2 29OCT2009:10:36:00.000 121 13
2 29OCT2009:10:38:00.000 83 14
2 08JUL2011:13:04:00.000 131 18
2 07SEP2016:14:26:59.000 50 12
3 06APR2017:13:39:00.000 76 18
3 08JUL2017:13:04:00.000 98 17
3 08JUL2018:13:08:00.000 96 18
3 24OCT2019:15:11:00.000 80 19
;

I'm envisioning the resulting dataset to look like this. Note that the "start_date" variable represents the very first encounter for each id, and then calculates exactly one year from that point to be the next start_date. It does not represent the actual dates of the measurements (except for the very first one)- simply when the clock "starts" for annual averages. And then HR_year and Resp_year represents the average of values for the year. Note that id=2 has missing values because there were no measurements taken in some of the annual time windows. 

 

ID Start_date HR_year Resp_year
1 19MAR2009:13:15:00.000 84.6 18.8
2 29OCT2009:10:36:00.000 102 13.5
2 29OCT2010:10:36:00.000 131 18
2 29OCT2011:10:36:00.000 . .
2 29OCT2012:10:36:00.000 . .
2 29OCT2013:10:36:00.000 . .
2 29OCT2014:10:36:00.000 . .
2 29OCT2015:10:36:00.000 50 12
3 06APR2017:13:39:00.000 90 17.667
3 06APR2018:13:39:00.000 96 18
3 06APR2019:13:39:00.000 80 19

 

Thank you so much for your consideration and help!

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

First a reminder to make sure that your data step runs and creates what is expected. Yours doesn't provide an informat for datetime and one of your values has a decimal in the wrong place.

 

This sort of question is one place where the power of the SAS Format comes to the fore. Groups created with a Format can be used in analysis, reporting and most formats will work with graphs. The following shows creating mean, median and standard deviation. Basic statistics are easy.  If you want a data set then use the version with output

 

data have;
	input id datetime :datetime25. hr resp;
	format datetime datetime22.3;
	cards;
1 19MAR2009:13:15:00.000 90 16
1 19MAR2009:13:17:00.000 70 18
1 19MAR2009:13:18:00.000 80 17
1 19MAR2009:13:22:00.000 100 22
1 23MAR2009:10:36:00.000 83 21
2 29OCT2009:10:36:00.000 121 13
2 29OCT2009:10:38:00.000 83 14
2 08JUL2011:13:04:00.000 131 18
2 07SEP2016:14:26:59.000 50 12
3 06APR2017:13:39:00.000 76 18
3 08JUL2017:13:04:00.000 98 17
3 08JUL2018:13:08:00.000 96 18
3 24OCT2019:15:11:00.000 80 19
;


proc means data=have mean median std;
   class id datetime;
   format datetime dtmonyy.;
   var hr resp;
run;


proc means data=have mean median std;
   class id datetime;
   format datetime dtyear.;
   var hr resp;
run;

/* creating an output data set*/
proc means data=have noprint nway;
   class id datetime;
   format datetime dtyear.;
   var hr resp;
   output out=want mean= median= std= /autoname;
run;

Autoname places the statistic as a suffix to the variable name. If you only have one statistic and use median = ; then  your result statistic has the same name as the variable. Or you can check syntax on creating your own names.

View solution in original post

6 REPLIES 6
luch25
Obsidian | Level 7
Sorry, instead of the average, I'd actual like to calculate the median. Not sure how much of a difference that makes
ballardw
Super User

First a reminder to make sure that your data step runs and creates what is expected. Yours doesn't provide an informat for datetime and one of your values has a decimal in the wrong place.

 

This sort of question is one place where the power of the SAS Format comes to the fore. Groups created with a Format can be used in analysis, reporting and most formats will work with graphs. The following shows creating mean, median and standard deviation. Basic statistics are easy.  If you want a data set then use the version with output

 

data have;
	input id datetime :datetime25. hr resp;
	format datetime datetime22.3;
	cards;
1 19MAR2009:13:15:00.000 90 16
1 19MAR2009:13:17:00.000 70 18
1 19MAR2009:13:18:00.000 80 17
1 19MAR2009:13:22:00.000 100 22
1 23MAR2009:10:36:00.000 83 21
2 29OCT2009:10:36:00.000 121 13
2 29OCT2009:10:38:00.000 83 14
2 08JUL2011:13:04:00.000 131 18
2 07SEP2016:14:26:59.000 50 12
3 06APR2017:13:39:00.000 76 18
3 08JUL2017:13:04:00.000 98 17
3 08JUL2018:13:08:00.000 96 18
3 24OCT2019:15:11:00.000 80 19
;


proc means data=have mean median std;
   class id datetime;
   format datetime dtmonyy.;
   var hr resp;
run;


proc means data=have mean median std;
   class id datetime;
   format datetime dtyear.;
   var hr resp;
run;

/* creating an output data set*/
proc means data=have noprint nway;
   class id datetime;
   format datetime dtyear.;
   var hr resp;
   output out=want mean= median= std= /autoname;
run;

Autoname places the statistic as a suffix to the variable name. If you only have one statistic and use median = ; then  your result statistic has the same name as the variable. Or you can check syntax on creating your own names.

luch25
Obsidian | Level 7

Thanks @ballardw , and sorry about the data step not running correctly. It looks like this isn't creating annual means in the example output because it's giving a monthly average, based on months that are available. Instead, I would like an annual average that starts from each individual's unique first datetime encounter. If an individual does not have any qualifying measurements for the year, then it will be set to missing (which is what happens with id=2). 

Thank you again for your thoughts!

luch25
Obsidian | Level 7

i'm so sorry. The code does work once I print the report. Is there a way to get a row for years that have missing data though? For example, in the example, 2011, 2012, 2013, 2014 would need a row with missing values for id=2. 

Apologies again and thank you!

ballardw
Super User

You are going to have provide your definition of "month" perhaps for use as "anniversary" dates. Will you accept the same day of the month in the following month? Where does that leave things that want Feb29 as an anniversary (since that only appears once every 4 years generally)?

 

See if this indicates the number of 'years' or 'months' correctly from your anniversary rules.

Everything in the first "year" will have 0 (same year) for yearint and then increase when more than a year. Similar with monthint

data need;
   set have;
   by id; /*<= implies data is sorted by ID*/
   retain basedatetime;
   if first.id then basedatetime=datetime;
   yearint = intck('dtyear',basedatetime,datetime,'C');
   monthint = intck('dtmon',basedatetime,datetime,'C');
run;

If the boundaries look right then you can play with something to make more meaningful year/month indicator than the yearint/monthint. Or perhaps a single variable using intck('dtday'   ) could be used.

I noticed that your OUTPUT included a great many values that were not in the example input data. So I had no way to check if the results created from the example data matched your actual want.

 

luch25
Obsidian | Level 7

Thank you @ballardw ! The first method ended up fitting my needs! Appreciate all of your help. 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 919 views
  • 1 like
  • 2 in conversation