KALLEN
Obsidian | Level 7

I need to calculate the 5, 10, and 20 year averages of August days at or over 90° F for 1999 - 2018 and then add these averages to an already existing dataset (demo5.aug_yearly_weather). I want to do this in the data step rather than with proc means.

 

This is my code:

 

 

/* Create the 5, 10, 20 year averages variable within the dataset aug_yearly_weather */
data demo5.aug_yearly_weather1;
        set demo5.aug_yearly_weather;
        Five_year_avg_90= mean(of '01Aug2013'd - '31Aug2018'd), Max_Temp_90);
        Ten_year_avg_90= mean(of '01Aug2008'd - '31Aug2018'd), Max_Temp_90);
        Twenty_year_avg_90=  mean(Max_Temp_90);
run;

 

 

This is the end result desired:

 

Date  Max_Temp_90  5 Year Avg  10 Year Avg  20 Year Avg
1999  25           24          26           25
2000  27           24          26           25
2001  25           24          26           25
2002  29           24          26           25
2003  27           24          26           25
2004  22           24          26           25
2005  19           24          26           25
2006  11           24          26           25
2007  27           24          26           25
2008  17           24          26           25
2009  29           24          26           25
2010  27           24          26           25
2011  30           24          26           25
2012  30           24          26           25
2013  26           24          26           25
2014  17           24          26           25
2015  31           24          26           25
2016  20           24          26           25
2017  25           24          26           25
2018  28           24          26           25

 

Help is greatly appreciated. 

ballardw
Super User

What is the rationale behind not using a very appropriate tool such as Proc Means/Summary?

 

Since you say you also want to "add these 5, 10, 20 year averages to an already existing dataset", you need to show how you want the data to look after adding them, as that can be interpreted in a number of ways.

 

Your attached example data also apparently does not include any date values, only the year.

KALLEN
Obsidian | Level 7

Thank you, I have updated the post with the desired result. As for the reason to use a data step and not proc means: I want to keep the 5, 10, and 20 year averages in the same dataset so I can create a bar-line chart. I have already used proc means, but I do not know how to take its results and put them into the data step.

 

/*find 5 year average of the number of days in August that were =>90 by year (2014 - 2018)*/
proc means data=demo5.aug_yearly_weather mean maxdec=0;
	vars Max_Temp_90;
	where date between '01aug2013'd and '31Aug2018'd; /* 5 year average includes 2014 - 2018, but must specify dates between 2013 and 2018*/
	title "5 Year Average";
	output out=aug_5_year_avg_90;
run;

/*find 10 year average of the number of days in August that were =>90 by year (2009 - 2018)*/

proc means data=demo5.aug_yearly_weather mean maxdec=0;
	where date between '01aug2008'd and '31Aug2018'd; /* 10 year average includes 2009 - 2018, but must specify dates between 2008 and 2018*/
	vars Max_Temp_90;
	title "10 Year Average";
	output out=demo5.aug_10_yr_avg_90;
run;

/*find 20 year average of the number of days in August that were =>90 by year (1999 - 2018)*/
proc means data=demo5.aug_yearly_weather mean maxdec=0;
	vars Max_Temp_90;  /* no need to specify date range because this includes the entire dataset */
	title "20 Year Average";
	output out=demo5.aug_20_yr_avg_90;
run;

 

Reeza
Super User
Your sample data doesn't match your code, but hopefully you can align the solution I provided with your actual data.
KALLEN
Obsidian | Level 7

Is there a way to add the output of proc means to the existing dataset in the data step?

 

When I attempt to, I receive this message:

 

ERROR: DATA STEP Component Object failure. Aborted during the COMPILATION phase.
ERROR 557-185: Variable demo5 is not an object.

 

This is my code:

data demo5.aug_yearly_weather1; 
	set demo5.aug_yearly_weather;
  	Five_year_avg_90 = demo5.aug_5_year_avg_90;
Ten_year_avg_90 =  demo5.aug_10_year_avg_90;
Twenty_year_avg_90 =  demo5.aug_20_year_avg_90;
run;

 


Reeza
Super User

You can't assign a dataset to a variable, so you would need to merge the values in or something like that. You would first need to modify PROC MEANS so it gives you three variables in an output dataset; use OUTPUT OUT= with MEAN= for that.

See a mockup below:

 

 

proc means ....
<same as previous>;

output out=summary_stats mean= / autoname;
run;

data combined;
set old_table;
if _n_=1 then set summary_stats;
run;
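
For illustration only, here is one way the mockup above might be filled in. It is a sketch under a few assumptions: the data is the annual sample from this thread (Date holds a 4-digit year, so the WHERE clauses use years rather than date literals), and the dataset names have, avg5, avg10, avg20, and combined are placeholders rather than names from the original post.

```sas
/* Sample data copied from this thread; Date is a 4-digit year (assumption). */
data have;
    input Date Max_Temp_90 @@;
    datalines;
1999 25 2000 27 2001 25 2002 29 2003 27 2004 22 2005 19 2006 11 2007 27 2008 17
2009 29 2010 27 2011 30 2012 30 2013 26 2014 17 2015 31 2016 20 2017 25 2018 28
;
run;

/* One PROC MEANS per window; MEAN= names the output variable directly. */
proc means data=have noprint;
    where Date between 2014 and 2018;   /* 5 years */
    var Max_Temp_90;
    output out=avg5 (keep=Five_year_avg_90) mean=Five_year_avg_90;
run;

proc means data=have noprint;
    where Date between 2009 and 2018;   /* 10 years */
    var Max_Temp_90;
    output out=avg10 (keep=Ten_year_avg_90) mean=Ten_year_avg_90;
run;

proc means data=have noprint;           /* all 20 years */
    var Max_Temp_90;
    output out=avg20 (keep=Twenty_year_avg_90) mean=Twenty_year_avg_90;
run;

/* Each summary dataset has one row. Reading it once on the first  */
/* iteration is enough, because variables read with SET are        */
/* retained and so carry onto every row of HAVE.                   */
data combined;
    set have;
    if _n_ = 1 then do;
        set avg5;
        set avg10;
        set avg20;
    end;
run;
```

With this sample data the unrounded means are 24.2, 26.3, and 24.6, so a format such as 3. or the ROUND function would be needed to show the whole numbers in the desired result.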
Reeza
Super User

Why do you need this in a data step rather than a PROC MEANS? It's more efficient to use PROC MEANS because you can use a Multilabel format that will handle multiple time periods and overlaps. 

 

 


@KALLEN wrote:

I need to calculate the 5, 10 , and 20 year averages of August Days at/over 90° F from 1999 - 2018 and then add these 5, 10, 20 year averages to an already existing dataset (demo5.aug_yearly_weather).

 

Your code doesn't seem to align with your data structure: the data has only annual values, while the code seems to expect monthly or daily dates.

 

I suspect you need a format and PROC MEANS.

 

data have;
input Date	Max_Temp_90;
cards;
1999	25
2000	27
2001	25
2002	29
2003	27
2004	22
2005	19
2006	11
2007	27
2008	17
2009	29
2010	27
2011	30
2012	30
2013	26
2014	17
2015	31
2016	20
2017	25
2018	28
;
run;


proc format;
value year_mft (multilabel)
2014-2018 = '5 year'
2009-2018 = '10 year'
1999-2018 = '20 year'
;
run;

proc means data=have nway stackods N Mean;
class date / mlf;
format date year_mft.;
var Max_Temp_90;
ods output summary=want;
run;

proc print data=want;
run;
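
If the three averages then need to sit on every row of the original table, something like the following might work. It is a sketch, not a tested solution: it assumes the have dataset and the year_mft format from the code above have already been run, it swaps the ODS OUTPUT for an OUTPUT OUT= dataset (with MLF, the class variable in that dataset is a character string holding the format label), and the names stats, avgs, and want_wide are made up for the example.

```sas
/* Assumes HAVE and the YEAR_MFT. format from the code above exist. */
proc means data=have nway noprint;
    class Date / mlf;
    format Date year_mft.;
    var Max_Temp_90;
    /* With MLF, Date in STATS is character: '5 year', '10 year', '20 year'. */
    output out=stats (keep=Date avg) mean=avg;
run;

/* Pivot the three rows (one per label) into one row of three variables. */
data avgs (keep=Five_year_avg_90 Ten_year_avg_90 Twenty_year_avg_90);
    set stats end=last;
    retain Five_year_avg_90 Ten_year_avg_90 Twenty_year_avg_90;
    select (strip(Date));
        when ('5 year')  Five_year_avg_90   = round(avg);
        when ('10 year') Ten_year_avg_90    = round(avg);
        when ('20 year') Twenty_year_avg_90 = round(avg);
        otherwise;
    end;
    if last then output;
run;

/* Attach the single summary row to every row of the original data. */
data want_wide;
    set have;
    if _n_ = 1 then set avgs;
run;
```

The ROUND calls are there only to match the whole-number averages in the desired result; drop them to keep the exact means.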
mkeintz
PROC Star

Questions:

 

  1. You apparently want  trailing 5, 10 and 20 year "averages" for 2018, and then assign those results to every year from 1999-2018, yes?  Otherwise why are the 5 and 10 year "averages" constant over 20 years?

  2. Also, I put "averages" in quotes because you also apparently want Max_Temp_90 in each average.  So the "five-year average" would have an N of 6.  And your illustrative attempt for the five-year average includes 2013 through 2018 (6 obs, not 5), so your "five-year average" really has an N of 7.  What is your actual intent?
  3. Could you put your sample data in the form of a complete data step?
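
If the answer to question 1 is yes, and the intent in question 2 is plain averages with an N of 5, 10, and 20, then a single data step could be sketched as below. This is only an illustration under those assumptions: it reads the data twice (once to accumulate, once to write), uses the annual sample from this thread with Date as a 4-digit year, and the names have and want plus the sum and n counters are placeholders.

```sas
/* Sample data copied from this thread; Date is a 4-digit year (assumption). */
data have;
    input Date Max_Temp_90 @@;
    datalines;
1999 25 2000 27 2001 25 2002 29 2003 27 2004 22 2005 19 2006 11 2007 27 2008 17
2009 29 2010 27 2011 30 2012 30 2013 26 2014 17 2015 31 2016 20 2017 25 2018 28
;
run;

data want (drop=sum5 sum10 sum20 n5 n10 n20);
    /* Pass 1: on the first iteration only, read every row and accumulate. */
    /* Sum statements (var + expr) start at 0 and are retained.            */
    if _n_ = 1 then do until (eof);
        set have end=eof;
        sum20 + Max_Temp_90;
        n20 + 1;
        if Date >= 2009 then do;   /* trailing 10 years: 2009-2018 */
            sum10 + Max_Temp_90;
            n10 + 1;
        end;
        if Date >= 2014 then do;   /* trailing 5 years: 2014-2018 */
            sum5 + Max_Temp_90;
            n5 + 1;
        end;
    end;

    /* Pass 2: read the rows again, one per iteration, and attach the  */
    /* same three averages to each. ROUND matches the whole numbers in */
    /* the desired result.                                             */
    set have;
    Five_year_avg_90   = round(sum5  / n5);
    Ten_year_avg_90    = round(sum10 / n10);
    Twenty_year_avg_90 = round(sum20 / n20);
run;
```

With this sample data the three averages come out as 24, 26, and 25 on every row, which matches the desired result posted above. The cutoff years are hard-coded here; if the last year can change, they would need to be derived from the data instead.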

 

 

