How do I count how many consecutive days are 90 degrees or over each August (i.e. 19 consecutive days over 90 in August 1999)?
i.e. "What is longest sequence of consecutive days with temperature greater than 90?"
My code:
data demo5.aug_weather_consec;
set demo5.aug_weather1;
if Max_Temp >= 90 then consec+1;
else if Max_Temp < 90 then consec=0;
run;
proc means data=demo5.aug_weather_consec max;
var consec;
by date;
run;
Desired result:
August Consecutive Days at/or Above 90 | |
August, Year | August Consecutive Days at/or Above 90 |
1999 | 18 |
2000 | 16 |
2001 | 9 |
2002 | 27 |
2003 | 15 |
2004 | 13 |
2005 | 6 |
2006 | 7 |
2007 | 26 |
2008 | 6 |
2009 | 23 |
2010 | 23 |
2011 | 16 |
2012 | 16 |
2013 | 13 |
2014 | 7 |
2015 | 30 |
2016 | 12 |
2017 | 17 |
2018 | 19 |
I have attached my dataset.
data aug;
infile aug;
input date :date. degrees;
year = year(date);
gt90 = degrees ge 90;
format date date11.;
run;
proc summary nway data=aug;
by gt90 notsorted;
class date;
format date monyy7.;
output out=runs(drop=_type_ rename=(_freq_=runlength) where=(gt90 eq 1)) idgroup(out(date)=start) idgroup(last out(date)=end);
format start: end: date11.;
run;
proc summary data=runs nway;
class date;
output out=maxrun(drop=_:) max(runlength)=;
run;
You haven't actually asked a question.
Also, your title says you want consecutive days, but your first sentence doesn't say consecutive. So which is it?
I have revised the question.
I haven't answered the question you did not ask but is something I want to show you.
This outputs all 90+ days with start and end dates for consecutive runs. _FREQ_ is the day length of the run.
filename aug clipbrd;
data aug;
infile aug;
input date :date. degrees;
gt90 = degrees ge 90;
format date date11.;
run;
proc summary nway;
by gt90 notsorted;
class date;
format date monyy.;
output out=runs(drop=_type_ where=(gt90 eq 1)) idgroup(out(date)=start) idgroup(last out(date)=end);
format start: end: date11.;
run;
proc print;
run;
From your example data and what I think is your expected result the question you are asking is "what is longest sequence of consecutive days with temperature greater than 90". Is that a correct interpretation?
Yes, thank you.
data aug;
infile aug;
input date :date. degrees;
year = year(date);
gt90 = degrees ge 90;
format date date11.;
run;
proc summary nway data=aug;
by gt90 notsorted;
class date;
format date monyy7.;
output out=runs(drop=_type_ rename=(_freq_=runlength) where=(gt90 eq 1)) idgroup(out(date)=start) idgroup(last out(date)=end);
format start: end: date11.;
run;
proc summary data=runs nway;
class date;
output out=maxrun(drop=_:) max(runlength)=;
run;
This code and output look great. However, I am having trouble importing the data aug. When I run the first block of code, I get the error message below.
data aug;
infile aug;
input date :date. degrees;
year = year(date);
gt90 = degrees ge 90;
format date date11.;
run;
ERROR: I/O failure.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.AUG may be incomplete. When this step was stopped there were 0 observations and 4 variables.
WARNING: Data set WORK.AUG was not replaced because this step was stopped.
What could I be doing wrong when importing the data or refer to the dataset in the data step?
@KALLEN wrote:
This code and output look great. However, I am having trouble importing the data aug. When I run the first block of code, I get the error message below.
data aug; infile aug; input date :date. degrees; year = year(date); gt90 = degrees ge 90; format date date11.; run;
ERROR: I/O failure.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.AUG may be incomplete. When this step was stopped there were 0 observations and 4 variables.
WARNING: Data set WORK.AUG was not replaced because this step was stopped.
What could I be doing wrong when importing the data or refer to the dataset in the data step?
You still IMPORT your data the same way as you have done. You just need to create the indicator and then run the PROC SUMMARY steps.
Great, this worked for me - just had to alter the name of the dataset.
Thanks again.
I have a similar problem, but now I need to view the longest sequence of consecutive days with temperature greater than 90° for each period Jan-Aug 1999 - 2018 (YTD).
Below is similar code I used from the last post, but now I need to change proc summary such that the data from Jan - Aug for runlength are added together for each year:
/* create indicator variable for Max Temp >= 90 (gt90) */
data ytd_weather1;
set demo5.ytd_weather;
gt90 = Max_Temp ge 90;
format date date11.;
run;
/* create start and end dates for the beginning and end of each month */
proc summary nway data=ytd_weather1;
by gt90 notsorted;
class date;
format date monyy7.;
output out=runs(drop=_type_ rename=(_freq_=runlength) where=(gt90 eq 1)) idgroup(out(date)=start) idgroup(last out(date)=end);
format start: end: date11.;
run;
/* run the proc summary to view the longest sequence of consecutive days with temperature greater than 90 for each YTD AUG 1999 - 2018 */
proc summary data=runs nway;
class date;
output out=maxrun(drop=_:) max(runlength)=;
run;
current output:
Date | runlength |
Apr1999 | 2 |
May1999 | 5 |
Jun1999 | 10 |
Jul1999 | 9 |
Aug1999 | 18 |
Apr2000 | 5 |
May2000 | 9 |
Jun2000 | 8 |
Jul2000 | 31 |
Aug2000 | 16 |
Apr2001 | 1 |
May2001 | 9 |
Jun2001 | 13 |
Jul2001 | 31 |
Aug2001 | 9 |
Apr2002 | 2 |
May2002 | 4 |
Jun2002 | 24 |
Jul2002 | 10 |
Aug2002 | 28 |
May2003 | 8 |
Jun2003 | 24 |
Jul2003 | 28 |
Aug2003 | 15 |
May2004 | 11 |
Jun2004 | 21 |
Jul2004 | 23 |
Aug2004 | 13 |
May2005 | 7 |
Jun2005 | 26 |
Jul2005 | 26 |
Aug2005 | 6 |
Apr2006 | 2 |
May2006 | 14 |
Jun2006 | 25 |
Jul2006 | 19 |
Aug2006 | 7 |
May2007 | 5 |
Jun2007 | 18 |
Jul2007 | 7 |
Aug2007 | 27 |
Apr2008 | 1 |
May2008 | 7 |
Jun2008 | 23 |
Jul2008 | 7 |
Aug2008 | 6 |
May2009 | 12 |
Jun2009 | 8 |
Jul2009 | 31 |
Aug2009 | 23 |
May2010 | 6 |
Jun2010 | 28 |
Jul2010 | 13 |
Aug2010 | 23 |
Apr2011 | 1 |
May2011 | 7 |
Jun2011 | 30 |
Jul2011 | 28 |
Aug2011 | 17 |
Mar2012 | 1 |
Apr2012 | 6 |
May2012 | 15 |
Jun2012 | 30 |
Jul2012 | 15 |
Aug2012 | 16 |
Apr2013 | 2 |
May2013 | 7 |
Jun2013 | 30 |
Jul2013 | 12 |
Aug2013 | 13 |
May2014 | 6 |
Jun2014 | 30 |
Jul2014 | 15 |
Aug2014 | 7 |
May2015 | 3 |
Jun2015 | 30 |
Jul2015 | 31 |
Aug2015 | 31 |
May2016 | 6 |
Jun2016 | 25 |
Jul2016 | 30 |
Aug2016 | 12 |
Mar2017 | 1 |
Apr2017 | 2 |
May2017 | 4 |
Jun2017 | 28 |
Jul2017 | 15 |
Aug2017 | 17 |
Apr2018 | 2 |
May2018 | 14 |
Jun2018 | 15 |
Jul2018 | 21 |
Aug2018 | 19 |
desired output:
Date | runlength |
1999 | 44 |
2000 | 69 |
2001 | 63 |
2002 | 68 |
2003 | 75 |
2004 | 68 |
2005 | 65 |
etc to 2018
Thanks.
I had to delete obs where runlength=1 and then add the proc timeseries step to convert monthly to annual data:
/* drop any obs where runlength=1 because that is not a consecutive day >=90 */
data maxrun;
set maxrun;
if runlength=1 then delete;
run;
/*convert monthly data to yearly data by adding sum of runlength */
proc timeseries data=maxrun out=runs_ytd;
id date interval=year accumulate=total;
var runlength;
run;
This gave me the expected output.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.