Hi, I run this PROC REG to get results for every hour, but I get this error and the result for hour 23 is missing.
ERROR: Data set WORK.MATT is not sorted in ascending sequence. The current BY group has Hour = 23 and the next BY group has Hour = 0.
This is my code:
proc sort data=matt (where=(Month in(6,7,8)));
   by Month Hour;
run;

proc reg data=matt (where=(Month in(6,7,8)));
   model Load=Temperature;
   by Hour;
   label Load="Load (MW)" Temperature="Temperature (degrees F)";
   title 'Load vs Temperature (Summer)';
run;

Also, I should be getting results for about 900 observations for every hour and only get about 250. What is wrong with my code?
Your BY statements don't match. The sort uses BY MONTH HOUR, but the regression uses BY HOUR.
They need to be the same. I'm not sure which one should be used, since that depends on the output that you want. But the BY statement should be the same for both sorting and regression.
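If the goal is one regression per hour, pooled across June, July and August (which is what the question describes), a minimal sketch would sort by Hour alone and use the same BY statement in PROC REG. The output data set name matt_summer is made up for illustration. With the data sorted by Month and Hour, PROC REG presumably worked through only the first month's hours before hitting Hour = 0 of the next month, which would explain both the error and the low observation count.

```sas
/* Sort the summer subset by Hour into a new data set.       */
/* matt_summer is an illustrative name, not from the thread. */
proc sort data=matt (where=(Month in (6,7,8))) out=matt_summer;
   by Hour;
run;

/* One regression per hour; the BY statement matches the sort order. */
proc reg data=matt_summer;
   by Hour;
   model Load = Temperature;
   label Load = "Load (MW)"
         Temperature = "Temperature (degrees F)";
   title 'Load vs Temperature (Summer)';
run;
quit;
```

If separate results per month and hour are wanted instead, both steps would use BY Month Hour.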
There are only 24 hours so specifying "BY hour;" would give you 24 estimates.
It means your BY group isn't really hour, it's probably day and hour.
If you're using HOUR it expects all hour=0 to be the same group which is not what you want.
You need to uniquely identify each group that you want somehow. Which variables in your data set uniquely identify the group? If you don't have a day, what happens if the data is out of order? How can you be sure which hour goes with which day? Relying solely on order is not a good idea.
And note that if you only have a single measurement per group (i.e. one record per day/hour combination), you would get no results.
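A quick way to see how many records actually fall into each BY group is to count them before running the regression. This is only a sketch using the variables mentioned in the question (Month and Hour).

```sas
/* Count summer observations per hour; each count is the number of */
/* records PROC REG would see in that BY group when using BY Hour. */
proc freq data=matt (where=(Month in (6,7,8)));
   tables Hour / nocum nopercent;
run;
```

If the counts per hour are close to the expected 900, then BY Hour on data sorted by Hour should give 24 regressions with the full sample in each.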
@matt23 wrote:
How would I fix this? How would I specify that I want all data from months 6,7,8 and that I want it separately for every hour?
Should I just do this for every hour then?
proc reg data=matt (where=(Month in (6,7,8) and Hour in (1)));
I'm sure you know this, but your data likely violates the assumptions of linear regression, because hourly load data tends to be serially correlated.
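One possible check is the Durbin-Watson statistic, which PROC REG prints when the DW option is added to the MODEL statement. The statistic only makes sense when the observations within each hour are in time order, so the sketch below assumes a date variable exists. Date and matt_summer are hypothetical names, not taken from the thread.

```sas
/* Date is a hypothetical variable; substitute whatever identifies the day. */
proc sort data=matt (where=(Month in (6,7,8))) out=matt_summer;
   by Hour Date;
run;

/* DW requests the Durbin-Watson test for first-order autocorrelation */
/* of the residuals, computed separately within each Hour group.      */
proc reg data=matt_summer;
   by Hour;
   model Load = Temperature / dw;
run;
quit;
```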