Sorting issue, missing observations

Frequent Contributor
Posts: 89
Hi, I run this PROC REG to get results for every hour, but I get the error below and no result for hour 23.

ERROR: Data set WORK.MATT is not sorted in ascending sequence. The current BY group has Hour = 23 and
       the next BY group has Hour = 0.

This is my code:

proc sort data=matt (where=(Month in(6,7,8)));
by Month Hour;
run;

proc reg data=matt (where=(Month in(6,7,8)));
model Load=Temperature;
by Hour;
label Load="Load (MW)" Temperature="Temperature (degrees F)";
title 'Load vs Temperature (Summer)';
run;

Also, I should be getting results based on about 900 observations for every hour, but I only get about 250. What is wrong with my code?

 


Accepted Solutions
Solution
‎06-28-2018 11:47 AM
Super User
Posts: 6,928

Re: Sorting issue, missing observations

You switched your BY statements. Sorting uses BY MONTH HOUR, but the regression uses BY HOUR.

 

They need to be the same.  I'm not sure which one should be used, since that depends on the output that you want.  But the BY statement should be the same for both sorting and regression.
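
A minimal sketch of the matching-BY fix, assuming the goal is one regression per hour pooled across all three summer months (24 BY groups). The output data set name summer is hypothetical and not from the original post; writing the subset to OUT= also avoids sorting a WHERE-filtered data set in place.

```sas
/* Subset to the summer months and sort by the same key PROC REG will use. */
/* "summer" is a hypothetical output data set name.                        */
proc sort data=matt (where=(Month in (6,7,8))) out=summer;
  by Hour;
run;

/* One regression per hour; the BY statement matches the sort order. */
proc reg data=summer;
  by Hour;
  model Load=Temperature;
  label Load="Load (MW)" Temperature="Temperature (degrees F)";
  title 'Load vs Temperature (Summer)';
run;
quit;
```

If separate fits per month and hour are wanted instead, use BY Month Hour in both steps. With the original BY Month Hour sort, PROC REG reads only the first month's block of hours before it meets Hour = 0 again, which would plausibly explain seeing roughly a third of the expected observations per hour.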


All Replies
Super User
Posts: 23,992

Re: Sorting issue, missing observations

 

 


There are only 24 hours, so specifying "BY Hour;" would give you 24 estimates.

That means your BY group isn't really hour; it's probably day and hour.

If you're using BY HOUR, SAS expects all Hour=0 records to be in the same group, which is not what you want.

Frequent Contributor
Posts: 89

Re: Sorting issue, missing observations

How would I fix this? How would I specify that I want all data from months 6,7,8 and that I want it separately for every hour?
Super User
Posts: 23,992

Re: Sorting issue, missing observations

You need to uniquely identify each group that you want. Which variables in your data set uniquely identify the group? If you don't have a day variable, what happens if the data is out of order? How can you be sure which hour goes with which day? Relying solely on row order is not a good idea.

And note that if you only have a single measurement per record (i.e. per hour/day) and each of those is its own BY group, then you would get no results.
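
A quick way to check how many records feed each group, sketched under the assumption that Hour alone is the intended grouping variable for the summer months:

```sas
/* Count summer observations per hour to see how many rows each BY group gets. */
proc freq data=matt (where=(Month in (6,7,8)));
  tables Hour / nocum nopercent;
run;
```

If the counts here are far from what PROC REG reports as observations used, the BY groups in the regression are not the groups you intended.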

 


Frequent Contributor
Posts: 89

Re: Sorting issue, missing observations

Should I just do this for every hour, then?

proc reg data=matt (where=(Month in (6,7,8) and Hour in (1)));
Frequent Contributor
Posts: 89

Re: Sorting issue, missing observations

Posted in reply to Astounding
This solved it. Thank you so much!
Super User
Posts: 23,992

Re: Sorting issue, missing observations

I'm sure you know, but your data probably violates the assumptions of linear regression because there's likely serial correlation.
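
One way to look for this is the Durbin-Watson statistic, which PROC REG computes in the order the rows appear. The sketch below assumes a hypothetical Date variable (not mentioned in the original post) so that rows within each hour are in time order; summer_ts is likewise a hypothetical data set name.

```sas
/* Assumes a hypothetical Date variable; sort so rows within each hour are in time order. */
proc sort data=matt (where=(Month in (6,7,8))) out=summer_ts;
  by Hour Date;
run;

proc reg data=summer_ts;
  by Hour;
  model Load=Temperature / dw;  /* DW requests the Durbin-Watson statistic */
run;
quit;
```

Values well below 2 would suggest positive autocorrelation in the residuals, in which case the reported standard errors should be treated with caution.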

 
