turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- BI
- /
- Enterprise Guide
- /
- Sorting issue, missing observations

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

06-28-2018 11:24 AM

Hi, I run this PROC REG to get results for every hour and I get this error and I miss a result for hour 23.

ERROR: Data set WORK.MATT is not sorted in ascending sequence. The current BY group has Hour = 23 and the next BY group has Hour = 0.

This is my code:

proc sort data=matt (where=(Month in(6,7,8))); by Month Hour; run; proc reg data=matt (where=(Month in(6,7,8))); model Load=Temperature; by Hour; label Load="Load (MW)" Temperature="Temperature (degrees F)"; title 'Load vs Temperature (Summer)'; run;

Also, I should be getting results for about 900 observation for every hour and only get about 250. What is wrong with my code?

Accepted Solutions

Solution

06-28-2018
11:47 AM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to matt23

06-28-2018 11:43 AM

You switch your BY statements. Sorting uses BY MONTH HOUR, but regression uses BY HOUR.

They need to be the same. I'm not sure which one should be used, since that depends on the output that you want. But the BY statement should be the same for both sorting and regression.

All Replies

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to matt23

06-28-2018 11:30 AM

@matt23 wrote:

Hi, I run this PROC REG to get results for every hour and I get this error and I miss a result for hour 23.

ERROR: Data set WORK.MATT is not sorted in ascending sequence. The current BY group has Hour = 23 and the next BY group has Hour = 0.This is my code:

proc sort data=matt (where=(Month in(6,7,8))); by Month Hour; run; proc reg data=matt (where=(Month in(6,7,8))); model Load=Temperature; by Hour; label Load="Load (MW)" Temperature="Temperature (degrees F)"; title 'Load vs Temperature (Summer)'; run;Also,

I should be getting results for about 900 observationfor every hour and only get about 250. What is wrong with my code?

There are only 24 hours so specifying "BY hour;" would give you 24 estimates.

It means your BY group isn't really hour, it's probably day and hour.

If you're using HOUR it expects all hour=0 to be the same group which is not what you want.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

06-28-2018 11:33 AM

How would I fix this? How would I specify that I want all data from months 6,7,8 and that I want it separately for every hour?

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to matt23

06-28-2018 11:34 AM

You need to uniquely identify each group that you want somehow. Which variables in your data set uniquely identify the group? If you don't have a day, what happens if the data is out of order? How can you be sure which hour goes with which day? Relying solely on order is not a good idea.

And note that if you only have a single measurement per record (ie per hour/day) then you would get no results.

@matt23 wrote:

How would I fix this? How would I specify that I want all data from months 6,7,8 and that I want it separately for every hour?

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

06-28-2018 11:37 AM - edited 06-28-2018 11:39 AM

Should I just do this for every hour then ?

proc reg data=matt (where=(Month in(6,7,8)) AND (Hour in(1)));

Solution

06-28-2018
11:47 AM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to matt23

06-28-2018 11:43 AM

You switch your BY statements. Sorting uses BY MONTH HOUR, but regression uses BY HOUR.

They need to be the same. I'm not sure which one should be used, since that depends on the output that you want. But the BY statement should be the same for both sorting and regression.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

06-28-2018 11:47 AM

This solved it. Thank you so much !!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to matt23

06-28-2018 11:54 AM

I'm sure you know but your data violates the assumptions for linear regression because there's likely serial correlation.