Vibhaa
Fluorite | Level 6

Hi All, 

 

I have a question about the sort order required when using the GROUPFORMAT option on the BY statement.

When I run the code below, it runs without any error:

 

proc format;
  value even_odd
    1,3,5,7 = 'odd'
    2,4,6,8 = 'even'
    other   = 'Big'
  ;
run;
data a;
input variable @@;
format variable even_odd.;
cards;
1 3 5 1 3 5 2 4 6 2 4 6 
;
data b;
set a;
by groupformat variable;
if first.variable then output;
run;

Log of the above code:

 

404  data b;
405  set a;
406  by groupformat variable;
407  if first.variable then output;
408  run;

NOTE: There were 12 observations read from the data set WORK.A.
NOTE: The data set WORK.B has 2 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.01 seconds

But if I change the data and run it again, I get a sort-order error:

data a;
input variable @@;
format variable even_odd.;
cards;
1 3 6  4 5 1 3 5 2 4 6 3 1 2 4  5 6 
;

data b;
set a;
by groupformat variable;
if first.variable then output;
run;

 Log:

416  data b;
417  set a;
418  by groupformat variable;
419  if first.variable then output;
420  run;

ERROR: BY variables are not properly sorted on data set WORK.A.
variable=even FIRST.variable=1 LAST.variable=0 _ERROR_=1 _N_=4
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 5 observations read from the data set WORK.A.
WARNING: The data set WORK.B may be incomplete.  When this step was stopped there were 2
         observations and 1 variables.
WARNING: Data set WORK.B was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds

I know that when the GROUPFORMAT option is used, the data set must be sorted. But why does the first code run without any error when its data is not sorted either? Can anyone please help me with this?

 

Thanks!

 

2 REPLIES
Jagadishkatam
Amethyst | Level 16

Please check the data sets created by the first DATA step in both the old and the new code before moving on to the next DATA step.

In the old code the raw data already arrives in grouped order (all the odd values, then all the even values), so no sorting is needed.

In the new code the raw data is not in that order, which is why you get the error.

So the issue here is the order of the raw data.

To avoid it, add a PROC SORT step before the second DATA step.
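
One caveat on the PROC SORT suggestion: PROC SORT orders rows by the stored values, not the formatted ones, so sorting the second data set on variable itself gives 1 1 1 2 2 3 3 3 ..., which still alternates between odd and even. A possible workaround, sketched below, is to store the formatted value in its own character variable and sort on that instead. This assumes the even_odd. format and data set A from the question; the names A2 and GRP are made up for illustration.

```sas
/* Assumes the even_odd. format and data set A from the question.
   A2 and GRP are illustrative names, not from the thread. */
data a2;
  set a;
  length grp $8;
  grp = put(variable, even_odd.);  /* formatted value as a real character variable */
run;

proc sort data=a2;
  by grp;                          /* sorts on the text, so "even" comes before "odd" */
run;

data b;
  set a2;
  by grp;                          /* plain BY is enough because A2 is sorted by GRP */
  if first.grp then output;        /* one row per group */
run;
```

Because GRP is an ordinary sorted variable, the last step needs neither GROUPFORMAT nor NOTSORTED.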

 


Thanks,
Jag
Reeza
Super User

GROUPFORMAT tells SAS to apply the format to the variable and use the formatted values to define the BY groups. So for your first data set, SAS sees the following sequence, which is clearly ordered:

 

odd odd odd odd odd odd even even even even even even

However, in the second data set the even and odd values are mixed, so the formatted values are not sorted/ordered as expected:

 

odd odd even even odd odd odd odd even even even...
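
To see these sequences for yourself, one option is to print the formatted value next to the stored one. This is only a minimal sketch, assuming the even_odd. format and data set A from the question; CHECK and GRP are hypothetical names.

```sas
/* Assumes the even_odd. format and data set A from the question.
   CHECK and GRP are illustrative names, not from the thread. */
data check;
  set a;
  length grp $8;
  grp = put(variable, even_odd.);  /* the value that GROUPFORMAT groups on */
  format variable best12.;         /* show the stored number instead of the label */
run;

proc print data=check;
run;
```

Reading down the GRP column shows where each new BY group would start.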

 

If you want SAS to accept the data in its existing order and treat each change in the formatted value as the start of a new group, that is possible, but use it with caution. Use the NOTSORTED option on the BY statement in that case.

 

data a;
input variable @@;
format variable even_odd.;
cards;
1 3 6  4 5 1 3 5 2 4 6 3 1 2 4  5 6 
;

data b;
set a;
by  variable groupformat notsorted;
if first.variable then output;
run;
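
Keep in mind that with NOTSORTED the same formatted value can start a group more than once. As a rough illustration (assuming the same data set A and even_odd. format; RUNS and N_IN_RUN are made-up names), the step below writes one row per run of consecutive rows that share a formatted value, along with the length of that run.

```sas
/* Assumes the even_odd. format and data set A defined above.
   RUNS and N_IN_RUN are illustrative names, not from the thread. */
data runs;
  set a;
  by variable groupformat notsorted;
  if first.variable then n_in_run = 0;  /* reset at the start of each run */
  n_in_run + 1;                         /* sum statement: value is retained across rows */
  if last.variable then output;         /* one row per run */
run;

proc print data=runs;
run;
```

For the 17 values above the formatted sequence contains eight runs, so this should give eight rows rather than two. That is why NOTSORTED needs care when you expect exactly one row per group.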

 

