Hi All,
I have a question about sort order when using the GROUPFORMAT option.
When I run the code below, it runs without any error:
proc format;
  value even_odd
    1,3,5,7 = odd
    2,4,6,8 = even
    other   = Big
  ;
run;
data a;
  input variable @@;
  format variable even_odd.;
  cards;
1 3 5 1 3 5 2 4 6 2 4 6
;
data b;
  set a;
  by groupformat variable;
  if first.variable then output;
run;
Log of the above code:
404 data b;
405 set a;
406 by groupformat variable;
407 if first.variable then output;
408 run;
NOTE: There were 12 observations read from the data set WORK.A.
NOTE: The data set WORK.B has 2 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds
But if I change the data and run it again, I get a sort order error:
data a;
  input variable @@;
  format variable even_odd.;
  cards;
1 3 6 4 5 1 3 5 2 4 6 3 1 2 4 5 6
;
data b;
  set a;
  by groupformat variable;
  if first.variable then output;
run;
Log:
416 data b;
417 set a;
418 by groupformat variable;
419 if first.variable then output;
420 run;
ERROR: BY variables are not properly sorted on data set WORK.A.
variable=even FIRST.variable=1 LAST.variable=0 _ERROR_=1 _N_=4
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 5 observations read from the data set WORK.A.
WARNING: The data set WORK.B may be incomplete. When this step was stopped there were 2
observations and 1 variables.
WARNING: Data set WORK.B was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
I know that the data set must be sorted when the GROUPFORMAT option is used. But why did the first code run without any error, even though its data is not sorted either? Can anyone please help me with this?
Thanks!
Please check the data sets created by the old and new input DATA steps before going on to the next DATA step.
In the old code, the raw data already arrives grouped: all the odd values come first, followed by all the even values, so no sorting is required.
In the new DATA step, the odd and even values are mixed together, which is why you get the error.
So the issue here is the order of the raw data.
To avoid it, sort the data first so that each formatted group is contiguous, for example with a PROC SORT step.
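One thing to watch with the sort: PROC SORT orders by the stored values, not the formatted ones, so sorting by variable alone would give 1 1 1 2 2 3 3 3 ..., where odd and even still alternate. Below is a minimal sketch of one way around that, assuming the goal is one row per formatted value: store the label in a new character variable and sort by it. The names a2 and grp are only illustrative and are not from the original post; even_odd. is the format defined in the question.

```sas
/* Sketch only: a2 and grp are hypothetical names, not from the original post. */
data a2;
  set a;
  length grp $4;                    /* longest label in even_odd. has 4 characters */
  grp = put(variable, even_odd.);   /* store the formatted value as text */
run;

proc sort data=a2;
  by grp;                           /* each label is now contiguous */
run;

data b;
  set a2;
  by grp;
  if first.grp then output;         /* keep the first row of each label */
run;
```

With the second data set this should leave one row for even and one for odd.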
GROUPFORMAT means: apply the format to the variable and use the formatted value as the group. So for your first data set, SAS sees the following sequence, which is clearly grouped:
odd odd odd odd odd odd even even even even even even
In the second data set, however, odd and even are mixed, so the data is not ordered as expected:
odd odd even even odd odd odd odd even even even...
If you want SAS to accept the data as it is and treat every change in the formatted value as the start of a new group, that is possible, but use it with caution. In that case, add the NOTSORTED option to the BY statement.
data a;
  input variable @@;
  format variable even_odd.;
  cards;
1 3 6 4 5 1 3 5 2 4 6 3 1 2 4 5 6
;
data b;
  set a;
  by variable groupformat notsorted;
  if first.variable then output;
run;
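To see where SAS starts and ends each group, one option is to write the FIRST. and LAST. flags to the log. This is a small sketch that reuses data set a and the BY statement above; it creates no output data set.

```sas
/* Sketch: print the BY-group flags for each observation to the log. */
data _null_;
  set a;
  by variable groupformat notsorted;
  put _n_= variable= first.variable= last.variable=;
run;
```

With NOTSORTED, every run of the same formatted value counts as its own group, so the second data set produces more than two groups.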