BookmarkSubscribeRSS Feed
myao
Calcite | Level 5

Hi,

 

I have a script as follows.

 

data work.tst001;
	input x_ch $ y_ch $ z;

	datalines;
a A 1
a A 2
a B 1
a B 2
b A 1
b B 1
b B 2
;
run;

proc sort data=work.tst001; by x_ch y_ch z; run;

data work.tst004;
	set work.tst001;
	by x_ch y_ch z; 

	if first.z;
run;

tst004 should have 6 observations. But the code above chooses all observations in tst001. Why?

 

Thank you in advance.

 

MYao

8 REPLIES 8
PeterClemmensen
Tourmaline | Level 20

Because first.z evaluates to 1 for each of your records. See the code below. 

 

data work.tst001;
	input x_ch $ y_ch $ z;

	datalines;
a A 1
a A 2
a B 1
a B 2
b A 1
b B 1
b B 2
;
run;

proc sort data=work.tst001; by x_ch y_ch z; run;

data work.tst004;
	set work.tst001;
	by x_ch y_ch z; 
   first_z=first.z;
	if first.z;
run;
Tom
Super User Tom
Super User

Which observations are you expecting it to exclude?

497   data _null_;
498     set tst001;
499     by x_ch y_ch z ;
500     put (_all_ first.x_ch first.y_ch first.z) (=);
501   run;

x_ch=a y_ch=A z=1 FIRST.x_ch=1 FIRST.y_ch=1 FIRST.z=1
x_ch=a y_ch=A z=2 FIRST.x_ch=0 FIRST.y_ch=0 FIRST.z=1
x_ch=a y_ch=B z=1 FIRST.x_ch=0 FIRST.y_ch=1 FIRST.z=1
x_ch=a y_ch=B z=2 FIRST.x_ch=0 FIRST.y_ch=0 FIRST.z=1
x_ch=b y_ch=A z=1 FIRST.x_ch=1 FIRST.y_ch=1 FIRST.z=1
x_ch=b y_ch=B z=1 FIRST.x_ch=0 FIRST.y_ch=1 FIRST.z=1
x_ch=b y_ch=B z=2 FIRST.x_ch=0 FIRST.y_ch=0 FIRST.z=1
PGStats
Opal | Level 21

FIRST.Z will pick up the first obs from the group defined by x_ch, y_ch and z values. I think you should use first.y_ch. It will pick the first obs from the group defined by x_ch and y_ch.

PG
Kurt_Bremser
Super User

When several variables appear in a by statement, they constitute a hierarchy from left to right. A group change further left forces a change in all levels to the right. E.g.

by a b c;

A change in b will automatically force a change in c, a change in a will force a change in b and c.

PeterClemmensen
Tourmaline | Level 20

@myao my guess is that this is what you want to do? 

 

data work.tst004;
	set work.tst001;
	by z notsorted;
	if first.z;
run;

The result is the six obs

 

x_ch y_ch z 
a    A    1 
a    A    2 
a    B    1 
a    B    2 
b    A    1 
b    B    2 
ballardw
Super User

@myao wrote:

Hi,

 

I have a script as follows.

 

tst004 should have 6 observations. But the code above chooses all observations in tst001. Why?

 

Thank you in advance.

 

MYao


As others have asked:  which 6 do you expect in the output?

myao
Calcite | Level 5

Thank you all the your replies.

myao
Calcite | Level 5

Thank you all for your replies.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 903 views
  • 0 likes
  • 6 in conversation