/*In my study, I want to flag all the rows before 1 by each group.
They are already sorted by time, and each group will have 9 records and will have and only have one "1" marked, and I want to select _n_ < rownum(when x=1) */
Data have;
INPUT Group $ ID X;
DATALINES;
A 01 0
A 02 0
A 03 0
A 04 0
A 05 1
A 06 0
A 07 0
A 08 0
A 09 0
B 01 0
B 02 0
B 03 0
B 04 0
B 05 0
B 06 0
B 07 0
B 08 1
B 09 0
;
Data want;
INPUT Group $ ID X Y;
DATALINES;
A 01 0 Y
A 02 0 Y
A 03 0 Y
A 04 0 Y
A 05 1 Y
A 06 0
A 07 0
A 08 0
A 09 0
B 01 0 Y
B 02 0 Y
B 03 0 Y
B 04 0 Y
B 05 0 Y
B 06 0 Y
B 07 0 Y
B 08 1 Y
B 09 0
;
@Yuna11 wrote:
Hi Tom, I just edited my question a little bit, hopefully it is more clear now. The data is sorted by time and each group will have a 1 and only one "1" either at middle of somewhere or at the end. And I want to flag the rows before 1.
That is even easier.
data want;
set have;
by group ;
if first.group then y='Y';
if lag(x)=1 and not first.group then y='N';
retain y;
run;
The reason there are two IF statements instead of just an ELSE is because you need to execute the LAG() function for every observation for it to work properly. The LAG() function remembers the values that are passed into it, so if you skip executing it for some observations then those values of X never make into the list of values that can be returned by the LAG() function.
You could also skip the LAG() function and instead insert an OUTPUT statement so you can change the value of Y after it has already been written to the output dataset.
data want;
set have;
by group ;
if first.group then y='Y';
retain y;
output;
if x=1 then y='N';
run;
What is "row index number"?
Why not just start the value as Y and then change it to N after you see the first one? That works for your two example groups.
data want;
set have ;
by group x notsorted;
if first.group then y='Y';
else if first.x and x=0 then y='N';
retain y;
run;
But what do you want to do with groups that never have a one?
What about groups that have multiple observations with a one?
@Yuna11 wrote:
Hi Tom, I just edited my question a little bit, hopefully it is more clear now. The data is sorted by time and each group will have a 1 and only one "1" either at middle of somewhere or at the end. And I want to flag the rows before 1.
That is even easier.
data want;
set have;
by group ;
if first.group then y='Y';
if lag(x)=1 and not first.group then y='N';
retain y;
run;
The reason there are two IF statements instead of just an ELSE is because you need to execute the LAG() function for every observation for it to work properly. The LAG() function remembers the values that are passed into it, so if you skip executing it for some observations then those values of X never make into the list of values that can be returned by the LAG() function.
You could also skip the LAG() function and instead insert an OUTPUT statement so you can change the value of Y after it has already been written to the output dataset.
data want;
set have;
by group ;
if first.group then y='Y';
retain y;
output;
if x=1 then y='N';
run;
Please explain why if you want "also all the rows after 1 by each group. " that all the rows actually after the 1 are marked N in your "want" data set?
I don't see anything in your "want" that indicates a selection of "after". So you need to very clearly describe what that would look like.
In a general use of SAS aside: recommend use 1 for Yes/True/of a property or similar and 0 for No/False/not of a property if there are only two values. For one thing SAS treats 1/0 as true/false internally.
So
x = (a > b);
assigns values of 1 to x when a is greater than b. To use Y/N letters you have to use If/Then/Else or some other more complex code. It can also be much easier to report on the 1's without including the 0s by using Sum and mean(percent of 1's).
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.