Hi all,
I hope you are doing well. I have a question regarding how to extract observations before and after a dummy.
This is what I have.
ID | Var1 | Var2 | Dummy |
1 | 20 | 41 | 0 |
1 | 17 | 20 | 0 |
1 | 97 | 68 | 0 |
1 | 93 | 49 | 0 |
1 | 91 | 77 | 1 |
1 | 51 | 37 | 0 |
1 | 88 | 55 | 0 |
1 | 42 | 14 | 0 |
2 | 62 | 40 | 0 |
2 | 93 | 46 | 0 |
2 | 77 | 70 | 0 |
2 | 31 | 32 | 0 |
2 | 53 | 58 | 1 |
2 | 21 | 73 | 0 |
2 | 55 | 94 | 0 |
2 | 73 | 29 | 0 |
2 | 34 | 27 | 0 |
What I need is to extract observation before, during, and after dummy variable indicating 1 (i.e., the text in red). So, the "want" dataset will be as below:
A | B | C | Dummy |
1 | 93 | 49 | 0 |
1 | 91 | 77 | 1 |
1 | 51 | 37 | 0 |
2 | 31 | 32 | 0 |
2 | 53 | 58 | 1 |
2 | 21 | 73 | 0 |
Thanks!
Best,
David
Try this
data have;
input ID Var1 Var2 Dummy;
datalines;
1 20 41 0
1 17 20 0
1 97 68 0
1 93 49 0
1 91 77 1
1 51 37 0
1 88 55 0
1 42 14 0
2 62 40 0
2 93 46 0
2 77 70 0
2 31 32 0
2 53 58 1
2 21 73 0
2 55 94 0
2 73 29 0
2 34 27 0
;
data want;
merge have
have(firstobs = 2 keep = dummy id rename = (dummy = lead_d id = lead_id));
lag_d = lag(dummy);
lag_id = lag(id);
if sum(dummy, lag_d, lead_d) and id = lag_id and id = lead_id;
drop lead_: lag_:;
run;
what about those little ugly ones (first or last in the group):
3 62 40 1
3 93 46 0
3 77 70 0
4 62 40 0
4 93 46 0
4 77 70 1
? 😉 😉
BTW @DavidLie Could we have such situation? And what about multiple dummy in one group?
Bart
data have;
input D Var1 Var2 Dummy;
cards;
1 20 41 0
1 17 20 0
1 97 68 0
1 93 49 0
1 91 77 1
1 51 37 0
1 88 55 0
1 42 14 0
2 62 40 0
2 93 46 0
2 77 70 0
2 31 32 0
2 53 58 1
2 21 73 0
2 55 94 0
2 73 29 0
2 34 27 0
;
run;
proc print;
run;
data want;
do _N_ = 1 by 1 until(last.d);
set have curobs=curobs;
by d;
if Dummy then CO=curobs;
end;
do _N_ = 1 to _N_;
set have curobs=curobs2;
if CO-1 <= curobs2 <= CO+1 then output;
end;
run;
proc print;
run;
You should be able to use the FIRST./LAST. flags to do this.
data want;
set have ;
by id dummy notsorted;
if dummy then output;
else if (first.dummy and not first.id) then output;
else if (last.dummy and not last.id) then output;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.