Hi all,
I hope you are doing well. I have a question regarding how to extract observations before and after a dummy.
This is what I have.
ID | Var1 | Var2 | Dummy |
1 | 20 | 41 | 0 |
1 | 17 | 20 | 0 |
1 | 97 | 68 | 0 |
1 | 93 | 49 | 0 |
1 | 91 | 77 | 1 |
1 | 51 | 37 | 0 |
1 | 88 | 55 | 0 |
1 | 42 | 14 | 0 |
2 | 62 | 40 | 0 |
2 | 93 | 46 | 0 |
2 | 77 | 70 | 0 |
2 | 31 | 32 | 0 |
2 | 53 | 58 | 1 |
2 | 21 | 73 | 0 |
2 | 55 | 94 | 0 |
2 | 73 | 29 | 0 |
2 | 34 | 27 | 0 |
What I need is to extract observation before, during, and after dummy variable indicating 1 (i.e., the text in red). So, the "want" dataset will be as below:
A | B | C | Dummy |
1 | 93 | 49 | 0 |
1 | 91 | 77 | 1 |
1 | 51 | 37 | 0 |
2 | 31 | 32 | 0 |
2 | 53 | 58 | 1 |
2 | 21 | 73 | 0 |
Thanks!
Best,
David
Try this
data have;
input ID Var1 Var2 Dummy;
datalines;
1 20 41 0
1 17 20 0
1 97 68 0
1 93 49 0
1 91 77 1
1 51 37 0
1 88 55 0
1 42 14 0
2 62 40 0
2 93 46 0
2 77 70 0
2 31 32 0
2 53 58 1
2 21 73 0
2 55 94 0
2 73 29 0
2 34 27 0
;
data want;
merge have
have(firstobs = 2 keep = dummy id rename = (dummy = lead_d id = lead_id));
lag_d = lag(dummy);
lag_id = lag(id);
if sum(dummy, lag_d, lead_d) and id = lag_id and id = lead_id;
drop lead_: lag_:;
run;
what about those little ugly ones (first or last in the group):
3 62 40 1
3 93 46 0
3 77 70 0
4 62 40 0
4 93 46 0
4 77 70 1
? 😉 😉
BTW @DavidLie Could we have such situation? And what about multiple dummy in one group?
Bart
data have;
input D Var1 Var2 Dummy;
cards;
1 20 41 0
1 17 20 0
1 97 68 0
1 93 49 0
1 91 77 1
1 51 37 0
1 88 55 0
1 42 14 0
2 62 40 0
2 93 46 0
2 77 70 0
2 31 32 0
2 53 58 1
2 21 73 0
2 55 94 0
2 73 29 0
2 34 27 0
;
run;
proc print;
run;
data want;
do _N_ = 1 by 1 until(last.d);
set have curobs=curobs;
by d;
if Dummy then CO=curobs;
end;
do _N_ = 1 to _N_;
set have curobs=curobs2;
if CO-1 <= curobs2 <= CO+1 then output;
end;
run;
proc print;
run;
You should be able to use the FIRST./LAST. flags to do this.
data want;
set have ;
by id dummy notsorted;
if dummy then output;
else if (first.dummy and not first.id) then output;
else if (last.dummy and not last.id) then output;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.