Hi all,
I hope you are doing well. I have a question regarding how to extract observations before and after a dummy.
This is what I have.
ID | Var1 | Var2 | Dummy |
1 | 20 | 41 | 0 |
1 | 17 | 20 | 0 |
1 | 97 | 68 | 0 |
1 | 93 | 49 | 0 |
1 | 91 | 77 | 1 |
1 | 51 | 37 | 0 |
1 | 88 | 55 | 0 |
1 | 42 | 14 | 0 |
2 | 62 | 40 | 0 |
2 | 93 | 46 | 0 |
2 | 77 | 70 | 0 |
2 | 31 | 32 | 0 |
2 | 53 | 58 | 1 |
2 | 21 | 73 | 0 |
2 | 55 | 94 | 0 |
2 | 73 | 29 | 0 |
2 | 34 | 27 | 0 |
What I need is to extract observation before, during, and after dummy variable indicating 1 (i.e., the text in red). So, the "want" dataset will be as below:
A | B | C | Dummy |
1 | 93 | 49 | 0 |
1 | 91 | 77 | 1 |
1 | 51 | 37 | 0 |
2 | 31 | 32 | 0 |
2 | 53 | 58 | 1 |
2 | 21 | 73 | 0 |
Thanks!
Best,
David
Try this
data have;
input ID Var1 Var2 Dummy;
datalines;
1 20 41 0
1 17 20 0
1 97 68 0
1 93 49 0
1 91 77 1
1 51 37 0
1 88 55 0
1 42 14 0
2 62 40 0
2 93 46 0
2 77 70 0
2 31 32 0
2 53 58 1
2 21 73 0
2 55 94 0
2 73 29 0
2 34 27 0
;
data want;
merge have
have(firstobs = 2 keep = dummy id rename = (dummy = lead_d id = lead_id));
lag_d = lag(dummy);
lag_id = lag(id);
if sum(dummy, lag_d, lead_d) and id = lag_id and id = lead_id;
drop lead_: lag_:;
run;
what about those little ugly ones (first or last in the group):
3 62 40 1
3 93 46 0
3 77 70 0
4 62 40 0
4 93 46 0
4 77 70 1
? 😉 😉
BTW @DavidLie Could we have such situation? And what about multiple dummy in one group?
Bart
data have;
input D Var1 Var2 Dummy;
cards;
1 20 41 0
1 17 20 0
1 97 68 0
1 93 49 0
1 91 77 1
1 51 37 0
1 88 55 0
1 42 14 0
2 62 40 0
2 93 46 0
2 77 70 0
2 31 32 0
2 53 58 1
2 21 73 0
2 55 94 0
2 73 29 0
2 34 27 0
;
run;
proc print;
run;
data want;
do _N_ = 1 by 1 until(last.d);
set have curobs=curobs;
by d;
if Dummy then CO=curobs;
end;
do _N_ = 1 to _N_;
set have curobs=curobs2;
if CO-1 <= curobs2 <= CO+1 then output;
end;
run;
proc print;
run;
You should be able to use the FIRST./LAST. flags to do this.
data want;
set have ;
by id dummy notsorted;
if dummy then output;
else if (first.dummy and not first.id) then output;
else if (last.dummy and not last.id) then output;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.