Hi,
I have this querry...
i have a dataset like this and i want to flag it where i have consecutive values.
input subjid lbtestcd$ lbdy flag1$;
datalines;
101 alt 1 e
101 alt 2 .
101 alt 3 e
101 alt 4 e
101 ast 1 e
101 ast 3 e
101 ast 4 .
101 ast 5 e
102 alt 1 e
102 alt 2 .
102 alt 3 e
102 alt 4 e
102 ast 1 e
102 ast 3 e
102 ast 4 .
102 ast 5 e
102 ast 6 e
;
run;
i want an output like
i want to flag consecutive 'e' values for consecutive visits by subject and test.
could you please help me out
thanks.
hmm, something like this may work:
proc sort data = bob;by subjid lbtestcd lbdy;run;
data bob_1;
set bob;
by subjid lbtestcd lbdy;
prev_e = lag(flag1);
prev_visit = lag(lbdy);
if first.subjid then do;
prev_e = .;prev_visit = .;
end;
yes_prior = (prev_e = flag1);*flag that there is a prior 'e';
consecutive = ( lbdy - prev_visit = 1);*flag that there are consecutive visits;
what_you_want = (yes_prior = 1 & consecutive = 1);*the flag you are looking for, I think;
run;
Best of luck!
Anca.
Perhaps a simpler way to go:
proc sort data=bob;
by subjid lbtestcd lbdy;
run;
data want;
set bob;
by subjid lbtestcd flag1 notsorted;
if lbtestcd='e' and (first.flag1=0 or last.flag1=0) then new_flag='*';
run;
Depending on what you need, you could remove one condition:
if (first.flag1=0 or last.flag1=0) then new_flag='*';
Good luck.
Hello Rams,
Is the outcome below ("RESULTS") what you desire? I'm not sure I have understood the question.
data bob;
input subjid lbtestcd$ lbdy flag1$;
datalines;
101 alt 1 e
101 alt 2 .
101 alt 3 e
101 alt 4 e
101 ast 1 e
101 ast 3 e
101 ast 4 .
101 ast 5 e
102 alt 1 e
102 alt 2 .
102 alt 3 e
102 alt 4 e
102 ast 1 e
102 ast 3 e
102 ast 4 .
102 ast 5 e
102 ast 6 e
;
run;
data want(keep=subjid lbtestcd lbdy flag1 flag2);
if(1=2) then set bob;
declare hash f(ordered:"a",multidata:"y");
rc=f.defineKey("subjid", "lbtestcd", "flag2");
rc=f.defineData("subjid", "lbtestcd", "lbdy","flag1","flag2");
rc=f.defineDone();
length flag2 $1;
flag2="";
do until(done);
set bob end=done;
rcf=f.check();
if (missing(flag1)+rcf)=0 then do;
flag2="Y";
f.add();
end;
else if rcf ne 0 then f.add();
else if missing(flag1) then f.remove();
output;
call missing(of _all_);
end;
stop;
run;
RESULTS:
| subjid | lbtestcd | lbdy | flag1 | flag2 |
1 | 101 | alt | 1 | e |
|
2 | 101 | alt | 2 |
|
|
3 | 101 | alt | 3 | e |
|
4 | 101 | alt | 4 | e | Y |
5 | 101 | ast | 1 | e |
|
6 | 101 | ast | 3 | e | Y |
7 | 101 | ast | 4 |
|
|
8 | 101 | ast | 5 | e |
|
9 | 102 | alt | 1 | e |
|
10 | 102 | alt | 2 |
|
|
11 | 102 | alt | 3 | e |
|
Hello All,
Building on this solution, I wonder how to flag the subject ID where to have four consecutive values='e' other than at least two.
Thanks.
Ethan
data want;
set have;
prev_subjid=lag(subjid)
prev2_subjid=lag2(subjid);
prev3_subjid=lag3(subjid);
prev_flag1=lag(flag1);
prev2_flag1=lag2(flag1);
prev3_flag1=lag3(flag1);
if subjid=prev_subjid and subjid=prev2_subjid and subjid=prev3_subjid
and flag1='e' and prev_flag1='e' and prev2_flag1='e' and prev3_flag1='e'
then flag2='Y';
run;
Kind regards
Paul D.
My desired output data set with new variable named "newflag" shown below.
Thanks.
subjid lbtestcd$ lbdy flag1 newflag
101 alt 1 e 0
101 alt 2 . 0
101 alt 3 e 4
101 alt 4 e 4
101 ast 1 e 4
101 ast 3 e 4
101 ast 4 . 0
101 ast 5 e 2
102 alt 1 e 2
102 alt 2 .
102 alt 3 e 4
102 alt 4 e 4
102 ast 1 e 4
102 ast 3 e 4
102 ast 4 .
102 ast 5 e 2
102 ast 6 e 2
@t75wez1 :
This suggested output totally defies your verbal description in the original post. To wit, it's not "by subject and test" but merely by FLAG1.
Anyway, this should work:
data have ;
input subjid lbtestcd $ lbdy flag1 $ ;
cards ;
101 alt 1 e
101 alt 2 .
101 alt 3 e
101 alt 4 e
101 ast 1 e
101 ast 3 e
101 ast 4 .
101 ast 5 e
102 alt 1 e
102 alt 2 .
102 alt 3 e
102 alt 4 e
102 ast 1 e
102 ast 3 e
102 ast 4 .
102 ast 5 e
102 ast 6 e
run ;
data want ;
do _n_ = 1 by 1 until (last.flag1) ;
set have ;
by flag1 notsorted ;
newflag = sum (newflag, 1) ;
end ;
if cmiss (flag1) or _n_ < 2 then newflag = 0 ;
do _n_ = 1 to _n_ ;
set have ;
output ;
end ;
run ;
Apropos, I see no I/O logic vis-a-vis setting the NEWFLAG to 0 or blank (do you mean a missing value by that? kind of odd since the variable is numeric). Hence, whenever it's "non-consecutive", the code above sets it to 0.
Kind regards
Paul D.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.