BookmarkSubscribeRSS Feed
rams
Calcite | Level 5

Hi,

I have this querry...

i have a dataset like this and i want to flag it where i have consecutive values.

data bob;

input subjid lbtestcd$ lbdy flag1$;

datalines;

101 alt 1 e

101 alt 2 .

101 alt 3 e

101 alt 4 e

101 ast 1 e

101 ast 3 e

101 ast 4 .

101 ast 5 e

102 alt 1 e

102 alt 2 .

102 alt 3 e

102 alt 4 e

102 ast 1 e

102 ast 3 e

102 ast 4 .

102 ast 5 e

102 ast 6 e

;

run;


i want an output like

i want to flag consecutive 'e' values for consecutive visits by subject and test.


could you please help me out


thanks.

8 REPLIES 8
AncaTilea
Pyrite | Level 9

hmm, something like this may work:

proc sort data = bob;by subjid lbtestcd lbdy;run;

data bob_1;

    set bob;

    by subjid lbtestcd lbdy;

        prev_e = lag(flag1);

        prev_visit = lag(lbdy);

        if first.subjid then do;

                prev_e = .;prev_visit = .;

        end;

yes_prior = (prev_e = flag1);*flag that there is a prior 'e';

consecutive = ( lbdy - prev_visit = 1);*flag that there are consecutive visits;

what_you_want = (yes_prior = 1 & consecutive = 1);*the flag you are looking for, I think;

run;

Best of luck!

Anca.

Astounding
PROC Star

Perhaps a simpler way to go:

proc sort data=bob;

by subjid lbtestcd lbdy;

run;

data want;

   set bob;

   by subjid lbtestcd flag1 notsorted;

   if lbtestcd='e' and (first.flag1=0 or last.flag1=0) then new_flag='*';

run;

Depending on what you need, you could remove one condition:

if (first.flag1=0 or last.flag1=0) then new_flag='*';

Good luck.

joehinson
Calcite | Level 5

Hello Rams,

Is the outcome below ("RESULTS") what you desire? I'm not sure I have understood the question.

data bob;

input subjid lbtestcd$ lbdy flag1$;

datalines;

101 alt 1 e

101 alt 2 .

101 alt 3 e

101 alt 4 e

101 ast 1 e

101 ast 3 e

101 ast 4 .

101 ast 5 e

102 alt 1 e

102 alt 2 .

102 alt 3 e

102 alt 4 e

102 ast 1 e

102 ast 3 e

102 ast 4 .

102 ast 5 e

102 ast 6 e

;

run;

data want(keep=subjid lbtestcd lbdy flag1 flag2);

if(1=2) then set bob;

           declare hash f(ordered:"a",multidata:"y");

rc=f.defineKey("subjid", "lbtestcd", "flag2");

rc=f.defineData("subjid", "lbtestcd", "lbdy","flag1","flag2");

rc=f.defineDone();

length flag2 $1;

flag2="";

do until(done);

set bob end=done;

rcf=f.check();

if (missing(flag1)+rcf)=0 then do;

flag2="Y";

f.add();

end;

else if rcf ne 0 then f.add();

else if missing(flag1) then f.remove();

output;

call missing(of _all_);

end;

stop;

run;

RESULTS:

 

subjid

lbtestcd

lbdy

flag1

flag2

1

101

alt

1

e

 

2

101

alt

2

 

 

3

101

alt

3

e

 

4

101

alt

4

e

Y

5

101

ast

1

e

 

6

101

ast

3

e

Y

7

101

ast

4

 

 

8

101

ast

5

e

 

9

102

alt

1

e

 

10

102

alt

2

 

 

11

102

alt

3

e

 

t75wez1
Pyrite | Level 9

Hello All,

Building on this solution, I wonder how to flag the subject ID where to have four consecutive values='e' other than at least two.

Thanks.

Ethan

PaigeMiller
Diamond | Level 26
data want;
    set have;
    prev_subjid=lag(subjid)
    prev2_subjid=lag2(subjid);
    prev3_subjid=lag3(subjid);
    prev_flag1=lag(flag1);
    prev2_flag1=lag2(flag1);
    prev3_flag1=lag3(flag1);
    if subjid=prev_subjid and subjid=prev2_subjid and subjid=prev3_subjid
        and flag1='e' and prev_flag1='e' and prev2_flag1='e' and prev3_flag1='e'
        then flag2='Y';
run;
--
Paige Miller
hashman
Ammonite | Level 13

@rams:

 

  • You're making the readers of your post guess what kind of output you need.
  • The verbal description you've offered is very ambiguous and allows for a number of interpretations. 
  • Please show the exact output you want given the input you have shown.  
  • If you did, your problem would be solved sooner, and those volunteering their precious time to help you out wouldn't have to waste it trying to divine what output you may have in mind.

 

Kind regards

Paul D. 

t75wez1
Pyrite | Level 9

My desired output data set with new variable named "newflag"  shown below.

 

Thanks.

 

subjid lbtestcd$ lbdy flag1 newflag

101 alt 1 e 0

101 alt 2 . 0

101 alt 3 e 4

101 alt 4 e 4

101 ast 1 e 4

101 ast 3 e 4

101 ast 4 . 0

101 ast 5 e 2

102 alt 1 e 2

102 alt 2 .

102 alt 3 e 4

102 alt 4 e 4

102 ast 1 e 4

102 ast 3 e 4

102 ast 4 .

102 ast 5 e 2

102 ast 6 e 2

hashman
Ammonite | Level 13

@t75wez1 :

This suggested output totally defies your verbal description in the original post. To wit, it's not "by subject and test" but merely by FLAG1. 

Anyway, this should work:

data have ;                                     
  input subjid lbtestcd $ lbdy flag1 $ ;        
  cards ;                                       
101 alt 1 e                                     
101 alt 2 .                                     
101 alt 3 e                                     
101 alt 4 e                                     
101 ast 1 e                                     
101 ast 3 e                                     
101 ast 4 .                                     
101 ast 5 e                                     
102 alt 1 e                                     
102 alt 2 .                                     
102 alt 3 e                                     
102 alt 4 e                                     
102 ast 1 e                                     
102 ast 3 e                                     
102 ast 4 .                                     
102 ast 5 e                                     
102 ast 6 e                                     
run ;                                           
                            
data want ;                                     
  do _n_ = 1 by 1 until (last.flag1) ;          
    set have ;                                  
    by flag1 notsorted ;                        
    newflag = sum (newflag, 1) ;                
  end ;                                         
  if cmiss (flag1) or _n_ < 2 then newflag = 0 ;
  do _n_ = 1 to _n_ ;                           
    set have ;                                  
    output ;                                    
  end ;                                         
run ;                                           

Apropos, I see no I/O logic vis-a-vis setting the NEWFLAG to 0 or blank (do you mean a missing value by that? kind of odd since the variable is numeric). Hence, whenever it's "non-consecutive", the code above sets it to 0.

 

Kind regards

Paul D.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 6626 views
  • 0 likes
  • 7 in conversation