BookmarkSubscribeRSS Feed
rams
Calcite | Level 5

Hi,

I have this querry...

i have a dataset like this and i want to flag it where i have consecutive values.

data bob;

input subjid lbtestcd$ lbdy flag1$;

datalines;

101 alt 1 e

101 alt 2 .

101 alt 3 e

101 alt 4 e

101 ast 1 e

101 ast 3 e

101 ast 4 .

101 ast 5 e

102 alt 1 e

102 alt 2 .

102 alt 3 e

102 alt 4 e

102 ast 1 e

102 ast 3 e

102 ast 4 .

102 ast 5 e

102 ast 6 e

;

run;


i want an output like

i want to flag consecutive 'e' values for consecutive visits by subject and test.


could you please help me out


thanks.

8 REPLIES 8
AncaTilea
Pyrite | Level 9

hmm, something like this may work:

proc sort data = bob;by subjid lbtestcd lbdy;run;

data bob_1;

    set bob;

    by subjid lbtestcd lbdy;

        prev_e = lag(flag1);

        prev_visit = lag(lbdy);

        if first.subjid then do;

                prev_e = .;prev_visit = .;

        end;

yes_prior = (prev_e = flag1);*flag that there is a prior 'e';

consecutive = ( lbdy - prev_visit = 1);*flag that there are consecutive visits;

what_you_want = (yes_prior = 1 & consecutive = 1);*the flag you are looking for, I think;

run;

Best of luck!

Anca.

Astounding
PROC Star

Perhaps a simpler way to go:

proc sort data=bob;

by subjid lbtestcd lbdy;

run;

data want;

   set bob;

   by subjid lbtestcd flag1 notsorted;

   if lbtestcd='e' and (first.flag1=0 or last.flag1=0) then new_flag='*';

run;

Depending on what you need, you could remove one condition:

if (first.flag1=0 or last.flag1=0) then new_flag='*';

Good luck.

joehinson
Calcite | Level 5

Hello Rams,

Is the outcome below ("RESULTS") what you desire? I'm not sure I have understood the question.

data bob;

input subjid lbtestcd$ lbdy flag1$;

datalines;

101 alt 1 e

101 alt 2 .

101 alt 3 e

101 alt 4 e

101 ast 1 e

101 ast 3 e

101 ast 4 .

101 ast 5 e

102 alt 1 e

102 alt 2 .

102 alt 3 e

102 alt 4 e

102 ast 1 e

102 ast 3 e

102 ast 4 .

102 ast 5 e

102 ast 6 e

;

run;

data want(keep=subjid lbtestcd lbdy flag1 flag2);

if(1=2) then set bob;

           declare hash f(ordered:"a",multidata:"y");

rc=f.defineKey("subjid", "lbtestcd", "flag2");

rc=f.defineData("subjid", "lbtestcd", "lbdy","flag1","flag2");

rc=f.defineDone();

length flag2 $1;

flag2="";

do until(done);

set bob end=done;

rcf=f.check();

if (missing(flag1)+rcf)=0 then do;

flag2="Y";

f.add();

end;

else if rcf ne 0 then f.add();

else if missing(flag1) then f.remove();

output;

call missing(of _all_);

end;

stop;

run;

RESULTS:

 

subjid

lbtestcd

lbdy

flag1

flag2

1

101

alt

1

e

 

2

101

alt

2

 

 

3

101

alt

3

e

 

4

101

alt

4

e

Y

5

101

ast

1

e

 

6

101

ast

3

e

Y

7

101

ast

4

 

 

8

101

ast

5

e

 

9

102

alt

1

e

 

10

102

alt

2

 

 

11

102

alt

3

e

 

t75wez1
Quartz | Level 8

Hello All,

Building on this solution, I wonder how to flag the subject ID where to have four consecutive values='e' other than at least two.

Thanks.

Ethan

PaigeMiller
Diamond | Level 26
data want;
    set have;
    prev_subjid=lag(subjid)
    prev2_subjid=lag2(subjid);
    prev3_subjid=lag3(subjid);
    prev_flag1=lag(flag1);
    prev2_flag1=lag2(flag1);
    prev3_flag1=lag3(flag1);
    if subjid=prev_subjid and subjid=prev2_subjid and subjid=prev3_subjid
        and flag1='e' and prev_flag1='e' and prev2_flag1='e' and prev3_flag1='e'
        then flag2='Y';
run;
--
Paige Miller
hashman
Ammonite | Level 13

@rams:

 

  • You're making the readers of your post guess what kind of output you need.
  • The verbal description you've offered is very ambiguous and allows for a number of interpretations. 
  • Please show the exact output you want given the input you have shown.  
  • If you did, your problem would be solved sooner, and those volunteering their precious time to help you out wouldn't have to waste it trying to divine what output you may have in mind.

 

Kind regards

Paul D. 

t75wez1
Quartz | Level 8

My desired output data set with new variable named "newflag"  shown below.

 

Thanks.

 

subjid lbtestcd$ lbdy flag1 newflag

101 alt 1 e 0

101 alt 2 . 0

101 alt 3 e 4

101 alt 4 e 4

101 ast 1 e 4

101 ast 3 e 4

101 ast 4 . 0

101 ast 5 e 2

102 alt 1 e 2

102 alt 2 .

102 alt 3 e 4

102 alt 4 e 4

102 ast 1 e 4

102 ast 3 e 4

102 ast 4 .

102 ast 5 e 2

102 ast 6 e 2

hashman
Ammonite | Level 13

@t75wez1 :

This suggested output totally defies your verbal description in the original post. To wit, it's not "by subject and test" but merely by FLAG1. 

Anyway, this should work:

data have ;                                     
  input subjid lbtestcd $ lbdy flag1 $ ;        
  cards ;                                       
101 alt 1 e                                     
101 alt 2 .                                     
101 alt 3 e                                     
101 alt 4 e                                     
101 ast 1 e                                     
101 ast 3 e                                     
101 ast 4 .                                     
101 ast 5 e                                     
102 alt 1 e                                     
102 alt 2 .                                     
102 alt 3 e                                     
102 alt 4 e                                     
102 ast 1 e                                     
102 ast 3 e                                     
102 ast 4 .                                     
102 ast 5 e                                     
102 ast 6 e                                     
run ;                                           
                            
data want ;                                     
  do _n_ = 1 by 1 until (last.flag1) ;          
    set have ;                                  
    by flag1 notsorted ;                        
    newflag = sum (newflag, 1) ;                
  end ;                                         
  if cmiss (flag1) or _n_ < 2 then newflag = 0 ;
  do _n_ = 1 to _n_ ;                           
    set have ;                                  
    output ;                                    
  end ;                                         
run ;                                           

Apropos, I see no I/O logic vis-a-vis setting the NEWFLAG to 0 or blank (do you mean a missing value by that? kind of odd since the variable is numeric). Hence, whenever it's "non-consecutive", the code above sets it to 0.

 

Kind regards

Paul D.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 5986 views
  • 0 likes
  • 7 in conversation