Solved: Re: Loop, Conditional and Incrementation

Althaea · Posted 01-29-2019 11:48 AM

Hi !

I've read a lot of subjects about loops and incrementation but I can't figure out how to do what seems like a simple task.

Here's what my data looks like :

data WORK.HAVE;
  infile datalines;
  input Name:$10. ID:2. YN:$8. value:$5.;
datalines;
 Antoine 01 true (1)
 Antoine 01 false (2)
 Thomas 02 true (1)
 Thomas 02 true (2)
 Thomas 02 false (3)
 Adeline 03 false (1)
 Michelle 04 true (1)
 Michelle 04 false (2)
 Michelle 04 false (3)
 Michelle 04 false (4)
 Clemence 05 false (1)
;

For each individual, if YN was "true", a new row would be created. A "false" should have stoped the incrementation.

But following this logic, in this data Michelle has 2 rows that shouldn't exist.

My goal is to detect and delete those rows.

Expected result :

data WORK.WANT;
  infile datalines;
  input Name:$10. ID:2. YN:$8. value:$5.;
datalines;
 Antoine 01 true (1)
 Antoine 01 false (2)
 Thomas 02 true (1)
 Thomas 02 true (2)
 Thomas 02 false (3)
 Adeline 03 false (1)
 Michelle 04 true (1)
 Michelle 04 false (2)
 Clemence 05 false (1)
;

It seems easy but I'm struggling and feel like i'm burning learning steps.

My logic was something like that :

IF actualrow.ID = previousrow.ID AND actualrow.YN = previousrow.YN = "false" THEN drop actualrow

Maybe I can use proc sql, but I don't know how to include sql within a data step yet..

I'm also having issues finding documents talking about this from basics.

All help appreciated.

Thanks for your time !

ballardw · Posted 01-29-2019 12:18 PM

This works for your example data:

data work.want;
   set work.have;
   by notsorted name notsorted yn;
   retain flag;
   if first.name then flag=(yn='false');
   else do;
      if flag and (yn='false') then delete;
      else flag = (yn='false');
   end;
run;

If name is ever different for the same ID you could use ID in place of Name in the code.

SAS Proc SQL does not have a "previous row" feature that works in the way you are thinking.

BTW if you code this with true/false or yes/no as numeric 1/0 values you don't need to use yn='false' for comparisons. You could use "if yn" to test for true or "if not(yn)" for false

View solution in original post

novinosrin · Posted 01-29-2019 12:16 PM

data WORK.HAVE;
  infile datalines;
  input Name:$10. ID:2. YN:$8. value:$5.;
datalines;
 Antoine 01 true (1)
 Antoine 01 false (2)
 Thomas 02 true (1)
 Thomas 02 true (2)
 Thomas 02 false (3)
 Adeline 03 false (1)
 Michelle 04 true (1)
 Michelle 04 false (2)
 Michelle 04 false (3)
 Michelle 04 false (4)
 Clemence 05 false (1)
;




data want;
set have;
by id yn notsorted;
retain f;
if first.id then f=.;
f=ifn(not first.id and lag(yn)='false',1,.);
if 	not f;
drop f ;
run;

/*or*/

data want;
set have;
by id yn notsorted;
retain f;
if first.id then f=.;
f=not first.id and lag(yn)='false';
if 	not f;
drop f ;
run;

ballardw · Posted 01-29-2019 12:18 PM

This works for your example data:

data work.want;
   set work.have;
   by notsorted name notsorted yn;
   retain flag;
   if first.name then flag=(yn='false');
   else do;
      if flag and (yn='false') then delete;
      else flag = (yn='false');
   end;
run;

If name is ever different for the same ID you could use ID in place of Name in the code.

SAS Proc SQL does not have a "previous row" feature that works in the way you are thinking.

BTW if you code this with true/false or yes/no as numeric 1/0 values you don't need to use yn='false' for comparisons. You could use "if yn" to test for true or "if not(yn)" for false

Althaea · Posted 01-30-2019 05:13 AM

Hi Novinisrin and Ballardw !

Thanks for your time and help.

I've tested your scripts and they work fine in sandbox but not on live for a reason I can't find.

The closest result was with ballardw script, in witch the result was working fine but only on individuals that started with a "true" YN variable. It would drop the individuals starting with a "false" statement, except the very first individual SUBJID="3".

The script I executed and the output :

1998  data work.want;
1999     set work.have;
2000     by notsorted SUBJID notsorted CMYN;
2001     retain flag;
2002     if first.name then flag=(CMYN='false');
2003     else do;
2004        if flag and (CMYN='false') then delete;
2005        else flag = (CMYN='false');
2006     end;
2007  run;

NOTE: Variable first.name non initialisée.
NOTE: There were 28 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 5 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

It's weird because the script is working fine on sandbox data, whatever combinaison I try.

Thanks again for your time !

EDIT : I just realised first. function has "name" as parameter.. that was the problem.

Your scripts are working fine !

I'm still learning haha.

Thanks for your help !

Loop, Conditional and Incrementation

Re: Loop, Conditional and Incrementation

Re: Loop, Conditional and Incrementation

Re: Loop, Conditional and Incrementation

Re: Loop, Conditional and Incrementation

Ready to join fellow brilliant minds for the SAS Hackathon?

Click image to register for webinar

Classroom Training Available!