Filling a variable across multiple rows conditionally

Walternate · Posted 10-26-2015 03:24 PM

Hi,

I have a dataset which is supposed to be at the person-level but in reality has some duplicate rows. It has an ID variable, 2 date variables, and several other categorical variables.

ID Date1 Date2 Categ1.......Categ10

1 1/3/2014 4/9/2013 abc xyz

1 2/15/2015 def jkl

2 10/9/2013 abc def

2 11/4/2014 jkl xyz

3 2/28/2012 abc xyz

3 3/15/2013 9/7/2014 def jkl

What I want to do when there are dupes is to take the row with the latest Date1. I know how to do that by itself:

data want;

set have;

by ID Date1;

if last.ID then output;

run;

The problem is that in cases where a person has a value for Date2 in the row with the earlier Date1, that value of Date2 should be filled into the later Date1 row (ie, the row that will be output into the new dataset without duplicates). So for person 1 for example, their second row would be output and would have Date1=2/15/2015 and Date2=4/9/2013. All the other variables should maintain their values from the original Date1=2/15/2015 row). I need to be able to do this without overwriting Date2 in case the later/output row for a person already has a value for Date2--ie, person 3 should have Date1=3/15/2013 and Date2=9/7/2014 in the output dataset, with all the other variables corresponding to the Date1=3/15/2013 row.

Any help is much appreciated.

Reeza · Posted 10-26-2015 03:29 PM

Use retain and first with an IF condition.

Some variation of the following should work.

retain date_keep;
IF first.id then date_keep=date2;
if last.id and missing(date2) then date2=date_keep;
if last.id then output;

PGStats · Posted 10-26-2015 09:21 PM

I would do:

data want;
do until(last.id);
    set have(rename=date2=d2); by id;
    if not missing(d2) then date2 = d2;
    end;
format date2 mmddyy10.;
drop d2;
run;

PG

Filling a variable across multiple rows conditionally

Re: Filling a variable across multiple rows conditionally

Re: Filling a variable across multiple rows conditionally

Filling a variable across multiple rows conditionally

Re: Filling a variable across multiple rows conditionally

Re: Filling a variable across multiple rows conditionally

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away