BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Fluorite | Level 6

## Update information from previous row

I have a dataset with many patients first year visit information. Then I want to get some information from the first year data into the second and third year.  For example, for ID=1, then his age will be 51, 52 in the 2nd and 3rd year, while gender is still male.

``````data have;
input id age visit gender \$;
datalines;
1 50 1 male
1 .  2  .
1 .  3  .
2 60 1 female
2 .  2  .
2 .  3  .
;
run;``````

what I want

 id age visit gender 1 50 1 male 1 51 2 male 1 52 3 male 2 60 1 female 2 61 2 female

Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
Tourmaline | Level 20

## Re: Update information from previous row

Try this

``````
data have;
input id age visit gender \$;
datalines;
1 50 1 male
1 .  2  .
1 .  3  .
2 60 1 female
2 .  2  .
2 .  3  .
;
run;

data want(drop = a g);
set have;
by id;
if first.id then do;
g = gender;
a = age;
end;
else a + 1;

gender = g;
age    = a;

retain a g;
run;``````
6 REPLIES 6
Tourmaline | Level 20

## Re: Update information from previous row

Try this

``````
data have;
input id age visit gender \$;
datalines;
1 50 1 male
1 .  2  .
1 .  3  .
2 60 1 female
2 .  2  .
2 .  3  .
;
run;

data want(drop = a g);
set have;
by id;
if first.id then do;
g = gender;
a = age;
end;
else a + 1;

gender = g;
age    = a;

retain a g;
run;``````
Fluorite | Level 6

## Re: Update information from previous row

It works! Thank you.
Super User

## Re: Update information from previous row

You can actually trick the UPDATE statement into doing this for you.  The purpose of the UPDATE statement is to allow you to apply a transaction dataset to an existing dataset.  When the variable is the transaction dataset is missing then the current value is not replaced.

You have to have a BY group and normally the BY variables should uniquely identify the observations in the source dataset.  But you can have multiple transactions per BY group.  Your data already has a BY variable, ID, but it is not unique.  But if you just treat ALL of the observations as the transactions your can avoid that problem. So use the OBS=0 dataset option to start with an empty source dataset.  To output more than one observation per BY group just add an explicit OUTPUT statement.

``````data want;
update have(obs=0) have;
by id;
output;
run;``````

PROC Star

## Re: Update information from previous row

For future reference, consider a strategic combination of DROP=  and POINT=:

``````data have;
input id age visit gender \$;
datalines;
1 50 1 male
1 .  2  .
1 .  3  .
2 60 1 female
2 .  2  .
2 .  3  .
run;
data want;
set have (drop=age gender);
by id;
if first.id then set have point=_n_;
run;``````

The benefit of this approach is that it avoids the need for an explicit RETAIN statement.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Super User

## Re: Update information from previous row

That trick will only work when the non-missing values only appear on the first observation in the BY group.

Fluorite | Level 6

## Re: Update information from previous row

Your code would not give the updated age for 2nd and 3rd year:

id  visit   age  gender
 1 1 50 male 1 2 50 male 1 3 50 male 2 1 60 female 2 2 60 female 2 3 60 female

We need one more step to correct it:

``````data want;
set want;
if visit=2 then age=age+1;
if visit=3 then age=age+2;
run;``````
Discussion stats
• 6 replies
• 787 views
• 4 likes
• 4 in conversation