Calcite | Level 5

## Filling in missing variable based on other rows in table.

Good day,

I currently have a dataset that has transaction information for dispute cases.  Each one of these transactions can have multiple actions take place.  Only a certain type of action will show the transaction id.  I am looking for a way to copy that transaction ID to each action within a case.  Using the sample data below I need it to do the following:

Look for each trns_nbr, then for every matching Acnt_nbr, dispute_nbr, and act_seq that is the same it should fill in the trns_nbr.

so below should fill in 9876 in the 2nd and 3rd row and then fill in 5678 on the 5th and 6th row.  The table will have a couple thousands lines to look through.

acnt_nbr      dispute_nbr      act_seq      act_name        trns_nbr

12345               abcd                  1              XY                  9876

12345               abcd                  1              YZ

12345               abcd                  1              CV

12345               abcd                  2              XY                  5678

12345              abcd                   2              YZ

12345              abcd                   2              CV

1 ACCEPTED SOLUTION

Accepted Solutions

## Re: Filling in missing variable based on other rows in table.

Try this:

``````data have;
input acnt_nbr dispute_nbr \$ act_seq  act_name \$ trns_nbr;
cards;
12345 abcd 1 XY 9876
12345 abcd 1 YZ .
12345 abcd 1 CV .
12345 abcd 2 XY 5678
12345 abcd 2 YZ .
12345 abcd 2 CV .
;

data want;
do until(last.act_seq);
set have;
by acnt_nbr dispute_nbr act_seq notsorted;
if ~missing(trns_nbr) then tn=trns_nbr;
end;
do until(last.act_seq);
set have;
by acnt_nbr dispute_nbr act_seq notsorted;
if missing(trns_nbr) then trns_nbr=tn;
output;
end;
drop tn;
run;``````

Edit: This assumes that your input dataset is at least grouped by the three BY variables, i.e., groups of identical combinations of their values form contiguous blocks of observations.

## Re: Filling in missing variable based on other rows in table.

Try this:

``````data have;
input acnt_nbr dispute_nbr \$ act_seq  act_name \$ trns_nbr;
cards;
12345 abcd 1 XY 9876
12345 abcd 1 YZ .
12345 abcd 1 CV .
12345 abcd 2 XY 5678
12345 abcd 2 YZ .
12345 abcd 2 CV .
;

data want;
do until(last.act_seq);
set have;
by acnt_nbr dispute_nbr act_seq notsorted;
if ~missing(trns_nbr) then tn=trns_nbr;
end;
do until(last.act_seq);
set have;
by acnt_nbr dispute_nbr act_seq notsorted;
if missing(trns_nbr) then trns_nbr=tn;
output;
end;
drop tn;
run;``````

Edit: This assumes that your input dataset is at least grouped by the three BY variables, i.e., groups of identical combinations of their values form contiguous blocks of observations.

Discussion stats