## How to overwrite subsequent values in a longitudinal dataset?

Solved
Super Contributor
Posts: 338

# How to overwrite subsequent values in a longitudinal dataset?

Hi SAS Forum,

I have three accounts in this longitudinal dataset. This shows how accounts have aged over 4 months (note 0 is the first month).

Variables TRANSIT, ACCTNUM and age together form an unique record.

data have;

input TRANSIT ACCTNUM age bal stat \$ 14 -20;

cards;

10 111 0 220 1-30

10 222 0 450 31-60

20 111 0 300 61+

10 111 1 100 CURRENT

10 222 1 5   61+

20 111 1 75  NPNA

10 111 2 100 CURRENT

10 222 2 300 CURRENT

20 111 2 100 DWO

10 111 3 110 1-30

10 222 3 320 61+

20 111 3 105 DWO

;

RUN;

In transit 10, acctnum 111 started its life in age 0 with “1-30” stat (stat means status).

This account then cycled into “CURRENT” stat when age = 1, and remained in the same stat when age=2 also but deteriorated into “1-30” stat when age=3.

Question;

I need to over write all the subsequent values in variable “stat” with “CURRENT”, after an account once hit “CURRENT”.

So, the final data set I need is this:

data want;

input TRANSIT ACCTNUM age bal stat \$ 14 -20;

cards;

10 111 0 220 1-30

10 222 0 450 31-60

20 111 0 300 61+

10 111 1 100 CURRENT

10 222 1 5   61+

20 111 1 75  NPNA

10 111 2 100 CURRENT

10 222 2 300 CURRENT

20 111 2 100 DWO

10 111 3 110 CURRENT       /* in "have" dataset, this was 1-30”*/

10 222 3 320 CURRENT       /*in "have" dataset, this was 61+*/

20 111 3 105 DWO

;

RUN;

I HAVE tried THIS:

/*sorting*/

proc sort data=have;

by transit acctnum age;

run;

/*Interlace stacking – Haikuo is acknowledged*/

data want;

set have (in=up) have;

by transit acctnum age;

if first.acctnum then call missing(flag);

if up and STAT='CURRENT' then flag+1;

if not up and flag>=1 then output;

drop flag;

run;

But I do not know how to tweak the above code to get the intended results (or create an alternative code).

Could anyone help me?

Thanks

Mirisage

Accepted Solutions
Solution
‎05-02-2013 01:06 PM
Posts: 3,167

## Re: How to overwrite subsequent values in a longitudinal dataset?

The easiest way to do it is Hash() approach, no presort needed:

data have;

input TRANSIT ACCTNUM age bal stat \$ 14 -21;

cards;

10 111 0 220 1-30

10 222 0 450 31-60

20 111 0 300 61+

10 111 1 100 CURRENT

10 222 1 5 61+

20 111 1 75 NPNA

10 111 2 100 CURRENT

10 222 2 300 CURRENT

20 111 2 100 DWO

10 111 3 110 1-30

10 222 3 320 61+

20 111 3 105 DWO

;

data want;

if _n_=1 then do;

if 0 then set have;

dcl hash h();

h.definekey('transit','acctnum');

h.definedata('stat');

h.definedone();

end;

set have;

if stat='CURRENT' THEN RC=h.ref();

rc=h.find();

drop rc;

run;

Haikuo

All Replies
Solution
‎05-02-2013 01:06 PM
Posts: 3,167

## Re: How to overwrite subsequent values in a longitudinal dataset?

The easiest way to do it is Hash() approach, no presort needed:

data have;

input TRANSIT ACCTNUM age bal stat \$ 14 -21;

cards;

10 111 0 220 1-30

10 222 0 450 31-60

20 111 0 300 61+

10 111 1 100 CURRENT

10 222 1 5 61+

20 111 1 75 NPNA

10 111 2 100 CURRENT

10 222 2 300 CURRENT

20 111 2 100 DWO

10 111 3 110 1-30

10 222 3 320 61+

20 111 3 105 DWO

;

data want;

if _n_=1 then do;

if 0 then set have;

dcl hash h();

h.definekey('transit','acctnum');

h.definedata('stat');

h.definedone();

end;

set have;

if stat='CURRENT' THEN RC=h.ref();

rc=h.find();

drop rc;

run;

Haikuo

Super Contributor
Posts: 338

## Re: How to overwrite subsequent values in a longitudinal dataset?

Hi Haikuo,

Thank you very much for this code help which works well for my larger data set as well.

If you do not mind, pl let me to ask these 2 questions too but please consider them only if you have time.

1). I need to explain my co-worker how I have changed subsequent values into "CURRENT". Hash() approach is very much frighteneing me. Can we tweak your inteerlace stacking appraoch or any other merging, or proc sql method to generate the same resutls.

2). Although I said over writing the existing "stat" variable, I later found I still need the original "stat" variable. So, can we create a new variable, say "stat_new" instead of overwriting.

Mirisage

Posts: 3,167

## Re: How to overwrite subsequent values in a longitudinal dataset?

If not using Hash(), then I need to sort your data, and if you really want to maintain the original order, you just need to merge it back.

data have;

input TRANSIT ACCTNUM age bal stat \$ 14 -21;

cards;

10 111 0 220 1-30

10 222 0 450 31-60

20 111 0 300 61+

10 111 1 100 CURRENT

10 222 1 5 61+

20 111 1 75 NPNA

10 111 2 100 CURRENT

10 222 2 300 CURRENT

20 111 2 100 DWO

10 111 3 110 1-30

10 222 3 320 61+

20 111 3 105 DWO

;

proc sort data=have out=have;

by TRANSIT ACCTNUM age;

run;

data want;

set have;

by TRANSIT ACCTNUM age;

LENGTH _T \$ 8;

retain _t ;

if first.acctnum then _t='';

if stat='CURRENT' THEN _T=STAT;

NEW_STAT=COALESCEC(_T,STAT);

DROP _T;

RUN;

Haikuo

Super User
Posts: 10,788

## Re: How to overwrite subsequent values in a longitudinal dataset?

OR using Stack Skill:

```data have;
input TRANSIT ACCTNUM age bal stat : \$20.;
cards;
10 111 0 220 1-30
10 222 0 450 31-60
20 111 0 300 61+
10 111 1 100 CURRENT
10 222 1 5 61+
20 111 1 75 NPNA
10 111 2 100 CURRENT
10 222 2 300 CURRENT
20 111 2 100 DWO
10 111 3 110 1-30
10 222 3 320 61+
20 111 3 105 DWO
;
run;
data want;
set have;
array x{0:2} \$  _temporary_;
stat=coalescec(x{mod(_n_,3)},stat);
if stat='CURRENT' then x{mod(_n_,3)}=stat;
run;

```

Ksharp

Super Contributor
Posts: 338

## Re: How to overwrite subsequent values in a longitudinal dataset?

Hi Haikuo,

Thank you very much. Your code is great!

Hi Ksharp,

I applied your code. It runs smoothly but the results are not as intended. Can we tweak your code to generate the same resutls produced by Haikuo to satisfy our condition below.

"Once an account hits "CURRENT" under the "stat" variable, all the subsequent values for "stat" variable should be "CURRENT". And the "so created" new values for "stat" variable have to be included under a new variable called "New_stat".

This is the correct resutls from Haikuo

 TRANSIT ACCTNUM age bal stat NEW_STAT 10 111 0 220 1 - 30 1 - 30 10 111 1 100 CURRENT CURRENT 10 111 2 100 CURRENT CURRENT 10 111 3 110 1 - 30 CURRENT 10 222 0 450 31-60 31-60 10 222 1 5 61+ 61+ 10 222 2 300 CURRENT CURRENT 10 222 3 320 61+ CURRENT 20 111 0 300 61+ 61+ 20 111 1 75 NPNA NPNA 20 111 2 100 DWO DWO 20 111 3 105 DWO DWO

This is Ksharp's results

 TRANSIT ACCTNUM age bal stat 10 111 0 220 1 - 30 10 111 1 100 CURRENT 10 111 2 100 CURRENT 10 111 3 110 1 - 30 10 222 0 450 CURRENT 10 222 1 5 CURRENT 10 222 2 300 CURRENT 10 222 3 320 CURRENT 20 111 0 300 CURRENT 20 111 1 75 CURRENT 20 111 2 100 CURRENT 20 111 3 105 CURRENT

Thanks

Mirisage

Posts: 3,167

## Re: How to overwrite subsequent values in a longitudinal dataset?

Hi Mirisage,

Ksharp's code was based on these assumptions

1) The data came in not sorted.

2). You only have 3 accounts (TRANSIT + ACCTNUM combination), or if you know the total number of accounts ahead of time, the code can be tweaked accordingly.

3) No skipping. Meaning every accounts should have same amount of records.

Haikuo

Super User
Posts: 10,788

## Re: How to overwrite subsequent values in a longitudinal dataset?

As HaiKuo pointed out , as long as you make sure your data is under  these conditions .

And BTW :  Did you run my code after sorting the original data?

Super Contributor
Posts: 338

## Re: How to overwrite subsequent values in a longitudinal dataset?

Hi Haikuo,

Many thanks for these details.

Hi Ksharp,

Yes, I have run your code after sorting the original data.

Next I have run it without sorting, then it produced intended resutls, which are exactly same as Haikuo's resutls.

Thanks again to both of you.

Mirisage

🔒 This topic is solved and locked.