Help using Base SAS procedures

How to overwrite subsequent values in a longitudinal dataset?

Accepted Solution Solved
Reply
Super Contributor
Posts: 338
Accepted Solution

How to overwrite subsequent values in a longitudinal dataset?

Hi SAS Forum,

I have three accounts in this longitudinal dataset. This shows how accounts have aged over 4 months (note 0 is the first month).

Variables TRANSIT, ACCTNUM and age together form an unique record.

data have;

input TRANSIT ACCTNUM age bal stat $ 14 -20;

cards;

10 111 0 220 1-30

10 222 0 450 31-60

20 111 0 300 61+

10 111 1 100 CURRENT

10 222 1 5   61+

20 111 1 75  NPNA

10 111 2 100 CURRENT

10 222 2 300 CURRENT

20 111 2 100 DWO

10 111 3 110 1-30

10 222 3 320 61+

20 111 3 105 DWO

;

RUN;

In transit 10, acctnum 111 started its life in age 0 with “1-30” stat (stat means status).

This account then cycled into “CURRENT” stat when age = 1, and remained in the same stat when age=2 also but deteriorated into “1-30” stat when age=3.

Question;

I need to over write all the subsequent values in variable “stat” with “CURRENT”, after an account once hit “CURRENT”.

So, the final data set I need is this:

data want;

input TRANSIT ACCTNUM age bal stat $ 14 -20;

cards;

10 111 0 220 1-30

10 222 0 450 31-60

20 111 0 300 61+

10 111 1 100 CURRENT

10 222 1 5   61+

20 111 1 75  NPNA

10 111 2 100 CURRENT

10 222 2 300 CURRENT

20 111 2 100 DWO

10 111 3 110 CURRENT       /* in "have" dataset, this was 1-30”*/

10 222 3 320 CURRENT       /*in "have" dataset, this was 61+*/

20 111 3 105 DWO

;

RUN;

I HAVE tried THIS:

/*sorting*/

proc sort data=have;

  by transit acctnum age;

run;

/*Interlace stacking – Haikuo is acknowledged*/

data want;

   set have (in=up) have;

     by transit acctnum age;

     if first.acctnum then call missing(flag);

     if up and STAT='CURRENT' then flag+1;

     if not up and flag>=1 then output;

     drop flag;

run;

But I do not know how to tweak the above code to get the intended results (or create an alternative code).

Could anyone help me?

Thanks

Mirisage


Accepted Solutions
Solution
‎05-02-2013 01:06 PM
Respected Advisor
Posts: 3,156

Re: How to overwrite subsequent values in a longitudinal dataset?

The easiest way to do it is Hash() approach, no presort needed:

data have;

input TRANSIT ACCTNUM age bal stat $ 14 -21;

cards;

10 111 0 220 1-30

10 222 0 450 31-60

20 111 0 300 61+

10 111 1 100 CURRENT

10 222 1 5 61+

20 111 1 75 NPNA

10 111 2 100 CURRENT

10 222 2 300 CURRENT

20 111 2 100 DWO

10 111 3 110 1-30

10 222 3 320 61+

20 111 3 105 DWO

;

data want;

  if _n_=1 then do;

  if 0 then set have;

dcl hash h();

h.definekey('transit','acctnum');

h.definedata('stat');

h.definedone();

  end;

  set have;

  if stat='CURRENT' THEN RC=h.ref();

rc=h.find();

drop rc;

run;

Haikuo

View solution in original post


All Replies
Solution
‎05-02-2013 01:06 PM
Respected Advisor
Posts: 3,156

Re: How to overwrite subsequent values in a longitudinal dataset?

The easiest way to do it is Hash() approach, no presort needed:

data have;

input TRANSIT ACCTNUM age bal stat $ 14 -21;

cards;

10 111 0 220 1-30

10 222 0 450 31-60

20 111 0 300 61+

10 111 1 100 CURRENT

10 222 1 5 61+

20 111 1 75 NPNA

10 111 2 100 CURRENT

10 222 2 300 CURRENT

20 111 2 100 DWO

10 111 3 110 1-30

10 222 3 320 61+

20 111 3 105 DWO

;

data want;

  if _n_=1 then do;

  if 0 then set have;

dcl hash h();

h.definekey('transit','acctnum');

h.definedata('stat');

h.definedone();

  end;

  set have;

  if stat='CURRENT' THEN RC=h.ref();

rc=h.find();

drop rc;

run;

Haikuo

Super Contributor
Posts: 338

Re: How to overwrite subsequent values in a longitudinal dataset?

Hi Haikuo,

Thank you very much for this code help which works well for my larger data set as well.

If you do not mind, pl let me to ask these 2 questions too but please consider them only if you have time.

1). I need to explain my co-worker how I have changed subsequent values into "CURRENT". Hash() approach is very much frighteneing me. Can we tweak your inteerlace stacking appraoch or any other merging, or proc sql method to generate the same resutls.

2). Although I said over writing the existing "stat" variable, I later found I still need the original "stat" variable. So, can we create a new variable, say "stat_new" instead of overwriting.

Thanks again for your help.

Mirisage


Respected Advisor
Posts: 3,156

Re: How to overwrite subsequent values in a longitudinal dataset?

If not using Hash(), then I need to sort your data, and if you really want to maintain the original order, you just need to merge it back.

data have;

input TRANSIT ACCTNUM age bal stat $ 14 -21;

cards;

10 111 0 220 1-30

10 222 0 450 31-60

20 111 0 300 61+

10 111 1 100 CURRENT

10 222 1 5 61+

20 111 1 75 NPNA

10 111 2 100 CURRENT

10 222 2 300 CURRENT

20 111 2 100 DWO

10 111 3 110 1-30

10 222 3 320 61+

20 111 3 105 DWO

;

proc sort data=have out=have;

by TRANSIT ACCTNUM age;

run;

data want;

  set have;

   by TRANSIT ACCTNUM age;

   LENGTH _T $ 8;

   retain _t ;

   if first.acctnum then _t='';

   if stat='CURRENT' THEN _T=STAT;

   NEW_STAT=COALESCEC(_T,STAT);

   DROP _T;

   RUN;

Haikuo

Super User
Posts: 10,028

Re: How to overwrite subsequent values in a longitudinal dataset?

OR using Stack Skill:

data have;
input TRANSIT ACCTNUM age bal stat : $20.;
cards;
10 111 0 220 1-30
10 222 0 450 31-60
20 111 0 300 61+
10 111 1 100 CURRENT
10 222 1 5 61+
20 111 1 75 NPNA
10 111 2 100 CURRENT
10 222 2 300 CURRENT
20 111 2 100 DWO
10 111 3 110 1-30
10 222 3 320 61+
20 111 3 105 DWO
;
run;
data want;
 set have;
 array x{0:2} $  _temporary_;
 stat=coalescec(x{mod(_n_,3)},stat);
 if stat='CURRENT' then x{mod(_n_,3)}=stat;
run;

Ksharp

Super Contributor
Posts: 338

Re: How to overwrite subsequent values in a longitudinal dataset?

Hi Haikuo,

Thank you very much. Your code is great!

Hi Ksharp,

I applied your code. It runs smoothly but the results are not as intended. Can we tweak your code to generate the same resutls produced by Haikuo to satisfy our condition below.

"Once an account hits "CURRENT" under the "stat" variable, all the subsequent values for "stat" variable should be "CURRENT". And the "so created" new values for "stat" variable have to be included under a new variable called "New_stat".

This is the correct resutls from Haikuo

TRANSIT

ACCTNUM

age

bal

stat

NEW_STAT

10

111

0

220

1 - 30

1 - 30

10

111

1

100

CURRENT

CURRENT

10

111

2

100

CURRENT

CURRENT

10

111

3

110

1 - 30

CURRENT

10

222

0

450

31-60

31-60

10

222

1

5

61+

61+

10

222

2

300

CURRENT

CURRENT

10

222

3

320

61+

CURRENT

20

111

0

300

61+

61+

20

111

1

75

NPNA

NPNA

20

111

2

100

DWO

DWO

20

111

3

105

DWO

DWO

This is Ksharp's results 

TRANSIT

ACCTNUM

age

bal

stat

10

111

0

220

1 - 30

10

111

1

100

CURRENT

10

111

2

100

CURRENT

10

111

3

110

1 - 30

10

222

0

450

CURRENT

10

222

1

5

CURRENT

10

222

2

300

CURRENT

10

222

3

320

CURRENT

20

111

0

300

CURRENT

20

111

1

75

CURRENT

20

111

2

100

CURRENT

20

111

3

105

CURRENT

Thanks

Mirisage


Respected Advisor
Posts: 3,156

Re: How to overwrite subsequent values in a longitudinal dataset?

Hi Mirisage,

Ksharp's code was based on these assumptions

1) The data came in not sorted.

2). You only have 3 accounts (TRANSIT + ACCTNUM combination), or if you know the total number of accounts ahead of time, the code can be tweaked accordingly.

3) No skipping. Meaning every accounts should have same amount of records.

 

Haikuo

Super User
Posts: 10,028

Re: How to overwrite subsequent values in a longitudinal dataset?

As HaiKuo pointed out , as long as you make sure your data is under  these conditions .

And BTW :  Did you run my code after sorting the original data?

Super Contributor
Posts: 338

Re: How to overwrite subsequent values in a longitudinal dataset?

Hi Haikuo,

Many thanks for these details.

Hi Ksharp,

Yes, I have run your code after sorting the original data.

Next I have run it without sorting, then it produced intended resutls, which are exactly same as Haikuo's resutls.

Thanks again to both of you.

Mirisage


🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 217 views
  • 3 likes
  • 3 in conversation