BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Mirisage
Obsidian | Level 7

Hi SAS Forum,

I have three accounts in this longitudinal dataset. This shows how accounts have aged over 4 months (note 0 is the first month).

Variables TRANSIT, ACCTNUM and age together form an unique record.

data have;

input TRANSIT ACCTNUM age bal stat $ 14 -20;

cards;

10 111 0 220 1-30

10 222 0 450 31-60

20 111 0 300 61+

10 111 1 100 CURRENT

10 222 1 5   61+

20 111 1 75  NPNA

10 111 2 100 CURRENT

10 222 2 300 CURRENT

20 111 2 100 DWO

10 111 3 110 1-30

10 222 3 320 61+

20 111 3 105 DWO

;

RUN;

In transit 10, acctnum 111 started its life in age 0 with “1-30” stat (stat means status).

This account then cycled into “CURRENT” stat when age = 1, and remained in the same stat when age=2 also but deteriorated into “1-30” stat when age=3.

Question;

I need to over write all the subsequent values in variable “stat” with “CURRENT”, after an account once hit “CURRENT”.

So, the final data set I need is this:

data want;

input TRANSIT ACCTNUM age bal stat $ 14 -20;

cards;

10 111 0 220 1-30

10 222 0 450 31-60

20 111 0 300 61+

10 111 1 100 CURRENT

10 222 1 5   61+

20 111 1 75  NPNA

10 111 2 100 CURRENT

10 222 2 300 CURRENT

20 111 2 100 DWO

10 111 3 110 CURRENT       /* in "have" dataset, this was 1-30”*/

10 222 3 320 CURRENT       /*in "have" dataset, this was 61+*/

20 111 3 105 DWO

;

RUN;

I HAVE tried THIS:

/*sorting*/

proc sort data=have;

  by transit acctnum age;

run;

/*Interlace stacking – Haikuo is acknowledged*/

data want;

   set have (in=up) have;

     by transit acctnum age;

     if first.acctnum then call missing(flag);

     if up and STAT='CURRENT' then flag+1;

     if not up and flag>=1 then output;

     drop flag;

run;

But I do not know how to tweak the above code to get the intended results (or create an alternative code).

Could anyone help me?

Thanks

Mirisage

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

The easiest way to do it is Hash() approach, no presort needed:

data have;

input TRANSIT ACCTNUM age bal stat $ 14 -21;

cards;

10 111 0 220 1-30

10 222 0 450 31-60

20 111 0 300 61+

10 111 1 100 CURRENT

10 222 1 5 61+

20 111 1 75 NPNA

10 111 2 100 CURRENT

10 222 2 300 CURRENT

20 111 2 100 DWO

10 111 3 110 1-30

10 222 3 320 61+

20 111 3 105 DWO

;

data want;

  if _n_=1 then do;

  if 0 then set have;

dcl hash h();

h.definekey('transit','acctnum');

h.definedata('stat');

h.definedone();

  end;

  set have;

  if stat='CURRENT' THEN RC=h.ref();

rc=h.find();

drop rc;

run;

Haikuo

View solution in original post

8 REPLIES 8
Haikuo
Onyx | Level 15

The easiest way to do it is Hash() approach, no presort needed:

data have;

input TRANSIT ACCTNUM age bal stat $ 14 -21;

cards;

10 111 0 220 1-30

10 222 0 450 31-60

20 111 0 300 61+

10 111 1 100 CURRENT

10 222 1 5 61+

20 111 1 75 NPNA

10 111 2 100 CURRENT

10 222 2 300 CURRENT

20 111 2 100 DWO

10 111 3 110 1-30

10 222 3 320 61+

20 111 3 105 DWO

;

data want;

  if _n_=1 then do;

  if 0 then set have;

dcl hash h();

h.definekey('transit','acctnum');

h.definedata('stat');

h.definedone();

  end;

  set have;

  if stat='CURRENT' THEN RC=h.ref();

rc=h.find();

drop rc;

run;

Haikuo

Mirisage
Obsidian | Level 7

Hi Haikuo,

Thank you very much for this code help which works well for my larger data set as well.

If you do not mind, pl let me to ask these 2 questions too but please consider them only if you have time.

1). I need to explain my co-worker how I have changed subsequent values into "CURRENT". Hash() approach is very much frighteneing me. Can we tweak your inteerlace stacking appraoch or any other merging, or proc sql method to generate the same resutls.

2). Although I said over writing the existing "stat" variable, I later found I still need the original "stat" variable. So, can we create a new variable, say "stat_new" instead of overwriting.

Thanks again for your help.

Mirisage


Haikuo
Onyx | Level 15

If not using Hash(), then I need to sort your data, and if you really want to maintain the original order, you just need to merge it back.

data have;

input TRANSIT ACCTNUM age bal stat $ 14 -21;

cards;

10 111 0 220 1-30

10 222 0 450 31-60

20 111 0 300 61+

10 111 1 100 CURRENT

10 222 1 5 61+

20 111 1 75 NPNA

10 111 2 100 CURRENT

10 222 2 300 CURRENT

20 111 2 100 DWO

10 111 3 110 1-30

10 222 3 320 61+

20 111 3 105 DWO

;

proc sort data=have out=have;

by TRANSIT ACCTNUM age;

run;

data want;

  set have;

   by TRANSIT ACCTNUM age;

   LENGTH _T $ 8;

   retain _t ;

   if first.acctnum then _t='';

   if stat='CURRENT' THEN _T=STAT;

   NEW_STAT=COALESCEC(_T,STAT);

   DROP _T;

   RUN;

Haikuo

Ksharp
Super User

OR using Stack Skill:

data have;
input TRANSIT ACCTNUM age bal stat : $20.;
cards;
10 111 0 220 1-30
10 222 0 450 31-60
20 111 0 300 61+
10 111 1 100 CURRENT
10 222 1 5 61+
20 111 1 75 NPNA
10 111 2 100 CURRENT
10 222 2 300 CURRENT
20 111 2 100 DWO
10 111 3 110 1-30
10 222 3 320 61+
20 111 3 105 DWO
;
run;
data want;
 set have;
 array x{0:2} $  _temporary_;
 stat=coalescec(x{mod(_n_,3)},stat);
 if stat='CURRENT' then x{mod(_n_,3)}=stat;
run;

Ksharp

Mirisage
Obsidian | Level 7

Hi Haikuo,

Thank you very much. Your code is great!

Hi Ksharp,

I applied your code. It runs smoothly but the results are not as intended. Can we tweak your code to generate the same resutls produced by Haikuo to satisfy our condition below.

"Once an account hits "CURRENT" under the "stat" variable, all the subsequent values for "stat" variable should be "CURRENT". And the "so created" new values for "stat" variable have to be included under a new variable called "New_stat".

This is the correct resutls from Haikuo

TRANSIT

ACCTNUM

age

bal

stat

NEW_STAT

10

111

0

220

1 - 30

1 - 30

10

111

1

100

CURRENT

CURRENT

10

111

2

100

CURRENT

CURRENT

10

111

3

110

1 - 30

CURRENT

10

222

0

450

31-60

31-60

10

222

1

5

61+

61+

10

222

2

300

CURRENT

CURRENT

10

222

3

320

61+

CURRENT

20

111

0

300

61+

61+

20

111

1

75

NPNA

NPNA

20

111

2

100

DWO

DWO

20

111

3

105

DWO

DWO

This is Ksharp's results 

TRANSIT

ACCTNUM

age

bal

stat

10

111

0

220

1 - 30

10

111

1

100

CURRENT

10

111

2

100

CURRENT

10

111

3

110

1 - 30

10

222

0

450

CURRENT

10

222

1

5

CURRENT

10

222

2

300

CURRENT

10

222

3

320

CURRENT

20

111

0

300

CURRENT

20

111

1

75

CURRENT

20

111

2

100

CURRENT

20

111

3

105

CURRENT

Thanks

Mirisage


Haikuo
Onyx | Level 15

Hi Mirisage,

Ksharp's code was based on these assumptions

1) The data came in not sorted.

2). You only have 3 accounts (TRANSIT + ACCTNUM combination), or if you know the total number of accounts ahead of time, the code can be tweaked accordingly.

3) No skipping. Meaning every accounts should have same amount of records.

 

Haikuo

Ksharp
Super User

As HaiKuo pointed out , as long as you make sure your data is under  these conditions .

And BTW :  Did you run my code after sorting the original data?

Mirisage
Obsidian | Level 7

Hi Haikuo,

Many thanks for these details.

Hi Ksharp,

Yes, I have run your code after sorting the original data.

Next I have run it without sorting, then it produced intended resutls, which are exactly same as Haikuo's resutls.

Thanks again to both of you.

Mirisage


hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1909 views
  • 3 likes
  • 3 in conversation