BookmarkSubscribeRSS Feed
ash3
Calcite | Level 5
I have 3 columns i.e id, date and status and Status has 3 categories (ND, prob and Default). I wanted to make desired_col mentioned on below data sample and this column will make based on below conditions.
When default occurs it will check below 6 rows and if it finds ND then it will count how many rows in between default and ND. In below example the first default occurs in feb01 and now it will check below 6 rows and it finds after 4 rows the status becomes ND then in desired columns it gives 4 and so on. Could someone please help this query in SAS.

Id date status desired_col
A jan01 ND .
A feb01 Default 4
A mar01 Prob. 3
A apr01 prob. 2
A may01 prob. 1
A jun01 ND. .
A jul01 ND. .
A aug01 ND. .
A sep01 Default. 5
A oct01 Prob. 4
A nov01 prob. 3
A dec01 prob. 2
A jan02 prob. 1
A jan02 ND. .

3 REPLIES 3
mkeintz
PROC Star

The essential part of this task is looking forward to later observations, but checking only those later observations with the same id as the current obs.  The MERGE statement (with the FIRSTOBS data set name parameter) is well-suited.  Remember to rename variables for all those future obs.  Finaly, the ARRAY statement provides a mechanism for finding the distance to the future "ND." value.

 

I have read in your desired_col varible, and you can match it with the new DISTANCE variable:

data have;
  input Id $1. date :$5. status :$7.  desired_col;
datalines;
A jan01 ND .
A feb01 Default 4
A mar01 Prob. 3
A apr01 prob. 2
A may01 prob. 1
A jun01 ND. .
A jul01 ND. .
A aug01 ND. .
A sep01 Default. 5
A oct01 Prob. 4
A nov01 prob. 3
A dec01 prob. 2
A jan02 prob. 1
A jan02 ND. .
run;

data want (drop=_:  i);
  set have (keep=id);   /* Erroneously dropped */
  by id;                /* Erroneously dropped */
  merge have
        have (firstobs=2 keep=id status rename=(id=_nxt_id1 status=_nxt_status1))
        have (firstobs=3 keep=id status rename=(id=_nxt_id2 status=_nxt_status2))
        have (firstobs=4 keep=id status rename=(id=_nxt_id3 status=_nxt_status3))
        have (firstobs=5 keep=id status rename=(id=_nxt_id4 status=_nxt_status4))
        have (firstobs=6 keep=id status rename=(id=_nxt_id5 status=_nxt_status5))
        have (firstobs=7 keep=id status rename=(id=_nxt_id6 status=_nxt_status6)) ;

  array _nxt_id {6} _nxt_id1-_nxt_id6;
  array _nxt_st {6} _nxt_status1-_nxt_status6;

  retain distance .;
  if first.id=1 then distance=.;  /* Initialize at the start of each ID */

  if status='Default' then do i=1 to 6 while (_nxt_id{i}=id);  /*"Default"? Look forward */
    if _nxt_st{i}='ND.' then distance=i;
    if distance^=. then leave;
  end;
  else if distance>1 then distance=distance-1;
  else distance=.;
run;

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Tom
Super User Tom
Super User

A straight forward method would be to use a second SET statement using the POINT= option to look forward in the file.  You will want to make sure you don't try to read past the end of the file or try to use data from the next ID.

 

First let's convert your listing into actual data.  And remove the extra periods (hopefully those are not in the actual data).

data have;
  input Id $ date :monyy. status $ desired_col ;
  format date monyy7.;
cards;
A jan01 ND .
A feb01 Default 4
A mar01 Prob 3
A apr01 prob 2
A may01 prob 1
A jun01 ND .
A jul01 ND .
A aug01 ND .
A sep01 Default 5
A oct01 Prob 4
A nov01 prob 3
A dec01 prob 2
A jan02 prob 1
A jan02 ND .
;

Now we should be able to replicated your desired results by just looking for the next ND status in the following 6 observations.

data want;
  set have nobs=nobs;
  if status ne 'ND' then do p=_n_+1 to min(_n_+6,nobs) while(want=.);
    set have(keep=id status rename=(id=id2 status=status2)) point=p;
    if id2 ne id then p=p+6;
    else if status2='ND' then want=p-_n_;
  end;
  drop id2 status2;
run;

Results

                                   desired_
Obs    Id       date    status        col      want

  1    A     JAN2001    ND             .         .
  2    A     FEB2001    Default        4         4
  3    A     MAR2001    Prob           3         3
  4    A     APR2001    prob           2         2
  5    A     MAY2001    prob           1         1
  6    A     JUN2001    ND             .         .
  7    A     JUL2001    ND             .         .
  8    A     AUG2001    ND             .         .
  9    A     SEP2001    Default        5         5
 10    A     OCT2001    Prob           4         4
 11    A     NOV2001    prob           3         3
 12    A     DEC2001    prob           2         2
 13    A     JAN2002    prob           1         1
 14    A     JAN2002    ND             .         .

Ksharp
Super User
data have;
  input Id $1. date :monyy5. status $  ;
  format date monyy5.;
datalines;
A jan01 ND.  .
A feb01 Default 4
A mar01 Prob. 3
A apr01 prob. 2
A may01 prob. 1
A jun01 ND. .
A jul01 ND. .
A aug01 ND. .
A sep01 Default. 5
A oct01 Prob. 4
A nov01 prob. 3
A dec01 prob. 2
A jan02 prob. 1
A jan02 ND. .
;
data temp;
 set have;
 flag=ifn( status='ND.',0,1);
 n+1;
run;
data temp;
 set temp;
 by id flag notsorted;
 f+first.flag;
run;
proc sort data=temp;
by id descending n;
run;
data temp2;
 set temp;
 by id;
 if first.id or status='ND.' then group=0;
  else group+1;
run;
proc sql nowarn;
create table want as
select id,date,status,case when group=0 or count(*)>6 then . else group end as want
 from temp2
  group by id,f
   order by id,n;
quit;

Ksharp_0-1708586143422.png

 

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 972 views
  • 0 likes
  • 4 in conversation