BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
expertyejin
Obsidian | Level 7

Hello SAS community,

 

I am desperately trying to figure out how to replace missing values with non-missing values based on a given condition. To be more precise, we fill in a missing value if the time difference between its corresponding date and adjacent date is less than a month; and we leave it as missing if it's one month or over apart. For instance, the following dot should be replaced with 1 since the time difference is less than a month: 

 

09jan2016 1
15jan2016 .

On the other hand, the following dot should be left as missing as the two events happened over a span of more than one month:

 

 

 

21may2016 .
10jul2016 1

Now, here is a subset of the data set I'm dealing with in my workplace:

 

data my_data;
input ID Birthday date9. tpin;
label tpin="Temporary Personal Identification Number";
cards;
1 31mar1990 .
1 13apr1990 .
1 20apr1990 0
1 04jun1990 .
1 08jun1990 .
1 19jun1990 1
1 01jul1990 .
1 06aug1990 .
1 13aug1990 2
1 22feb1991 .
2 28feb1991 1
2 15apr1991 .
2 20may1991 0
2 25may1991 .
2 01jun1991 .
2 18jun1991 2
;
run;

If you've understood the rules to apply, the final table should look like this after an up and down algorithm, for each ID:

 

1 31mar1990 0
1 13apr1990 0
1 20apr1990 0
1 04jun1990 1
1 08jun1990 1
1 19jun1990 1
1 01jul1990 1
1 06aug1990 2
1 13aug1990 2
1 22feb1991 .
2 28feb1991 1
2 15apr1991 .
2 20may1991 0
2 25may1991 0
2 01jun1991 2
2 18jun1991 2

I've tried absolutely everything (merge, update, SQL etc.), unfortunately none of them yields the desired results. I'll be really grateful if anyone could help me out!

 

Thank you so much!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User


data my_data;
input ID Birthday date9. tpin;
format Birthday date9.;
label tpin="Temporary Personal Identification Number";
cards;
1 31mar1990 .
1 13apr1990 .
1 20apr1990 0
1 04jun1990 .
1 08jun1990 .
1 19jun1990 1
1 01jul1990 .
1 06aug1990 .
1 13aug1990 2
1 22feb1991 .
2 28feb1991 1
2 15apr1991 .
2 20may1991 0
2 25may1991 .
2 01jun1991 .
2 18jun1991 2
;
run;
proc sort data=my_data;by id Birthday;run;
data temp;
 set my_data;
 by id;
 dif_month=Birthday gt intnx('month',lag(Birthday),1,'s') ;
 if first.id then dif_month=.;
run;
data temp1;
 set temp;
 by id;
 if first.id or dif_month=1 then group+1;
 drop dif_month;
run;
data temp1;
 set temp1;
 by id group;
 retain temp;
 if first.group then call missing(temp);
 if not missing(tpin) then temp=tpin;
run;
proc sort data=temp1;by id descending Birthday;run;
data want;
 set temp1;
 by id group notsorted;
 retain want;
 if first.group then call missing(want);
 if not missing(temp) then want=temp;
 drop temp ;
run;
proc sort data=want;by id Birthday;run;



View solution in original post

4 REPLIES 4
ballardw
Super User

You don't describe the rule for getting 31Mar1990 and 13Apr1990 to 0.

Also, are your dates actually SAS date values or character? From what you are displaying I would guess character.

When you have SAS date values there are lots of tools to calculate intervals and such.

expertyejin
Obsidian | Level 7

Dates are definitely in numeric (date9.) format. 

 

And as far as the rule is concerned, the tpin value for 31Mar1990 is assigned 0 because it is less than a month from 13Apr1990, which itself had been assigned a 0 because it is less than a month from 20Apr1990.

Ksharp
Super User


data my_data;
input ID Birthday date9. tpin;
format Birthday date9.;
label tpin="Temporary Personal Identification Number";
cards;
1 31mar1990 .
1 13apr1990 .
1 20apr1990 0
1 04jun1990 .
1 08jun1990 .
1 19jun1990 1
1 01jul1990 .
1 06aug1990 .
1 13aug1990 2
1 22feb1991 .
2 28feb1991 1
2 15apr1991 .
2 20may1991 0
2 25may1991 .
2 01jun1991 .
2 18jun1991 2
;
run;
proc sort data=my_data;by id Birthday;run;
data temp;
 set my_data;
 by id;
 dif_month=Birthday gt intnx('month',lag(Birthday),1,'s') ;
 if first.id then dif_month=.;
run;
data temp1;
 set temp;
 by id;
 if first.id or dif_month=1 then group+1;
 drop dif_month;
run;
data temp1;
 set temp1;
 by id group;
 retain temp;
 if first.group then call missing(temp);
 if not missing(tpin) then temp=tpin;
run;
proc sort data=temp1;by id descending Birthday;run;
data want;
 set temp1;
 by id group notsorted;
 retain want;
 if first.group then call missing(want);
 if not missing(temp) then want=temp;
 drop temp ;
run;
proc sort data=want;by id Birthday;run;



expertyejin
Obsidian | Level 7

How amazing! I can't believe it works! Thank you so much @Ksharp !

sas-innovate-white.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 1744 views
  • 1 like
  • 3 in conversation