Hello SAS community,
I am desperately trying to figure out how to replace missing values with non-missing values based on a given condition. To be more precise, we fill in a missing value if the time difference between its corresponding date and adjacent date is less than a month; and we leave it as missing if it's one month or over apart. For instance, the following dot should be replaced with 1 since the time difference is less than a month:
09jan2016 1
15jan2016 .
On the other hand, the following dot should be left as missing as the two events happened over a span of more than one month:
21may2016 .
10jul2016 1
Now, here is a subset of the data set I'm dealing with in my workplace:
data my_data;
input ID Birthday date9. tpin;
label tpin="Temporary Personal Identification Number";
cards;
1 31mar1990 .
1 13apr1990 .
1 20apr1990 0
1 04jun1990 .
1 08jun1990 .
1 19jun1990 1
1 01jul1990 .
1 06aug1990 .
1 13aug1990 2
1 22feb1991 .
2 28feb1991 1
2 15apr1991 .
2 20may1991 0
2 25may1991 .
2 01jun1991 .
2 18jun1991 2
;
run;
If you've understood the rules to apply, the final table should look like this after an up and down algorithm, for each ID:
1 31mar1990 0
1 13apr1990 0
1 20apr1990 0
1 04jun1990 1
1 08jun1990 1
1 19jun1990 1
1 01jul1990 1
1 06aug1990 2
1 13aug1990 2
1 22feb1991 .
2 28feb1991 1
2 15apr1991 .
2 20may1991 0
2 25may1991 0
2 01jun1991 2
2 18jun1991 2
I've tried absolutely everything (merge, update, SQL etc.), unfortunately none of them yields the desired results. I'll be really grateful if anyone could help me out!
Thank you so much!
data my_data; input ID Birthday date9. tpin; format Birthday date9.; label tpin="Temporary Personal Identification Number"; cards; 1 31mar1990 . 1 13apr1990 . 1 20apr1990 0 1 04jun1990 . 1 08jun1990 . 1 19jun1990 1 1 01jul1990 . 1 06aug1990 . 1 13aug1990 2 1 22feb1991 . 2 28feb1991 1 2 15apr1991 . 2 20may1991 0 2 25may1991 . 2 01jun1991 . 2 18jun1991 2 ; run; proc sort data=my_data;by id Birthday;run; data temp; set my_data; by id; dif_month=Birthday gt intnx('month',lag(Birthday),1,'s') ; if first.id then dif_month=.; run; data temp1; set temp; by id; if first.id or dif_month=1 then group+1; drop dif_month; run; data temp1; set temp1; by id group; retain temp; if first.group then call missing(temp); if not missing(tpin) then temp=tpin; run; proc sort data=temp1;by id descending Birthday;run; data want; set temp1; by id group notsorted; retain want; if first.group then call missing(want); if not missing(temp) then want=temp; drop temp ; run; proc sort data=want;by id Birthday;run;
You don't describe the rule for getting 31Mar1990 and 13Apr1990 to 0.
Also, are your dates actually SAS date values or character? From what you are displaying I would guess character.
When you have SAS date values there are lots of tools to calculate intervals and such.
Dates are definitely in numeric (date9.) format.
And as far as the rule is concerned, the tpin value for 31Mar1990 is assigned 0 because it is less than a month from 13Apr1990, which itself had been assigned a 0 because it is less than a month from 20Apr1990.
data my_data; input ID Birthday date9. tpin; format Birthday date9.; label tpin="Temporary Personal Identification Number"; cards; 1 31mar1990 . 1 13apr1990 . 1 20apr1990 0 1 04jun1990 . 1 08jun1990 . 1 19jun1990 1 1 01jul1990 . 1 06aug1990 . 1 13aug1990 2 1 22feb1991 . 2 28feb1991 1 2 15apr1991 . 2 20may1991 0 2 25may1991 . 2 01jun1991 . 2 18jun1991 2 ; run; proc sort data=my_data;by id Birthday;run; data temp; set my_data; by id; dif_month=Birthday gt intnx('month',lag(Birthday),1,'s') ; if first.id then dif_month=.; run; data temp1; set temp; by id; if first.id or dif_month=1 then group+1; drop dif_month; run; data temp1; set temp1; by id group; retain temp; if first.group then call missing(temp); if not missing(tpin) then temp=tpin; run; proc sort data=temp1;by id descending Birthday;run; data want; set temp1; by id group notsorted; retain want; if first.group then call missing(want); if not missing(temp) then want=temp; drop temp ; run; proc sort data=want;by id Birthday;run;
How amazing! I can't believe it works! Thank you so much @Ksharp !
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.