BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
expertyejin
Obsidian | Level 7

Hello SAS community,

 

I am desperately trying to figure out how to replace missing values with non-missing values based on a given condition. To be more precise, we fill in a missing value if the time difference between its corresponding date and adjacent date is less than a month; and we leave it as missing if it's one month or over apart. For instance, the following dot should be replaced with 1 since the time difference is less than a month: 

 

09jan2016 1
15jan2016 .

On the other hand, the following dot should be left as missing as the two events happened over a span of more than one month:

 

 

 

21may2016 .
10jul2016 1

Now, here is a subset of the data set I'm dealing with in my workplace:

 

data my_data;
input ID Birthday date9. tpin;
label tpin="Temporary Personal Identification Number";
cards;
1 31mar1990 .
1 13apr1990 .
1 20apr1990 0
1 04jun1990 .
1 08jun1990 .
1 19jun1990 1
1 01jul1990 .
1 06aug1990 .
1 13aug1990 2
1 22feb1991 .
2 28feb1991 1
2 15apr1991 .
2 20may1991 0
2 25may1991 .
2 01jun1991 .
2 18jun1991 2
;
run;

If you've understood the rules to apply, the final table should look like this after an up and down algorithm, for each ID:

 

1 31mar1990 0
1 13apr1990 0
1 20apr1990 0
1 04jun1990 1
1 08jun1990 1
1 19jun1990 1
1 01jul1990 1
1 06aug1990 2
1 13aug1990 2
1 22feb1991 .
2 28feb1991 1
2 15apr1991 .
2 20may1991 0
2 25may1991 0
2 01jun1991 2
2 18jun1991 2

I've tried absolutely everything (merge, update, SQL etc.), unfortunately none of them yields the desired results. I'll be really grateful if anyone could help me out!

 

Thank you so much!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User


data my_data;
input ID Birthday date9. tpin;
format Birthday date9.;
label tpin="Temporary Personal Identification Number";
cards;
1 31mar1990 .
1 13apr1990 .
1 20apr1990 0
1 04jun1990 .
1 08jun1990 .
1 19jun1990 1
1 01jul1990 .
1 06aug1990 .
1 13aug1990 2
1 22feb1991 .
2 28feb1991 1
2 15apr1991 .
2 20may1991 0
2 25may1991 .
2 01jun1991 .
2 18jun1991 2
;
run;
proc sort data=my_data;by id Birthday;run;
data temp;
 set my_data;
 by id;
 dif_month=Birthday gt intnx('month',lag(Birthday),1,'s') ;
 if first.id then dif_month=.;
run;
data temp1;
 set temp;
 by id;
 if first.id or dif_month=1 then group+1;
 drop dif_month;
run;
data temp1;
 set temp1;
 by id group;
 retain temp;
 if first.group then call missing(temp);
 if not missing(tpin) then temp=tpin;
run;
proc sort data=temp1;by id descending Birthday;run;
data want;
 set temp1;
 by id group notsorted;
 retain want;
 if first.group then call missing(want);
 if not missing(temp) then want=temp;
 drop temp ;
run;
proc sort data=want;by id Birthday;run;



View solution in original post

4 REPLIES 4
ballardw
Super User

You don't describe the rule for getting 31Mar1990 and 13Apr1990 to 0.

Also, are your dates actually SAS date values or character? From what you are displaying I would guess character.

When you have SAS date values there are lots of tools to calculate intervals and such.

expertyejin
Obsidian | Level 7

Dates are definitely in numeric (date9.) format. 

 

And as far as the rule is concerned, the tpin value for 31Mar1990 is assigned 0 because it is less than a month from 13Apr1990, which itself had been assigned a 0 because it is less than a month from 20Apr1990.

Ksharp
Super User


data my_data;
input ID Birthday date9. tpin;
format Birthday date9.;
label tpin="Temporary Personal Identification Number";
cards;
1 31mar1990 .
1 13apr1990 .
1 20apr1990 0
1 04jun1990 .
1 08jun1990 .
1 19jun1990 1
1 01jul1990 .
1 06aug1990 .
1 13aug1990 2
1 22feb1991 .
2 28feb1991 1
2 15apr1991 .
2 20may1991 0
2 25may1991 .
2 01jun1991 .
2 18jun1991 2
;
run;
proc sort data=my_data;by id Birthday;run;
data temp;
 set my_data;
 by id;
 dif_month=Birthday gt intnx('month',lag(Birthday),1,'s') ;
 if first.id then dif_month=.;
run;
data temp1;
 set temp;
 by id;
 if first.id or dif_month=1 then group+1;
 drop dif_month;
run;
data temp1;
 set temp1;
 by id group;
 retain temp;
 if first.group then call missing(temp);
 if not missing(tpin) then temp=tpin;
run;
proc sort data=temp1;by id descending Birthday;run;
data want;
 set temp1;
 by id group notsorted;
 retain want;
 if first.group then call missing(want);
 if not missing(temp) then want=temp;
 drop temp ;
run;
proc sort data=want;by id Birthday;run;



expertyejin
Obsidian | Level 7

How amazing! I can't believe it works! Thank you so much @Ksharp !

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 1200 views
  • 1 like
  • 3 in conversation