SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Filling up and down missing values with non-missing values

Accepted Solution Solved
Reply
Contributor
Posts: 26
Accepted Solution

Filling up and down missing values with non-missing values

Hello SAS community,

 

I am desperately trying to figure out how to replace missing values with non-missing values based on a given condition. To be more precise, we fill in a missing value if the time difference between its corresponding date and adjacent date is less than a month; and we leave it as missing if it's one month or over apart. For instance, the following dot should be replaced with 1 since the time difference is less than a month: 

 

09jan2016 1
15jan2016 .

On the other hand, the following dot should be left as missing as the two events happened over a span of more than one month:

 

 

 

21may2016 .
10jul2016 1

Now, here is a subset of the data set I'm dealing with in my workplace:

 

data my_data;
input ID Birthday date9. tpin;
label tpin="Temporary Personal Identification Number";
cards;
1 31mar1990 .
1 13apr1990 .
1 20apr1990 0
1 04jun1990 .
1 08jun1990 .
1 19jun1990 1
1 01jul1990 .
1 06aug1990 .
1 13aug1990 2
1 22feb1991 .
2 28feb1991 1
2 15apr1991 .
2 20may1991 0
2 25may1991 .
2 01jun1991 .
2 18jun1991 2
;
run;

If you've understood the rules to apply, the final table should look like this after an up and down algorithm, for each ID:

 

1 31mar1990 0
1 13apr1990 0
1 20apr1990 0
1 04jun1990 1
1 08jun1990 1
1 19jun1990 1
1 01jul1990 1
1 06aug1990 2
1 13aug1990 2
1 22feb1991 .
2 28feb1991 1
2 15apr1991 .
2 20may1991 0
2 25may1991 0
2 01jun1991 2
2 18jun1991 2

I've tried absolutely everything (merge, update, SQL etc.), unfortunately none of them yields the desired results. I'll be really grateful if anyone could help me out!

 

Thank you so much!


Accepted Solutions
Solution
‎08-30-2016 01:09 AM
Super User
Posts: 10,041

Re: Filling up and down missing values with non-missing values

Posted in reply to expertyejin


data my_data;
input ID Birthday date9. tpin;
format Birthday date9.;
label tpin="Temporary Personal Identification Number";
cards;
1 31mar1990 .
1 13apr1990 .
1 20apr1990 0
1 04jun1990 .
1 08jun1990 .
1 19jun1990 1
1 01jul1990 .
1 06aug1990 .
1 13aug1990 2
1 22feb1991 .
2 28feb1991 1
2 15apr1991 .
2 20may1991 0
2 25may1991 .
2 01jun1991 .
2 18jun1991 2
;
run;
proc sort data=my_data;by id Birthday;run;
data temp;
 set my_data;
 by id;
 dif_month=Birthday gt intnx('month',lag(Birthday),1,'s') ;
 if first.id then dif_month=.;
run;
data temp1;
 set temp;
 by id;
 if first.id or dif_month=1 then group+1;
 drop dif_month;
run;
data temp1;
 set temp1;
 by id group;
 retain temp;
 if first.group then call missing(temp);
 if not missing(tpin) then temp=tpin;
run;
proc sort data=temp1;by id descending Birthday;run;
data want;
 set temp1;
 by id group notsorted;
 retain want;
 if first.group then call missing(want);
 if not missing(temp) then want=temp;
 drop temp ;
run;
proc sort data=want;by id Birthday;run;



View solution in original post


All Replies
Super User
Posts: 11,343

Re: Filling up and down missing values with non-missing values

Posted in reply to expertyejin

You don't describe the rule for getting 31Mar1990 and 13Apr1990 to 0.

Also, are your dates actually SAS date values or character? From what you are displaying I would guess character.

When you have SAS date values there are lots of tools to calculate intervals and such.

Contributor
Posts: 26

Re: Filling up and down missing values with non-missing values

Dates are definitely in numeric (date9.) format. 

 

And as far as the rule is concerned, the tpin value for 31Mar1990 is assigned 0 because it is less than a month from 13Apr1990, which itself had been assigned a 0 because it is less than a month from 20Apr1990.

Solution
‎08-30-2016 01:09 AM
Super User
Posts: 10,041

Re: Filling up and down missing values with non-missing values

Posted in reply to expertyejin


data my_data;
input ID Birthday date9. tpin;
format Birthday date9.;
label tpin="Temporary Personal Identification Number";
cards;
1 31mar1990 .
1 13apr1990 .
1 20apr1990 0
1 04jun1990 .
1 08jun1990 .
1 19jun1990 1
1 01jul1990 .
1 06aug1990 .
1 13aug1990 2
1 22feb1991 .
2 28feb1991 1
2 15apr1991 .
2 20may1991 0
2 25may1991 .
2 01jun1991 .
2 18jun1991 2
;
run;
proc sort data=my_data;by id Birthday;run;
data temp;
 set my_data;
 by id;
 dif_month=Birthday gt intnx('month',lag(Birthday),1,'s') ;
 if first.id then dif_month=.;
run;
data temp1;
 set temp;
 by id;
 if first.id or dif_month=1 then group+1;
 drop dif_month;
run;
data temp1;
 set temp1;
 by id group;
 retain temp;
 if first.group then call missing(temp);
 if not missing(tpin) then temp=tpin;
run;
proc sort data=temp1;by id descending Birthday;run;
data want;
 set temp1;
 by id group notsorted;
 retain want;
 if first.group then call missing(want);
 if not missing(temp) then want=temp;
 drop temp ;
run;
proc sort data=want;by id Birthday;run;



Contributor
Posts: 26

Re: Filling up and down missing values with non-missing values

How amazing! I can't believe it works! Thank you so much @Ksharp !

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 414 views
  • 1 like
  • 3 in conversation