For data sorted by client_id/segm_date, the technique I would use is to look ahead to the next segm_date to see whether there are holes to fill with missing values. The benefit of this approach is that you can first output the current non-missing values, and then set them to missing before filling the holes. Something like this (using the sample data provided by @ballardw):
data example;
input SEGM_DATE datetime18. CLIENT_ID :$9. AVG_WEIGHT;
format SEGM_DATE datetime20.;
datalines;
01MAR2012:00:00:00 1-ET-1500 8
01APR2012:00:00:00 1-ET-1500 10
01JUN2012:00:00:00 1-ET-1500 13
01JAN2012:00:00:00 2-ET-1500 12
01JUN2012:00:00:00 2-ET-1500 18
run;
data want (drop=nxt_:);
  /* SET + BY are here only to create first.client_id and last.client_id */
  set example (keep=client_id);
  by client_id;
  /* Pair each obs with the SEGM_DATE of the obs that follows it */
  merge example
        example (firstobs=2 keep=segm_date rename=(segm_date=nxt_date));
  **** Other Needed Code Here ****;
  output; /* Write out the non-missing values */
  /* If there are upcoming holes, fill them with missing values. */
  if last.client_id=0 and intck('dtmonth',segm_date,nxt_date)>1 then do; /* If holes follow ... */
    call missing(avg_weight); /* Set the appropriate list of variables to missing */
    do segm_date=intnx('dtmonth',segm_date,1,'same') by 0 while (segm_date<nxt_date);
      output;
      segm_date=intnx('dtmonth',segm_date,1,'same');
    end;
  end;
run;
The SET and BY statement combination is there only to generate the first.client_id and last.client_id automatic variables. The MERGE statement pairs the current obs with the next obs (reading only the SEGM_DATE variable from the latter, renamed to NXT_DATE). Comparing SEGM_DATE to NXT_DATE provides a way to detect upcoming holes, and the last.client_id=0 test keeps the fill from running across the boundary between one client and the next.
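To see what the look-ahead merge produces on its own, a small sketch like the following may help. It assumes the EXAMPLE data set created above; the data set name LOOKAHEAD and the variable GAP_MONTHS are names I made up for illustration and are not part of the program above.

```sas
/* Sketch only: show each obs side by side with the next obs's SEGM_DATE. */
/* LOOKAHEAD and GAP_MONTHS are hypothetical names used for illustration. */
data lookahead;
  merge example
        example (firstobs=2 keep=segm_date rename=(segm_date=nxt_date));
  format nxt_date datetime20.;
  /* Number of month boundaries between the current and the next date */
  gap_months = intck('dtmonth', segm_date, nxt_date);
run;

proc print data=lookahead;
run;
```

With the sample data, GAP_MONTHS should come out as 1, 2, -5, 5 and missing (NXT_DATE is missing on the final obs, since there is no next obs to read). The -5 comes from pairing the last obs of 1-ET-1500 with the first obs of 2-ET-1500, which is the case the last.client_id=0 test is meant to exclude. Values greater than 1 mark the places where holes need to be filled.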