09-19-2014 02:27 PM
Can someone help me with data merge by date intervals?
I am looking to do this with the data step.
I am trying to merge Table1 with Table 2 to produce Final Table
I need the program codes (Table2) merged into location(Table1) within the effective dates at their location (Table1) So in the example I have customer id #1 at two different locations and I want to merge the program (Table 2) as they are effective at their respective location. Can someone help me? Thank you.
09-19-2014 04:58 PM
Two questions come to mind:
(1) Are START and END already stored as numerics on SAS's date scale?
(2) How would you handle a case of overlaps? For example, what if the first observation in Table 2 had start = 1/10/2013 and end = 7/5/2013? Would you want two locations?
The programming might be short, but the rules are important here.
09-19-2014 05:04 PM
1) I am able to convert the start and end dates to SAS dates if that will make the programming easier.
2) Yes- I would want the two different locations should the dates from Table 1 overlap Table 2
09-19-2014 05:15 PM
OK. I would imagine one location would be based on the start date, and the other on the end date. If a group spans 3 locations, that would not be captured here. (It could, but would add complexity.)
I would convert Table 1 to a format. Each range in the format would define ID + Start date through ID + End date. For example:
set table1 (rename=(start=start_dt end=end_dt location=label));
retain fmtname '$locate';
if end_dt=. then end_dt=9999999;
start = id || ' ' || put(start_dt, 7.);
end = id || ' ' || put(end_dt, 7.);
proc format cntlin=create_format;
Then use the format in a DATA step:
location1 = put(id || ' ' || put(start, 7.), $locate.);
location 2 = put(id || ' ' || put(end, 7.), $locate.);
The program is short, but not necessarily easy ... see if it makes sense to you. Some of the renaming is necessary because PROC FORMAT expects to use START, END, and LABEL as reserved variable names.
09-20-2014 08:12 AM
If I understand what you mean correctly.
data Table1 ; input id start : mmddyy10. end : mmddyy10. location $; format start end mmddyy10.; cards; 1 1/10/2013 6/30/2013 ap 1 7/1/2013 . ap2 ; run; data Table2 ; input id start : mmddyy10. end : mmddyy10. program $; format start end mmddyy10.; cards; 1 1/10/2013 2/5/2013 cop1 1 7/5/2013 10/31/2013 cop2 1 7/5/2013 . cop3 ; run; proc stdize data=table1 out=t1 missing=999999 reponly; var end; run; proc stdize data=table2 out=t2 missing=999999 reponly; var end; run; proc sql; create table want as select a.id,a.start,case when a.end=999999 then . else a.end end as end format=mmddyy10.,a.location ,b.program from t1 as a left join t2 as b on a.id=b.id and a.start le b.end and b.start le a.end; quit;
09-22-2014 12:34 PM
I was wondering if this can be accomplished using the Data Step and using a merge like below- where I can control the counts of the various merges?
DATA YFYS YFNS NFYS;
MERGE table1(in=a) table2_date1(in=b); by id ;
IF A AND B THEN OUTPUT YFYS;
IF A=1 AND B=0 THEN OUTPUT YFNS;
IF A=0 AND B=1 THEN OUTPUT NFYS; RUN;
09-23-2014 07:49 AM
No. I don't think so. Since you need to match an obs of Table1 to all of obs of Table2 , which means you need a Cartesian Product , this can't be done by Merge statement. However, You can use data step's Cartesian Product to get it like : ( not as efficient as SQL's)
do i=1 to nobs;
set Table2 nobs=nobs point=i ;