Merging two files with specific looping requierd

Reply
Contributor lei
Contributor
Posts: 37

Merging two files with specific looping requierd

I am not really sure where or how to start this merge.

I have two data tables and I need to merge these two based on dates and subject numbers.

Table 1 consists of the variables:

Subject_Num i.e. 123456

Date_Start i.e Jan1, 2013

Date_End i.e. Jan 31, 2013

Table 2 consists of the variables:

Subject_Num i.e. 123456

Date  i.e. November 2, 2012

Insurance_Status 1 i.e. Medicaid

Insurance_Status 2 i.e. Medicare

In Table 1, the data can look like this:

Subject_Num, Date_Start, Date_End

123456, 11/01/2011, 11/03/2011

123456, 12/04/2012, 01/04/2013

123456, 01/05/2013, 02/04/3013

In Table 2, each Subject_Num can have multiple roles of insurance status. For example a table for subject 123456, I can have

Subject_num, Date, Insurance_Status1, Insuance_Status2

123456, 11/02/2012, Medicaid, Medicare

123456, 10/05/2011, Medicaid, Private

I want the merged file to look like this:

Subject_Num, Date_Start, Date_End, Insurance_Status1, Insurance_Status2

123456, 11/01/2011, 11/03/2011, Medicaid, Medicare

123456, 12/04/2012, 01/04/2013, Medicaid, Private

123456, 01/05/2013, 02/04/3013, Medicaid, Private

I want the merge to search through Table 2, look for the same Subject_Num as Table 1, and then find the row of data that contains the largest date where Table2.Date>=Table1.Date_start.

I hope this make sense. Any help would be great. I think there might be looping involved, but I have no clue how to write this.

Trusted Advisor
Posts: 1,128

Re: Merging two files with specific looping requierd

Hi, Please try the below code

data table1;

  infile cards dlm=',';

  input Subject_Num Date_Start :ddmmyy10. Date_End :ddmmyy10.;

  format Date_Start Date_End date9.;

cards;

123456, 11/01/2011, 11/03/2011

123456, 12/04/2012, 01/04/2013

123456, 01/05/2013, 02/04/3013

;

data table2;

  infile cards dlm=',';

  input Subject_num Date :ddmmyy10. Insurance_Status1$ Insuance_Status2$;

  format date date9.;

  cards;

123456, 11/02/2012, Medicaid, Medicare

123456, 10/05/2011, Medicaid, Private

;

PROC SQL;

   CREATE TABLE QUERY_FOR_TABLE2 AS

   SELECT t1.Subject_num,

          t1.Date,

          t1.Insurance_Status1,

          t1.Insuance_Status2,

          t2.Subject_Num AS Subject_Num1,

          t2.Date_Start,

          t2.Date_End,

      FROM WORK.TABLE2 t1

           left JOIN WORK.TABLE1 t2 ON (t1.Subject_num = t2.Subject_Num)

    having Date >= t2.Date_Start;

QUIT;

i got the output something like below, if you want to get the rows with Table2.Date>=Table1.Date_start

Please let me know if this what you were expecting.

Thanks,

Jag

Thanks,
Jag
Occasional Contributor
Posts: 18

Re: Merging two files with specific looping requierd

Hi,

First off, I am not sure or rather confused whether your the dates you are working on are ddmmyy or mmddyy. However, i have assumed it is mmddyy. I don't think a loop is required, please correct my understanding if the below doesn;t work. So see if this is what you want-

data table1;

input subject_num $ date_start date_end ;

informat date_start mmddyy10.;

informat date_end mmddyy10.;

format date_start mmddyy10. date_end mmddyy10.;

datalines;

123456 11/01/2011 11/03/2011

123456 12/04/2012 01/04/2013

123456 01/05/2013 02/04/2013

;

data table2;

input Subject_num $ Date Insurance_Status1 $ Insuance_Status2 $;

informat date mmddyy10.;

format date mmddyy10.;

datalines;

123456 11/02/2012 Medicaid Medicare

123456 10/05/2011 Medicaid Private

;

proc sort date= table1;

by subject_num;

run;

proc sort data=table2;

by subject_num;

run;

data merged_file;

merge table1 table2;

by subject_num; /* if you run here, you get your merged file output you mentioned*/

if date>date_start;/* here is your condition*/

run;

Trusted Advisor
Posts: 1,128

Re: Merging two files with specific looping requierd

By using the merge statement in the current situation will generate the merge note in the log. In order to avoid this note it is good to use the proc sql;

if i consider the dates in mmddyy format, then i also get only one record.

Thanks,

Jag

Thanks,
Jag
Ask a Question
Discussion stats
  • 3 replies
  • 183 views
  • 0 likes
  • 3 in conversation