- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi. I'm creating 2 datasets named HAVE1 and HAVE2. Each have the same variables except for the last unique column in each (QCODE and MTYPE).
I sort each by BENE and then use the Merge statement to join the datasets by BENE and see both dataset's last columns in the output and all the BENE values as desired (see below).
However, when trying to replicated this using Proc Sql in my final coding step I don't see the data from both datasets (also see below). Is there a way to replicate the SAS Merge using Proc SQL? Any help would be appreciated. Thanks.
Code:
DATA HAVE1;
INPUT BENE $ CLM_LINE_FROM_DT : DATE9. HCPCS_CD $ CLM_BLG_PRVDR_NPI_NUM $10. PRVDR_ENRLMT_TIN_NUM $ QCODE $;
FORMAT CLM_LINE_FROM_DT DATE9. CLM_BLG_PRVDR_NPI_NUM $10.;
DATALINES;
92345 11NOV2015 C3456 1234567890 3234567 FGH
12345 23OCT2010 A1234 1234567890 1234567 FGH
92345 02NOV2015 B2345 1234567890 2234567 FGH
92345 11NOV2015 B2345 9234567890 2234567 FGH
12345 11NOV2015 B2345 1234567890 2234567 FGH
12345 11NOV2015 C3456 1234567890 3234567 FGH
12345 15DEC2011 C3456 1234567890 4234567 FGH
12345 11NOV2015 E5678 1234567891 9234567 FGH
12345 11NOV2015 F5678 1234567891 5234567 FGH
92345 23OCT2010 A1234 1234567890 1234567 FGH
;
RUN;
DATA HAVE2;
INPUT BENE $ CLM_LINE_FROM_DT : DATE9. HCPCS_CD $ CLM_BLG_PRVDR_NPI_NUM $10. PRVDR_ENRLMT_TIN_NUM $ MTYPE $;
FORMAT CLM_LINE_FROM_DT DATE9. CLM_BLG_PRVDR_NPI_NUM $10.;
DATALINES;
27653 11NOV2015 C3456 7765467890 3234567 ABC
77654 23OCT2010 A1234 7765467890 7765467 ABC
27653 02NOV2015 B2345 7765467890 2234567 ABC
27653 11NOV2015 B2345 2765367890 2234567 ABC
77654 11NOV2015 B2345 7765467890 2234567 ABC
77654 11NOV2015 C3456 7765467890 3234567 ABC
77654 15DEC2011 C3456 7765467890 4234567 ABC
77654 11NOV2015 E5678 7765467891 2765367 ABC
77654 11NOV2015 F5678 7765467891 5234567 ABC
27653 23OCT2010 A1234 7765467890 7765467 ABC
;
RUN;
PROC SORT DATA=HAVE1; BY BENE; RUN;
PROC SORT DATA=HAVE2; BY BENE; RUN;
DATA WANT1;
MERGE HAVE1 HAVE2;
BY BENE;
RUN;
PROC SQL;
CREATE TABLE WANT2 AS
SELECT T1.*
,T2.*
FROM HAVE1 AS T1
LEFT JOIN HAVE2 AS T2
ON T1.BENE = T2.BENE
;QUIT;
Desired Output Using Merge Statement:
BENE CLM_LINE_FROM_DT HCPCS_CD CLM_BLG_PRVDR_NPI_NUM PRVDR_ENRLMT_TIN_NUM QCODE MTYPE
12345 23OCT2010 A1234 1234567890 1234567 FGH
12345 11NOV2015 B2345 1234567890 2234567 FGH
12345 11NOV2015 C3456 1234567890 3234567 FGH
12345 15DEC2011 C3456 1234567890 4234567 FGH
12345 11NOV2015 E5678 1234567891 9234567 FGH
12345 11NOV2015 F5678 1234567891 5234567 FGH
27653 11NOV2015 C3456 7765467890 3234567 ABC
27653 02NOV2015 B2345 7765467890 2234567 ABC
27653 11NOV2015 B2345 2765367890 2234567 ABC
27653 23OCT2010 A1234 7765467890 7765467 ABC
77654 23OCT2010 A1234 7765467890 7765467 ABC
77654 11NOV2015 B2345 7765467890 2234567 ABC
77654 11NOV2015 C3456 7765467890 3234567 ABC
77654 15DEC2011 C3456 7765467890 4234567 ABC
77654 11NOV2015 E5678 7765467891 2765367 ABC
77654 11NOV2015 F5678 7765467891 5234567 ABC
92345 11NOV2015 C3456 1234567890 3234567 FGH
92345 02NOV2015 B2345 1234567890 2234567 FGH
92345 11NOV2015 B2345 9234567890 2234567 FGH
92345 23OCT2010 A1234 1234567890 1234567 FGH
Undesired Results Using Proc Sql
BENE CLM_LINE_FROM_DT HCPCS_CD CLM_BLG_PRVDR_NPI_NUM PRVDR_ENRLMT_TIN_NUM QCODE MTYPE
12345 23OCT2010 A1234 1234567890 1234567 FGH
12345 11NOV2015 B2345 1234567890 2234567 FGH
12345 11NOV2015 C3456 1234567890 3234567 FGH
12345 15DEC2011 C3456 1234567890 4234567 FGH
12345 11NOV2015 E5678 1234567891 9234567 FGH
12345 11NOV2015 F5678 1234567891 5234567 FGH
92345 11NOV2015 C3456 1234567890 3234567 FGH
92345 02NOV2015 B2345 1234567890 2234567 FGH
92345 11NOV2015 B2345 9234567890 2234567 FGH
92345 23OCT2010 A1234 1234567890 1234567 FGH
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Try OUTER UNION CORR
proc sql;
create table want as
select * from have1
outer union corr select * from have2;
quit;
or just outer union
proc sql;
create table want as
select * from have1
outer union select * from have2;
quit;
One of the above should work I think
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Use Full join instead of Left join.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for the suggestion. I tried this but the results are still incomplete as I'm missing some BENE and their corresponding values that do appear when using SAS Merge:
PROC SQL;CREATE TABLE WANT2 ASSELECT T1.*,T2.*FROM HAVE1 AS T1FULL JOIN HAVE2 AS T2ON T1.BENE = T2.BENE;QUIT;BENE CLM_LINE_FROM_DT HCPCS_CD CLM_BLG_PRVDR_NPI_NUM PRVDR_ENRLMT_TIN_NUM QCODE MTYPE
12345 23OCT2010 A1234 1234567890 1234567 FGH
12345 11NOV2015 B2345 1234567890 2234567 FGH
12345 11NOV2015 C3456 1234567890 3234567 FGH
12345 15DEC2011 C3456 1234567890 4234567 FGH
12345 11NOV2015 E5678 1234567891 9234567 FGH
12345 11NOV2015 F5678 1234567891 5234567 FGH
. ABC
. ABC
. ABC
. ABC
. ABC
. ABC
. ABC
. ABC
. ABC
. ABC
92345 11NOV2015 C3456 1234567890 3234567 FGH
92345 02NOV2015 B2345 1234567890 2234567 FGH
92345 11NOV2015 B2345 9234567890 2234567 FGH
92345 23OCT2010 A1234 1234567890 1234567 FGH
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
proc sql;
create table want as
select * from have1
union select * from have2;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hmm...close, but this merges the two unique columns (QCODE and MTYPE) into one column (QCODE) instead of keeping each column separated as QCODE and MTYPE (like in the SAS Merge 'desired' output example)
BENE CLM_LINE_FROM_DT HCPCS_CD CLM_BLG_PRVDR_NPI_NUM PRVDR_ENRLMT_TIN_NUM QCODE
12345 23OCT2010 A1234 1234567890 1234567 FGH
12345 15DEC2011 C3456 1234567890 4234567 FGH
12345 11NOV2015 B2345 1234567890 2234567 FGH
12345 11NOV2015 C3456 1234567890 3234567 FGH
12345 11NOV2015 E5678 1234567891 9234567 FGH
12345 11NOV2015 F5678 1234567891 5234567 FGH
27653 23OCT2010 A1234 7765467890 7765467 ABC
27653 02NOV2015 B2345 7765467890 2234567 ABC
27653 11NOV2015 B2345 2765367890 2234567 ABC
27653 11NOV2015 C3456 7765467890 3234567 ABC
77654 23OCT2010 A1234 7765467890 7765467 ABC
77654 15DEC2011 C3456 7765467890 4234567 ABC
77654 11NOV2015 B2345 7765467890 2234567 ABC
77654 11NOV2015 C3456 7765467890 3234567 ABC
77654 11NOV2015 E5678 7765467891 2765367 ABC
77654 11NOV2015 F5678 7765467891 5234567 ABC
92345 23OCT2010 A1234 1234567890 1234567 FGH
92345 02NOV2015 B2345 1234567890 2234567 FGH
92345 11NOV2015 B2345 9234567890 2234567 FGH
92345 11NOV2015 C3456 1234567890 3234567 FGH
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Try OUTER UNION CORR
proc sql;
create table want as
select * from have1
outer union corr select * from have2;
quit;
or just outer union
proc sql;
create table want as
select * from have1
outer union select * from have2;
quit;
One of the above should work I think
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Perfect! Thank you so much. I appreciate your suggestions and time.
proc sql;
create table want as
select * from have1
outer union corr select * from have2;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You're welcome. It's good to recall my college days when we used to play with SET operators and Venn diagrams. I wish I could reverse the clock again while my hair is going grey. Have fun!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Can you clarify if you are trying to do a one to one merge (join) ? Or is it one to many, and if so which dataset should have unique values of BY variable and which can have duplicates?
If you are doing a many to many merge with SAS data step then your results are probably gibberish.