BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
buddha_d
Pyrite | Level 9

Fellow SAS Programmers,

             I am working on real time data and I came across an issue that I don't know how to code. I am presenting my sample data below. The real world data is more complex with many more fields. 

I have 2 tables with have1 and have2. Have1 table has both primary and secondary customers information. Whereas, Have2 table has Account number and Customers names.  I have some records that have SSN. I made some dummy SSNs. 

 

proc sql;
create table WORK.have1( bufsize=65536 )
  (
   primary_customer_first_name char(50) format=$50. informat=$50. label='primary_customer_first_name',
   primary_customer_last_name char(50) format=$50. informat=$50. label='primary_customer_last_name',
   primary_customer_SSN char(50) format=$50. informat=$50. label='primary_customer_SSN',
   secondary_customer_SSN char(50) format=$50. informat=$50. label='secondary_customer_SSN',
   SECONDARY_CUSTOMER_FIRST_NAME char(50) format=$50. informat=$50. label='SECONDARY_CUSTOMER_FIRST_NAME',
   SECONDARY_CUSTOMER_LAST_NAME char(50) format=$50. informat=$50. label='SECONDARY_CUSTOMER_LAST_NAME',
   SECONDARY_CUSTOMER_MID_NAME char(50) format=$50. informat=$50. label='SECONDARY_CUSTOMER_MID_NAME'
  );

insert into WORK.have1
values('CHARLES','SMITH','869ABC547','851ABC145','ALFRED','FLOYD','NELSON')
values('**bleep**','JOHNSON','865ABC997','189DEF589','JULIA','ROBERTS','ROSE')
values('PAUL','BAKER','567ABC598','963DEF546','WILLIAM','NELSON','' )
values('MICHAEL','HOLLMAN','867ABC596','523DEF951','JOHN','PARKER','LEE' )
values('MARK','CAMPBELL','475ABC862','321DEF698','THOMAS','CRUZ','JOSEPH' )   
values('BEN','COLLINS JR','785ABC456','865DEF864','KATHERINE','JONES','ZETA' )
values('CHRISTOPHER','EVANS','','587DEF521','JACKIE','PHILLIPS','')
;
quit;

proc sql;
create table WORK.have2( bufsize=65536 )
  (
   acct_num char(50) format=$50. informat=$50. label='acct_num',
   CUST_NAME char(50) format=$50. informat=$50. label='CUST_NAME',
   CUST_FIRST_NAME char(50) format=$50. informat=$50. label='CUST_FIRST_NAME',
   CUST_MIDDLE_NAME char(50) format=$50. informat=$50. label='CUST_MIDDLE_NAME',
   CUST_LAST_NAME char(50) format=$50. informat=$50. label='CUST_LAST_NAME',
   CUST_TAX_ID char(50) format=$50. informat=$50. label='CUST_TAX_ID'
  );
insert into WORK.have2
values('123456','JACKIE ANN PHILLIPS','JACKIE','ANN','PHILLIPS','587DEF521' )
values('123456','CHRISTOPHER ALAN EVANS','CHRISTOPHER','ALAN','EVANS','487ABC784' )
values('123456','TONY KAY WONG','TONY','KAY','WONG','444DEF555' )
values('654321','CHARLES DELYNN SMITH','CHARLES','DELYNN','SMITH','869ABC547' )
values('654321','ALLAN CHRISTOPHER ELLIOTT','ALLAN','CHRISTOPHER','ELLIOTT','0' )
values('654321','ALFRED NELSON FLOYD','ALFRED','NELSON','FLOYD','851ABC145' )
values('654321','JARED WATSON','JARED','','WATSON','0' )
values('654321','JOHNNY CASH','JOHNNY','','CASH','999DEF888' )
values('456789','CHRISTINE MAE FOX','CHRISTINE','MAE','FOX','111111111' )
values('456789','JAY SITA PATEL','JAY','SITA','PATEL','222222222' )
values('456789','DAVID EDWARD JENKINS','DAVID','EDWARD','JENKINS','444444444' )
values('456789','GEORGE ROBERT BELL','GEORGE','ROBERT','BELL','666666666' )
values('987654','BRIAN WEST','BRIAN','','WEST','123456789' )
values('987654','ANTHONY JAMES FORD','ANTHONY','JAMES','FORD','456124786' )
values('987654','JEFF PRITCHARD','JEFF','','PRITCHARD','458754688' )
values('987654','ERIC ALAN JOHNSON','ERIC','ALAN','JOHNSON','121212121' )
values('987654','PETER JAY PARKER','PETER','JAY','PARKER','999999999' )
values('987654','HAROLD EDWARD WEBB','HAROLD','EDWARD','WEBB','888888888' )
;
quit;

I need to match first and last names of have1 to first and last names of have2 table (as have2 table contains all the customers/account holders linked to an account number). If primary customer first & last names  and secondary customer first & last names ( from have1) match with cust_first_name & cust_last_name then I want to output those results. The challenge is we have to compare rows to columns and I can't think of a way to do this. One table sometimes have middle name and other times it doesn't have middle name, but first and last names are always populated. 

             The output result should have CHARLES SMITH & CHRISTOPHER EVANS records as they have matched the criteria mentioned above.  The SAS OUTPUT (want1 or want2) should look like this below. 

 

create table WORK.WANT1( bufsize=65536 )
  (
   primary_customer_first_name char(50) format=$50. informat=$50. label='primary_customer_first_name',
   primary_customer_last_name char(50) format=$50. informat=$50. label='primary_customer_last_name',
   primary_customer_SSN char(50) format=$50. informat=$50. label='primary_customer_SSN',
   secondary_customer_SSN char(50) format=$50. informat=$50. label='secondary_customer_SSN',
   SECONDARY_CUSTOMER_FIRST_NAME char(50) format=$50. informat=$50. label='SECONDARY_CUSTOMER_FIRST_NAME',
   SECONDARY_CUSTOMER_LAST_NAME char(50) format=$50. informat=$50. label='SECONDARY_CUSTOMER_LAST_NAME',
   SECONDARY_CUSTOMER_MID_NAME char(50) format=$50. informat=$50. label='SECONDARY_CUSTOMER_MID_NAME'
  );

insert into WORK.WANT1
values('CHARLES','SMITH','869ABC547','851ABC145','ALFRED','FLOYD','NELSON')
values('CHRISTOPHER','EVANS','','587DEF521','JACKIE','PHILLIPS','')
;
quit;

proc sql;
create table WORK.WANT2( bufsize=65536 )
  (
   acct_num char(50) format=$50. informat=$50. label='acct_num',
   CUST_NAME char(50) format=$50. informat=$50. label='CUST_NAME',
   CUST_FIRST_NAME char(50) format=$50. informat=$50. label='CUST_FIRST_NAME',
   CUST_MIDDLE_NAME char(50) format=$50. informat=$50. label='CUST_MIDDLE_NAME',
   CUST_LAST_NAME char(50) format=$50. informat=$50. label='CUST_LAST_NAME',
   CUST_TAX_ID char(50) format=$50. informat=$50. label='CUST_TAX_ID'
  );
insert into WORK.WANT2
values('123456','JACKIE ANN PHILLIPS','JACKIE','ANN','PHILLIPS','587DEF521' )
values('123456','CHRISTOPHER ALAN EVANS','CHRISTOPHER','ALAN','EVANS','487ABC784' )
values('123456','TONY KAY WONG','TONY','KAY','WONG','444DEF555' )
values('654321','CHARLES DELYNN SMITH','CHARLES','DELYNN','SMITH','869ABC547' )
values('654321','ALLAN CHRISTOPHER ELLIOTT','ALLAN','CHRISTOPHER','ELLIOTT','0' )
values('654321','ALFRED NELSON FLOYD','ALFRED','NELSON','FLOYD','851ABC145' )
values('654321','JARED WATSON','JARED','','WATSON','0' )
values('654321','JOHNNY CASH','JOHNNY','','CASH','999DEF888' )
;
QUIT;

Please help me how to get the desired output. Thanks in advance. 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @buddha_d,

 

I'd suggest a PROC SQL step using the INTERSECT set operator:

proc sql;
create table want as
select a.*, acct_num from have1 a, have2 where primary_customer_first_name = cust_first_name
                                             & primary_customer_last_name  = cust_last_name
intersect
select a.*, acct_num from have1 a, have2 where secondary_customer_first_name = cust_first_name 
                                             & secondary_customer_last_name  = cust_last_name
;
quit;

The result should be your WANT1 dataset plus variable ACCT_NUM. By including ACCT_NUM in the two SELECT statements it is ensured that the matching "CUST_..." names belong to the same account (assuming that matches in different accounts would not be desired). Moreover, you can use ACCT_NUM as a key variable in a join of datasets HAVE2 and WANT to obtain your WANT2 dataset (up to sort order differences).

View solution in original post

2 REPLIES 2
FreelanceReinh
Jade | Level 19

Hello @buddha_d,

 

I'd suggest a PROC SQL step using the INTERSECT set operator:

proc sql;
create table want as
select a.*, acct_num from have1 a, have2 where primary_customer_first_name = cust_first_name
                                             & primary_customer_last_name  = cust_last_name
intersect
select a.*, acct_num from have1 a, have2 where secondary_customer_first_name = cust_first_name 
                                             & secondary_customer_last_name  = cust_last_name
;
quit;

The result should be your WANT1 dataset plus variable ACCT_NUM. By including ACCT_NUM in the two SELECT statements it is ensured that the matching "CUST_..." names belong to the same account (assuming that matches in different accounts would not be desired). Moreover, you can use ACCT_NUM as a key variable in a join of datasets HAVE2 and WANT to obtain your WANT2 dataset (up to sort order differences).

buddha_d
Pyrite | Level 9

Thank you FreelanceReinh.

 

Thanks again. 

SAS INNOVATE 2024

Innovate_SAS_Blue.png

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Get the $99 certification deal.jpg

 

 

Back in the Classroom!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 220 views
  • 1 like
  • 2 in conversation