BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
buddha_d
Pyrite | Level 9

Fellow SAS Programmers,

             I am working on real time data and I came across an issue that I don't know how to code. I am presenting my sample data below. The real world data is more complex with many more fields. 

I have 2 tables with have1 and have2. Have1 table has both primary and secondary customers information. Whereas, Have2 table has Account number and Customers names.  I have some records that have SSN. I made some dummy SSNs. 

 

proc sql;
create table WORK.have1( bufsize=65536 )
  (
   primary_customer_first_name char(50) format=$50. informat=$50. label='primary_customer_first_name',
   primary_customer_last_name char(50) format=$50. informat=$50. label='primary_customer_last_name',
   primary_customer_SSN char(50) format=$50. informat=$50. label='primary_customer_SSN',
   secondary_customer_SSN char(50) format=$50. informat=$50. label='secondary_customer_SSN',
   SECONDARY_CUSTOMER_FIRST_NAME char(50) format=$50. informat=$50. label='SECONDARY_CUSTOMER_FIRST_NAME',
   SECONDARY_CUSTOMER_LAST_NAME char(50) format=$50. informat=$50. label='SECONDARY_CUSTOMER_LAST_NAME',
   SECONDARY_CUSTOMER_MID_NAME char(50) format=$50. informat=$50. label='SECONDARY_CUSTOMER_MID_NAME'
  );

insert into WORK.have1
values('CHARLES','SMITH','869ABC547','851ABC145','ALFRED','FLOYD','NELSON')
values('**bleep**','JOHNSON','865ABC997','189DEF589','JULIA','ROBERTS','ROSE')
values('PAUL','BAKER','567ABC598','963DEF546','WILLIAM','NELSON','' )
values('MICHAEL','HOLLMAN','867ABC596','523DEF951','JOHN','PARKER','LEE' )
values('MARK','CAMPBELL','475ABC862','321DEF698','THOMAS','CRUZ','JOSEPH' )   
values('BEN','COLLINS JR','785ABC456','865DEF864','KATHERINE','JONES','ZETA' )
values('CHRISTOPHER','EVANS','','587DEF521','JACKIE','PHILLIPS','')
;
quit;

proc sql;
create table WORK.have2( bufsize=65536 )
  (
   acct_num char(50) format=$50. informat=$50. label='acct_num',
   CUST_NAME char(50) format=$50. informat=$50. label='CUST_NAME',
   CUST_FIRST_NAME char(50) format=$50. informat=$50. label='CUST_FIRST_NAME',
   CUST_MIDDLE_NAME char(50) format=$50. informat=$50. label='CUST_MIDDLE_NAME',
   CUST_LAST_NAME char(50) format=$50. informat=$50. label='CUST_LAST_NAME',
   CUST_TAX_ID char(50) format=$50. informat=$50. label='CUST_TAX_ID'
  );
insert into WORK.have2
values('123456','JACKIE ANN PHILLIPS','JACKIE','ANN','PHILLIPS','587DEF521' )
values('123456','CHRISTOPHER ALAN EVANS','CHRISTOPHER','ALAN','EVANS','487ABC784' )
values('123456','TONY KAY WONG','TONY','KAY','WONG','444DEF555' )
values('654321','CHARLES DELYNN SMITH','CHARLES','DELYNN','SMITH','869ABC547' )
values('654321','ALLAN CHRISTOPHER ELLIOTT','ALLAN','CHRISTOPHER','ELLIOTT','0' )
values('654321','ALFRED NELSON FLOYD','ALFRED','NELSON','FLOYD','851ABC145' )
values('654321','JARED WATSON','JARED','','WATSON','0' )
values('654321','JOHNNY CASH','JOHNNY','','CASH','999DEF888' )
values('456789','CHRISTINE MAE FOX','CHRISTINE','MAE','FOX','111111111' )
values('456789','JAY SITA PATEL','JAY','SITA','PATEL','222222222' )
values('456789','DAVID EDWARD JENKINS','DAVID','EDWARD','JENKINS','444444444' )
values('456789','GEORGE ROBERT BELL','GEORGE','ROBERT','BELL','666666666' )
values('987654','BRIAN WEST','BRIAN','','WEST','123456789' )
values('987654','ANTHONY JAMES FORD','ANTHONY','JAMES','FORD','456124786' )
values('987654','JEFF PRITCHARD','JEFF','','PRITCHARD','458754688' )
values('987654','ERIC ALAN JOHNSON','ERIC','ALAN','JOHNSON','121212121' )
values('987654','PETER JAY PARKER','PETER','JAY','PARKER','999999999' )
values('987654','HAROLD EDWARD WEBB','HAROLD','EDWARD','WEBB','888888888' )
;
quit;

I need to match first and last names of have1 to first and last names of have2 table (as have2 table contains all the customers/account holders linked to an account number). If primary customer first & last names  and secondary customer first & last names ( from have1) match with cust_first_name & cust_last_name then I want to output those results. The challenge is we have to compare rows to columns and I can't think of a way to do this. One table sometimes have middle name and other times it doesn't have middle name, but first and last names are always populated. 

             The output result should have CHARLES SMITH & CHRISTOPHER EVANS records as they have matched the criteria mentioned above.  The SAS OUTPUT (want1 or want2) should look like this below. 

 

create table WORK.WANT1( bufsize=65536 )
  (
   primary_customer_first_name char(50) format=$50. informat=$50. label='primary_customer_first_name',
   primary_customer_last_name char(50) format=$50. informat=$50. label='primary_customer_last_name',
   primary_customer_SSN char(50) format=$50. informat=$50. label='primary_customer_SSN',
   secondary_customer_SSN char(50) format=$50. informat=$50. label='secondary_customer_SSN',
   SECONDARY_CUSTOMER_FIRST_NAME char(50) format=$50. informat=$50. label='SECONDARY_CUSTOMER_FIRST_NAME',
   SECONDARY_CUSTOMER_LAST_NAME char(50) format=$50. informat=$50. label='SECONDARY_CUSTOMER_LAST_NAME',
   SECONDARY_CUSTOMER_MID_NAME char(50) format=$50. informat=$50. label='SECONDARY_CUSTOMER_MID_NAME'
  );

insert into WORK.WANT1
values('CHARLES','SMITH','869ABC547','851ABC145','ALFRED','FLOYD','NELSON')
values('CHRISTOPHER','EVANS','','587DEF521','JACKIE','PHILLIPS','')
;
quit;

proc sql;
create table WORK.WANT2( bufsize=65536 )
  (
   acct_num char(50) format=$50. informat=$50. label='acct_num',
   CUST_NAME char(50) format=$50. informat=$50. label='CUST_NAME',
   CUST_FIRST_NAME char(50) format=$50. informat=$50. label='CUST_FIRST_NAME',
   CUST_MIDDLE_NAME char(50) format=$50. informat=$50. label='CUST_MIDDLE_NAME',
   CUST_LAST_NAME char(50) format=$50. informat=$50. label='CUST_LAST_NAME',
   CUST_TAX_ID char(50) format=$50. informat=$50. label='CUST_TAX_ID'
  );
insert into WORK.WANT2
values('123456','JACKIE ANN PHILLIPS','JACKIE','ANN','PHILLIPS','587DEF521' )
values('123456','CHRISTOPHER ALAN EVANS','CHRISTOPHER','ALAN','EVANS','487ABC784' )
values('123456','TONY KAY WONG','TONY','KAY','WONG','444DEF555' )
values('654321','CHARLES DELYNN SMITH','CHARLES','DELYNN','SMITH','869ABC547' )
values('654321','ALLAN CHRISTOPHER ELLIOTT','ALLAN','CHRISTOPHER','ELLIOTT','0' )
values('654321','ALFRED NELSON FLOYD','ALFRED','NELSON','FLOYD','851ABC145' )
values('654321','JARED WATSON','JARED','','WATSON','0' )
values('654321','JOHNNY CASH','JOHNNY','','CASH','999DEF888' )
;
QUIT;

Please help me how to get the desired output. Thanks in advance. 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @buddha_d,

 

I'd suggest a PROC SQL step using the INTERSECT set operator:

proc sql;
create table want as
select a.*, acct_num from have1 a, have2 where primary_customer_first_name = cust_first_name
                                             & primary_customer_last_name  = cust_last_name
intersect
select a.*, acct_num from have1 a, have2 where secondary_customer_first_name = cust_first_name 
                                             & secondary_customer_last_name  = cust_last_name
;
quit;

The result should be your WANT1 dataset plus variable ACCT_NUM. By including ACCT_NUM in the two SELECT statements it is ensured that the matching "CUST_..." names belong to the same account (assuming that matches in different accounts would not be desired). Moreover, you can use ACCT_NUM as a key variable in a join of datasets HAVE2 and WANT to obtain your WANT2 dataset (up to sort order differences).

View solution in original post

2 REPLIES 2
FreelanceReinh
Jade | Level 19

Hello @buddha_d,

 

I'd suggest a PROC SQL step using the INTERSECT set operator:

proc sql;
create table want as
select a.*, acct_num from have1 a, have2 where primary_customer_first_name = cust_first_name
                                             & primary_customer_last_name  = cust_last_name
intersect
select a.*, acct_num from have1 a, have2 where secondary_customer_first_name = cust_first_name 
                                             & secondary_customer_last_name  = cust_last_name
;
quit;

The result should be your WANT1 dataset plus variable ACCT_NUM. By including ACCT_NUM in the two SELECT statements it is ensured that the matching "CUST_..." names belong to the same account (assuming that matches in different accounts would not be desired). Moreover, you can use ACCT_NUM as a key variable in a join of datasets HAVE2 and WANT to obtain your WANT2 dataset (up to sort order differences).

buddha_d
Pyrite | Level 9

Thank you FreelanceReinh.

 

Thanks again. 

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 730 views
  • 1 like
  • 2 in conversation