New SAS User

dapenDaniel · Posted 01-28-2019 11:50 AM

Hello.

I have two data files. File A contains some company names, and File B is the full list of company names. The variables are as follows.

File A: ID, Name

File B: Name1, Name2, Name3 (the three names refer to the same company written in different ways; for example, Name1 = IBM and Name2 = International Business Machines)

What I want is to search for the names from File A in File B. If a name in File A matches any of the names in File B, a new variable called NewName is created in File A, and NewName equals Name1 from File B.

Can anyone tell me what code I need to use? Thanks.

For example:

File A

ID    Name
01    IBM
02    Apple

File B

Name1                              Name2         Name3
International Business Machines    IBM           IBM Corp.
Apple                              Apple Inc.    AAPL
Google LLC                         GOOGL         Google

Expected File

ID    Name     NewName (equal to Name1)
01    IBM      International Business Machines
02    Apple    Apple

novinosrin · Posted 01-28-2019 11:51 AM

Could you please post a sample of both of your files? That will help lazy people like me try it too.

dapenDaniel · Posted 01-28-2019 12:08 PM

I have added an example. Thanks.

novinosrin · Posted 01-28-2019 12:14 PM


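/* File A: the names to look up */
data a;
input id $ name $;
cards;
01        IBM
02        Apple
;

/* File B: the & modifier allows single blanks inside a value; */
/* values are separated by two or more blanks */
data b;
input (name1-name3) ( &:$50.);
cards;
International Business Machines                            IBM                                          IBM Corp.
Apple                                                       Apple Inc.                                     AAPL
Google LLC                                               GOOGL                                      Google
;

/* Keep every row of A and attach name1 from each B row */
/* where any of the three names matches */
proc sql;
create table want as
select a.*,b.name1 as new_name
from a a left join b b
on a.name=b.name1 or a.name=b.name2 or a.name=b.name3;
quit;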

dapenDaniel · Posted 01-28-2019 02:00 PM

There are some duplicates in File B.

How can I remove these duplicates when I match these two files?

Thanks.

novinosrin · Posted 01-28-2019 02:02 PM

Do you mean like this?



data a;
input id $ name $;
cards;
01        IBM
02        Apple
;

data b;
input (name1-name3) ( &:$50.);
cards;
International Business Machines                            IBM                                          IBM Corp.
International Business Machines                            IBM                                          IBM Corp.
International Business Machines                            IBM                                          IBM Corp.
Apple                                                       Apple Inc.                                     AAPL
Apple                                                       Apple Inc.                                     AAPL
Google LLC                                               GOOGL                                      Google      
;
proc sql;
create table want as
select distinct a.*,b.name1 as new_name
from a a left join b b
on a.name=b.name1 or a.name=b.name2 or a.name=b.name3;
quit;
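
If the duplicates are exact repeated rows in File B, another option is to drop them before the join instead of relying on DISTINCT afterwards. This is only a minimal sketch, assuming the tables a and b created above; b_nodup is a name made up here for illustration.

```sas
/* Drop exact duplicate rows from B first (b_nodup is a hypothetical name) */
proc sort data=b out=b_nodup nodupkey;
  by name1 name2 name3;
run;

/* Same join as before, now against the de-duplicated copy of B */
proc sql;
  create table want as
  select a.*, b.name1 as new_name
  from a left join b_nodup as b
    on a.name = b.name1 or a.name = b.name2 or a.name = b.name3;
quit;
```

Note that this only removes rows of B that are identical in all three columns. If two different rows of B share the same name1 and both match, the output would still repeat, and DISTINCT in the SELECT remains the simpler fix.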

dapenDaniel · Posted 01-29-2019 01:43 AM

I mean duplicates like this:

File A

ID    Name
01    IBM
02    Apple

File B

Name1                              Name2         Name3
International Business Machines    IBM           IBM Corp.
Apple                              Apple Inc.    AAPL
AAPL                               Apple         Apple INC.
Google LLC                         GOOGL         Google

Expected File

ID    Name     NewName (equal to Name1)
01    IBM      International Business Machines
02    Apple    Apple
02    Apple    AAPL

I would like the expected file to list every possible Name1 in File B for each Name in File A.

Is that possible?

novinosrin · Posted 01-29-2019 01:55 AM

Hi @dapenDaniel, the existing code already does just that. Here is another test:


data a;
input id $ name $;
cards;
01        IBM
02        Apple
;

data b;
input (name1-name3) ( &:$50.);
cards;
International Business Machines                            IBM                                          IBM Corp.
Apple                                                       Apple Inc.                                     AAPL
AAPL                                                         Apple                                        Apple INC.
Google LLC                                               GOOGL                                      Google     
;

proc sql;
create table want as
select a.*,b.name1 as new_name
from a a left join b b
on a.name=b.name1 or a.name=b.name2 or a.name=b.name3;
quit;

proc print noobs;run;

Result

01    IBM      International Business Machines
02    Apple    Apple
02    Apple    AAPL

dapenDaniel · Posted 01-29-2019 02:15 AM

Thank you so much for your timely reply!!!

Sorry, I have another question. I found that File B also contains some oddly formatted data.

File A

ID    Name
01    IBM
02    Apple
03    bcd

File B

Name1                              Name2         Name3
International Business Machines    IBM           IBM Corp.
Apple                              Apple Inc.    AAPL
AAPL                               Apple         Apple Inc
Google LLC                         GOOGL         Google
abc                                abc | bcd     bcd | abc | aef

Expected File

ID    Name      NewName
01    IBM      International Business Machines
02    Apple    Apple
02    Apple    AAPL
03    bcd      abc

For Name3 in File B, several names are separated by "|". As long as the observation contains the name as one of its parts (bcd in Name3), I would like SAS to return the corresponding Name1. The match does not need to be exact.

Is that possible?

Thank you very much!

novinosrin · Posted 01-29-2019 02:28 AM

Switch to contains logic


data a;
input id $ name $;
cards;
01        IBM
02        Apple
03        bcd
;

data b;
input (name1-name3) ( &:$50.);
cards;
International Business Machines                            IBM                                          IBM Corp.
Apple                                                       Apple Inc.                                     AAPL
AAPL                                                        Apple                                          Apple Inc
Google LLC                                               GOOGL                                         Google 
abc                                                         abc | bcd                                      bcd | abc | aef
;
proc sql;
create table want as
select id , name, name1
from a
left join
b
on catx('|', name1, name2, name3) contains trim(name);
quit;
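
One caveat with CONTAINS: it is a substring test, so a short name can also match inside a longer, unrelated one (for instance, "abc" would match a hypothetical value "abcd"). If that matters, a possible alternative is to split the "|"-separated values into one row per alias and then join on equality. The sketch below assumes the tables a and b from this post; b_long, alias, nm, i and j are names invented here for illustration.

```sas
/* Reshape B to one row per (name1, alias), splitting each field on "|" */
data b_long;
  set b;
  length alias $50;
  array nm{3} name1-name3;
  do i = 1 to dim(nm);
    /* countw returns 0 for a blank field, so the inner loop is skipped */
    do j = 1 to countw(nm{i}, '|');
      alias = strip(scan(nm{i}, j, '|'));  /* strip blanks around each part */
      if alias ne '' then output;
    end;
  end;
  keep name1 alias;
run;

/* Exact match on the alias; DISTINCT collapses repeats of the same name1 */
proc sql;
  create table want as
  select distinct a.id, a.name, b.name1 as new_name
  from a left join b_long as b
    on a.name = b.alias
  order by a.id, new_name;
quit;
```

Unlike the CONTAINS version, this matches "Apple" only when a whole part equals "Apple", so a row that lists only "Apple Inc." would no longer match. Whether that is what you want depends on your data.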

SuryaKiran · Posted 01-28-2019 02:24 PM

You may need to add a CASE expression to @novinosrin's solution to use the name from table A when no match is found in table B.

proc sql;
create table want as
select a.*,
		case when b.name1 is null then a.name 
			else b.name1 end as new_name
from a a 
left join b b
on a.name=b.name1 or a.name=b.name2 or a.name=b.name3;
quit;

Thanks,
Suryakiran

novinosrin · Posted 01-28-2019 02:37 PM

@SuryaKiran Good forward thinking. Personally, though, I prefer to take advantage of COALESCE (ANSI SQL) or the SAS character-specific function COALESCEC. It is simpler and easier to read.

select distinct a.*,coalescec(b.name1,a.name) as new_name
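
For completeness, here is how that SELECT line might sit in the full query. This is a sketch that reuses the tables a and b and the join condition from the earlier posts.

```sas
proc sql;
  create table want as
  select distinct a.*,
         /* first non-missing value: name1 if a match was found, else A's own name */
         coalescec(b.name1, a.name) as new_name
  from a left join b
    on a.name = b.name1 or a.name = b.name2 or a.name = b.name3;
quit;
```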

kiranv_ · Posted 01-28-2019 01:11 PM

one more way

proc sql;
select id , name, name1
from a
left join
b
on index(catx(',', name1, name2, name3), trim(name)) gt 0;
quit;

novinosrin · Posted 01-28-2019 01:25 PM

CONTAINS will work too, I think.

proc sql;

select id , name, name1
from a
left join
b
on catx(',', name1, name2, name3) contains trim(name);
quit;

kiranv_ · Posted 01-28-2019 01:32 PM

Both of them should work.
