jimmychoi
Obsidian | Level 7

Hi all,

 

I'm trying to match two different sets of firm names using COMPGED (SPEDIS or SOUNDEX could be used as alternative methods)

But before that, I am thinking of making the firm names as similar as possible by removing the abbreviations at the end

(e.g: CO LTD, PTE LTD, Limited, INC, Incorporated, AG, SpA, Corp)

 

The simplest way would be to use the TRANWRD function, but I'm afraid this would replace not only the abbreviations but also letters that are part of the firm names (say, if I were trying to remove 'Corp' at the end of firm names, TRANWRD would turn 'Corpastta SpA' into 'astta SpA').
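
To illustrate the concern, here is a minimal sketch (the variable names name and stripped are only for illustration). TRANWRD replaces every occurrence of the substring, so the 'Corp' inside 'Corpastta' is hit as well:

```sas
data _null_;
  length name stripped $50;
  name = 'Corpastta SpA';
  /* TRANWRD works on substrings, not whole words, so the    */
  /* leading 'Corp' of 'Corpastta' is replaced by a blank.   */
  stripped = tranwrd(name, 'Corp', ' ');
  put name= stripped=;
run;
```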

 

So, what is the best way to do this, and has anyone done similar work?

Maybe I should use a regular expression?

 

1 ACCEPTED SOLUTION

Accepted Solutions
SuryaKiran
Meteorite | Level 14

Hello,

 

You can use a Perl regular expression for pattern matching.

 

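data have;
  infile datalines truncover;
  input word $50.;
  datalines;
Corpastta AB Crop
Corpastta Crop AB
AB Corpastta Crop
AB Corpastta
Crop AB Corpastta
ABCrop Corpastta
;
run;

data want;
  set have;
  /* Find "Crop" only as a whole word: in the middle, at the end, or at the start. */
  position = prxmatch('m/ Crop | Crop\s*$|^Crop /io', word);
  /* Text before the match. SUBSTRN returns an empty string when the length is 0 or negative. */
  new_word1 = ifc(position ^= 0, substrn(word, 1, position - 1), word);
  /* Text after the match. CATX below strips any blank left at the boundary. */
  new_word2 = ifc(position ^= 0, substrn(word, position + 5), '');
  required_word = catx(' ', new_word1, new_word2);
run;

You need to include the blanks around the string that you're looking for, so that it only matches as a whole word.

'm/ Crop | Crop\s*$|^Crop /io'

  ' Crop '     : blank before and after (the word is in the middle).
  ' Crop\s*$'  : blank before, then only trailing blanks up to the end of the value.
  '^Crop '     : ^ (caret) anchors the start of the value, followed by a blank.

The i option makes the match case-insensitive, and o compiles the pattern only once.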
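
The same whole-word idea can be sketched with PRXCHANGE to strip several suffixes at the end of a name in one pass. This is only a sketch under assumptions: the data set names (names, cleaned), the variable names (firm, firm_clean), and the sample firm names are made up for illustration, and the suffix list is the one given in the question. The leading \s+ requires a blank before the suffix, so a name such as 'Corpastta' is left alone.

```sas
data names;
  infile datalines truncover;
  input firm $50.;
  datalines;
Corpastta SpA
Acme Corp
Widget Co Ltd
Foo Pte Ltd
Bar Incorporated
;
run;

data cleaned;
  set names;
  length firm_clean $50;
  /* Remove one legal-form suffix at the end of the value.        */
  /* \s+  : at least one blank before the suffix (whole word)     */
  /* \.?  : an optional trailing period, e.g. "Inc."              */
  /* \s*$ : allow the trailing blanks that pad a character value  */
  firm_clean = prxchange(
    's/\s+(co\s+ltd|pte\s+ltd|limited|ltd|incorporated|inc|corp|spa|ag)\.?\s*$//io',
    1, firm);
run;
```

Longer alternatives such as co\s+ltd are listed before ltd so that the two-word forms are removed as a unit.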

Thanks,
Suryakiran


3 REPLIES 3
andreas_lds
Jade | Level 19

Please post example data in a usable form. See https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... for details on how to create usable data.

RW9
Diamond | Level 26

If it has delimiters, then use them, e.g.:

data want;
  length want $200;
  test="Something co";
  /* Rebuild the name word by word, skipping the unwanted word (any case). */
  do i=1 to countw(test," ");
    if lowcase(scan(test,i," ")) ne "co" then want=catx(" ",want,scan(test,i," "));
  end;
  drop i;
run;

Of course, that only shows a single removal with space delimiters, but you get the idea; without test data in the form of a data step, I can't take it any further.
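
As a rough sketch of how the word-by-word idea might extend to several suffixes, the step below peels words off the end of the name while the last word is in a list. The data set names (names, cleaned), the variable names (firm, firm_clean, tail), and the sample values are assumptions for illustration; the suffix list comes from the question.

```sas
data names;
  infile datalines truncover;
  input firm $50.;
  datalines;
Corpastta SpA
Acme Corp
Widget Co Ltd
Foo Pte Ltd
;
run;

data cleaned;
  set names;
  length firm_clean $50 tail $20;
  firm_clean = firm;
  /* Keep at least one word, and stop at the first word that is not a suffix. */
  do while (countw(firm_clean, ' ') > 1);
    tail = upcase(scan(firm_clean, -1, ' '));   /* last word of the name */
    if tail in ('CO', 'LTD', 'PTE', 'LIMITED', 'INC', 'INCORPORATED',
                'AG', 'SPA', 'CORP') then
      /* Cut the last word off; LENGTH ignores the trailing blanks. */
      firm_clean = substr(firm_clean, 1, length(firm_clean) - length(tail));
    else leave;
  end;
  drop tail;
run;
```

Because only the last word is ever tested, a suffix-like string at the start or in the middle of a name is not touched.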

