BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Buzzy_Bee
Quartz | Level 8

Hello,

I can't find any examples that help me solve my prxchange problem where I want to remove a string up to including the search term. For example, I've got "Good Investments Trust 12 Moon St London" but I want it to search on the word "Trust" and remove everything up to and including Trust, so that I just have "12 Moon St London" remaining. My attempt below returns "Trust 12 Moon St London" so I've only been successful at removing the words before the word Trust, but it still returns the search term itself. Note that the trust names vary in length, so it might be named "Bad Trust" or "A trust with a very long name Trust." Thanks for your help.

 

 

data address;
	infile datalines dlm=',' dsd truncover;
	input Mail_addr1 :$30. City :$10. ;
	datalines;
Good Investments Trust 12 Moon St, London
Smith Family Trust 3 Hotel St, Paris
;

data prx;
	set address;
	Address=cats(Mail_addr1, City);
	corrected_address =prxchange("s/.*?(Trust.*)/$1/",1, Address);
drop Mail_addr1 City;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

You only need to match the substring that you want to change:

 

corrected_address = prxchange("s/.*\bTrust\s*//i", 1, Address);

 

Note The "i" suffix requests case insensitive match.

PG

View solution in original post

15 REPLIES 15
Shmuel
Garnet | Level 18

Why not use next code:

corrected_address = substr(address,find(address,"trust","i")+5);
Shmuel
Garnet | Level 18

You are right, @Ksharp , I have edited my proposed code.

andreas_lds
Jade | Level 19

@Buzzy_Bee: is it now and for all times secured that the word "trust" appears only once per observation? What should happen, if there is

Trustowsky Trust 42 Trust Street, Little Rock

in your data?

Buzzy_Bee
Quartz | Level 8

Good point Andreas. I should have mentioned that trusts tend to choose sensible names where the term 'trust' only appears once in the name, such as "Brown Family Trust" rather than "A trust with a long name Trust."

However, there are some cases where an address is entered as "C/o The Trustees, Brown Family Trust" so it is definitely possible to see the word trust appear twice in the address.

hhinohar
Quartz | Level 8

Can you clarify if "Trust" and "trust" is a different word?

e.g.

Case:If "trust" is a word that has to be replaced. 

Before:                                                                  After

Good Investments trust 12 Moon St London ---> 12 Moon St London

 

Case:If "trust" is not

Before:                                                                  After

Good Investments trust 12 Moon St London ---> (nothing appears)

 

Shmuel
Garnet | Level 18

If you insist using prx functions, for your specific example, check next code:

data _null_;
  adrs =  "Good Investments Trust 12 Moon St London";
  ad1  = substr(adrs,findw(adrs,"trust","i")+5);
  
  ad2  = prxchange("s/.*?(Trust.*)/$1/",1, adrs);
  prx_id= prxparse("/.*?(Trust.*)/");
  ad3  = prxchange("s/Trust/ /",1,ad2);
  put ad1 = / ad2= / ad3=;
run;

and be aware of all previous given remarks.

Shmuel
Garnet | Level 18

Next code is not complete but is worth to run and check:

data test;
     input adrs $50.;
cards;
Good Investments Trust 12 Moon St, London
Smith Family Trust 3 Hotel St, Paris
Trustowsky Trust 42 Trust Street, Little Rock
Good Investments trust 12 Moon St London
;
run;

data check;
    obs = _N_;
 set test;
  ad2  = prxchange("s/.*([T]|[t]rust.){1}/$1/",1,adrs);
  prx_id= prxparse("/.*(.{5})/");
  ad3  = prxchange("s/.{5}/ /",1,ad2);
  ad4  = prxchange("s/(.*)(?=\bTrust)//i",1, adrs);   
  drop prx_id;
run;
Ksharp
Super User
data address;
	infile datalines dlm=',' dsd truncover;
	input Mail_addr1 :$30. City :$10. ;
	datalines;
Good Investments Trust 12 Moon St, London
Smith Family Trust 3 Hotel St, Paris
;

data prx;
	set address;
	Address=cats(Mail_addr1, City);
	corrected_address =prxchange("s/(.*)(?=\bTrust)//i",1, Address);
drop Mail_addr1 City;
run;
Buzzy_Bee
Quartz | Level 8

Hi KSharp, your version produces the same problems as mine when I run it.

The result is: "Trust 12 Moon St London" so it still has Trust at the front.

PGStats
Opal | Level 21

You only need to match the substring that you want to change:

 

corrected_address = prxchange("s/.*\bTrust\s*//i", 1, Address);

 

Note The "i" suffix requests case insensitive match.

PG
Buzzy_Bee
Quartz | Level 8

Thank you - that is the right prxchange formula that I was trying to create.

I've got thousands of companies with names like "Very Amazing Investments Trust" and "Serious Lawyers Limited" so this prxchange is exactly what I needed as I can just change the term 'Trust' to 'Limited' and then change it to any of the other company suffixes that appear in the list. Your method means I only need to change the search term once in the formula.

whymath
Lapis Lazuli | Level 10

Hi, @PGStats , may I ask how this "s/.*\bTrust\s*//i" works? Is it just like "s/.*\bTrust\s(.*?)//i" ?

PGStats
Opal | Level 21

Sure, here is how it reads:

 

s/

.* Match anything, any length

\bTrust Match a word starting with Trust

\s* Match any number of spaces, including none

// Replace the whole matched substring (if any) with nothing

Make the matching case-insensitive

PG

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 15 replies
  • 2910 views
  • 7 likes
  • 7 in conversation