BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ronein
Meteorite | Level 14

Hello

I have a char column (string).

I want to search for a word within the string.

If the word exists in the field then I want to return all word(s) that before the searched word.

For example:

Search word "bank"

If the string value is "HSBC business bank UK"   then will return "HSBC business"

If the string value is "Poland Bank Millenium"  then will return "Poland"

IF the string value is "Shopping centre Mokotow" then will return null value 

Please note that it is better that the search will not be sensitive to capital/non capital letters.

 

Data RawTbl;
input string $50.
HSBC buisness bank UK
Poland Bank Millenium
Shopping centre Mokotow
;
Run;

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
kelxxx
Quartz | Level 8

Hi,

data have;
set rawtbl;
length x $45;
if index(upcase(string),"BANK")-2 gt 0 then 
x=substr(string,1,index(upcase(string),"BANK")-2);
run; 

You can try it.

Have a nice day

View solution in original post

10 REPLIES 10
ed_sas_member
Meteorite | Level 14

Hi @Ronein 

Here is an approach to achieve this, using pearl regular expressions:

data want;
	length result $ 50;
	set rawtbl;
	result = prxchange('s/^(.*)\bbank\b.*$/$1/i',1,string);
run;

Capture d’écran 2020-04-16 à 09.22.07.png

Best,

Ronein
Meteorite | Level 14

Thank you,

However for string "Shopping centre Mokotow" I expect to get null value because word "Bank" is not found in this string.

Can you also explain the code?

Why do you write "bbank"  and not "bank"?

What is the meaning of number "1"  in the code?

kelxxx
Quartz | Level 8

It s not bbank , it is \b bank \b >>> \b … \b = boundary of word. the work here is "bank"

you can view more detail of syntax in the link regexp-tip-sheet 

kelxxx
Quartz | Level 8

Hi,

data have;
set rawtbl;
length x $45;
if index(upcase(string),"BANK")-2 gt 0 then 
x=substr(string,1,index(upcase(string),"BANK")-2);
run; 

You can try it.

Have a nice day

Ronein
Meteorite | Level 14

Thank you,

Can you please explain why you wrote number "2"  in your code?

May you please explain the code please?

kelxxx
Quartz | Level 8

because "bank" is in middle and before bank is a blank " "

1---5----10
abc bank

index() function Searches a character expression for a string of characters, and returns the position of the string's first character for the first occurrence of the string.

index() return the position of "b" of first occurrence of "bank" >>> 5

but length of 'abc' is 3 >>> 5-2

sorry, i m not good in English to explain

andreas_lds
Jade | Level 19

@Ronein: what would you expect if string is "Commerzbank"?

@kelxxx : i would replace index with findw and use "> 1" instead of "> 0" - if the string starts with bank there is nothing on the left to be copied.

Ronein
Meteorite | Level 14

IF string is Commezbank then I expect to get Commerz in reult.

Here is the code that is working fine.

My question: Is it possible to use SCAN function in order to do the task?


Data wanted;
set RawTbl;
Len=LENGTH(String);
Loc=INDEX(upcase(string),"BANK");
IF Lox>=2   then x=substr(String,1,Loc-1);
else x=' ';
Run;

 

kelxxx
Quartz | Level 8

I think that It s impossible to use SCAN here

We cannot use a string like a delimiter with SCAN()

and if we define "bank" like a delimiter, the case of ""Shopping centre Mokotow" " is not resolved.

Data RawTbl;
infile cards dlmstr="bank" dlmsopt='i';
input string1 :$50. ;
cards;
HSBC buisness bank UK
Poland Bank Millenium
Shopping centre Mokotow
CommerzBank abc
;
Run;
kelxxx
Quartz | Level 8

sorry to revive the subject

I found how to use SCAN() here

 

Data RawTbl;
input string $50. ;
cards;
HSBC buisness bank UK
Poland Bank Millenium
Shopping centre Mokotow
CommerzBank abc
;
Run;

data have;
set rawtbl;
string1=prxchange('s/bank/#/i',1,string);
if index(string1,"#") then result=scan(string1,1,"#");
run;

1. change the word "bank" to a specific character that cannot be exited in string

explain of prxchange('s/bank/#/i',1,string)

s= to remplace

/= separate the argument

bank = key word  "bank"

# = character to change

i = case insensitive

1 = replace the first occurrence of "bank"  (-1 if want to replace all "bank" in string)

string= source of text;

2. use SCAN() with delimiter = the character specified and IF for control then case of "Shopping centre Mokotow"

 

Have a good day.

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 667 views
  • 7 likes
  • 4 in conversation