Hello
I have a char column (string).
I want to search for a word within the string.
If the word exists in the field then I want to return all word(s) that before the searched word.
For example:
Search word "bank"
If the string value is "HSBC business bank UK" then will return "HSBC business"
If the string value is "Poland Bank Millenium" then will return "Poland"
IF the string value is "Shopping centre Mokotow" then will return null value
Please note that it is better that the search will not be sensitive to capital/non capital letters.
Data RawTbl;
input string $50.
HSBC buisness bank UK
Poland Bank Millenium
Shopping centre Mokotow
;
Run;
Hi,
data have; set rawtbl; length x $45; if index(upcase(string),"BANK")-2 gt 0 then x=substr(string,1,index(upcase(string),"BANK")-2); run;
You can try it.
Have a nice day
Hi @Ronein
Here is an approach to achieve this, using pearl regular expressions:
data want;
length result $ 50;
set rawtbl;
result = prxchange('s/^(.*)\bbank\b.*$/$1/i',1,string);
run;
Best,
Thank you,
However for string "Shopping centre Mokotow" I expect to get null value because word "Bank" is not found in this string.
Can you also explain the code?
Why do you write "bbank" and not "bank"?
What is the meaning of number "1" in the code?
It s not bbank , it is \b bank \b >>> \b … \b = boundary of word. the work here is "bank"
you can view more detail of syntax in the link regexp-tip-sheet
Hi,
data have; set rawtbl; length x $45; if index(upcase(string),"BANK")-2 gt 0 then x=substr(string,1,index(upcase(string),"BANK")-2); run;
You can try it.
Have a nice day
Thank you,
Can you please explain why you wrote number "2" in your code?
May you please explain the code please?
because "bank" is in middle and before bank is a blank " "
1---5----10 abc bank
index() function Searches a character expression for a string of characters, and returns the position of the string's first character for the first occurrence of the string.
index() return the position of "b" of first occurrence of "bank" >>> 5
but length of 'abc' is 3 >>> 5-2
sorry, i m not good in English to explain
IF string is Commezbank then I expect to get Commerz in reult.
Here is the code that is working fine.
My question: Is it possible to use SCAN function in order to do the task?
Data wanted;
set RawTbl;
Len=LENGTH(String);
Loc=INDEX(upcase(string),"BANK");
IF Lox>=2 then x=substr(String,1,Loc-1);
else x=' ';
Run;
I think that It s impossible to use SCAN here
We cannot use a string like a delimiter with SCAN()
and if we define "bank" like a delimiter, the case of ""Shopping centre Mokotow" " is not resolved.
Data RawTbl;
infile cards dlmstr="bank" dlmsopt='i';
input string1 :$50. ;
cards;
HSBC buisness bank UK
Poland Bank Millenium
Shopping centre Mokotow
CommerzBank abc
;
Run;
sorry to revive the subject
I found how to use SCAN() here
Data RawTbl; input string $50. ; cards; HSBC buisness bank UK Poland Bank Millenium Shopping centre Mokotow CommerzBank abc ; Run; data have; set rawtbl; string1=prxchange('s/bank/#/i',1,string); if index(string1,"#") then result=scan(string1,1,"#"); run;
1. change the word "bank" to a specific character that cannot be exited in string
explain of prxchange('s/bank/#/i',1,string)
s= to remplace
/= separate the argument
bank = key word "bank"
# = character to change
i = case insensitive
1 = replace the first occurrence of "bank" (-1 if want to replace all "bank" in string)
string= source of text;
2. use SCAN() with delimiter = the character specified and IF for control then case of "Shopping centre Mokotow"
Have a good day.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.