Text recognition

Occasional Contributor
Posts: 9

Hi all,

I have a character variable that consists of several words, like "ING Funds Trust: Ing Arbondel International bond Fund; Class A Shares". I want a command that searches these values and signals whether they contain specific words. For example, I want to search for the word "bond". Notice that the word Arbondel contains bond, but I am not interested in that. I only want SAS to flag the standalone word "bond". I have experimented with various functions but cannot get exactly what I want. Any help would be much appreciated. Many thanks. Costas


Accepted Solutions
Solution
‎05-29-2014 03:08 PM
Trusted Advisor
Posts: 1,228

Re: Text recognition

My try

data want;

word="ING Funds Trust: Ing Arbondel International bond Fund; Class A Shares";

a=substr(word,findw(word,'bond'),length('bond'));

run;
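
A note on the step above: FINDW returns 0 when the word is not found, and SUBSTR complains about a start position of 0. A minimal sketch of a guarded version follows; the names pos and bond_flag and the variable lengths are assumptions for illustration, not part of the original answer.

```sas
data want;
  length word $100 a $4;
  word = "ING Funds Trust: Ing Arbondel International bond Fund; Class A Shares";

  /* Character position of the standalone word, or 0 if it is absent */
  pos = findw(word, 'bond');

  /* Extract only when the word was actually found */
  if pos > 0 then a = substr(word, pos, length('bond'));

  /* 1 if "bond" occurs as a whole word, 0 otherwise */
  bond_flag = (pos > 0);
run;
```

Keeping the position in its own variable makes it easy to build a 0/1 flag, which is what the original question asks for.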


All Replies
Super User
Posts: 5,503

Re: Text recognition

It sounds like you are trying the INDEX function.  There are some newer functions that may do exactly what you want, such as INDEXW or FINDW.  Or, you can do it the old-fashioned way (adding a leading and trailing blank):

if index(' ' || string || ' ', ' bond ') then do;

Good luck.
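
To make the difference between these functions concrete, here is a small sketch that runs all of them on the string from the question. The data set name compare and the pos_ variable names are made up for illustration.

```sas
data compare;
  length string $80;
  string = "ING Funds Trust: Ing Arbondel International bond Fund; Class A Shares";

  /* INDEX matches any substring, so it stops at the "bond" inside "Arbondel" */
  pos_index  = index(string, 'bond');

  /* INDEXW matches whole words only; blank is its default delimiter */
  pos_indexw = indexw(string, 'bond');

  /* FINDW matches whole words only, using its default delimiter set */
  pos_findw  = findw(string, 'bond');

  /* Old-fashioned way: pad both sides with a blank and search for ' bond ' */
  pos_padded = index(' ' || string || ' ', ' bond ');
run;

proc print data=compare;
run;
```

In the printed output, pos_index should point into "Arbondel", while the other three should point at the standalone word later in the string.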

Occasional Contributor
Posts: 9

Re: Text recognition

Thanks, this works!

Occasional Contributor
Posts: 9

Re: Text recognition

Hi

One more question

Suppose I am looking for the word growth, and my string variable is called fund_name.

I am using this below:

word=UPCASE(fund_name);

Growth1=substr(word,findw(word,'GROWTH'),length('GROWTH'));

When word equals "AMERICAN PENSION INVESTORS TRUST: GROWTH FUND" the code works fine.

But when it equals "API TRUST:GROWTH FUND" it does not.

I think this is because there is no gap between : and GROWTH.

Is there a solution to this? Note that I cannot hard-code the surrounding characters (e.g., TRUST) because they differ from row to row.

The quest continues....

thanks a lot to all for the help

Best,

Costas

Trusted Advisor
Posts: 1,228

Re: Text recognition

Hi,

FINDW's default delimiters include the blank but not the colon. If you have : in your text, you need to tell the FINDW function to treat it as a delimiter by passing it in the third argument. Try the code given below. Hope this will solve the problem.

Thanks,

Naeem

data have;

word1="AMERICAN PENSION INVESTORS TRUST: GROWTH FUND";

word2="API TRUST:GROWTH FUND";

Growth1=substr(word1,findw(word1,'GROWTH'),length('GROWTH'));

Growth2=substr(word2,findw(word2,'GROWTH',': '),length('GROWTH'));

run;

proc print data=have;

run;

Occasional Contributor
Posts: 9

Re: Text recognition

Thanks this works!

Super User
Posts: 11,343

Re: Text recognition

Or

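BondFlag = (findw(string,'bond',' ,:;','IE') >0);
should return 1 if bond occurs as a single word in the string, 0 otherwise. Note that FINDW takes the string to search first and the word second. The I modifier ignores case; E makes FINDW return the word number rather than the character position, which does not affect the > 0 test.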
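
As a rough sketch of how this flag approach could be applied across rows, the step below reads the three fund names quoted in this thread and flags the word growth. The data set name funds and the variable growth_flag are assumptions for illustration; datalines4 is used because one of the names contains a semicolon.

```sas
data funds;
  infile datalines truncover;
  input fund_name $char80.;

  /* 'i' ignores case; blank, comma, colon and semicolon act as delimiters. */
  /* FINDW returns 0 when the word is absent, so the comparison yields 0/1. */
  growth_flag = (findw(fund_name, 'growth', ' ,:;', 'i') > 0);

  datalines4;
AMERICAN PENSION INVESTORS TRUST: GROWTH FUND
API TRUST:GROWTH FUND
ING Funds Trust: Ing Arbondel International bond Fund; Class A Shares
;;;;
run;

proc print data=funds;
run;
```

The first two rows should be flagged 1 and the third 0. Because no SUBSTR call is involved, rows that do not contain the word cause no invalid-argument notes.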
