How to search for text within a string with leading characters

Occasional Contributor
Posts: 14

Hello,

I'm having trouble with the following. Can anyone help?

The dataset has a Keyword variable in which the word sale or sales appears at the beginning, middle, or end of a string that is bracketed [], quoted "", or neither. Each observation also has an Impressions variable and a Clicks variable. I need to flag whether sale or sales (treated separately) is at the beginning, middle, or end of the string, and keep the impressions and clicks so they can be added up for each appearance of 'sale' or 'sales'. The goal is a summary table of the number of impressions and clicks for sale at the beginning, middle, and end of the string, and then the same for sales.

Ex:

Keyword                   Impressions   Clicks
[sale on gas grills]               10        2
"Gas grill sales"                   0        0
green egg sale grills             420      336

I wrote code last night that worked, but it's highly specific to this dataset.  Is there a better way?  Thanks very much for your help.


Accepted Solutions
Solution
09-08-2013 06:24 PM
Respected Advisor
Posts: 4,930

Re: How to search for text within a string with leading characters

You could use pattern matching like this:

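data have;
  /* & lets Keyword contain single blanks; two or more blanks end the value */
  input Keyword :&$80. Impressions Clicks;
  datalines;
[sale on gas grills]        10        2
"Gas grill sales"      0         0
green egg sale grills      420          336
;

data want;
  set have;
  /* sale/sales as the first word, after an optional [ or " */
  beg = prxmatch('/^[\["]?\s*sale(s)?\b/io', Keyword)>0;
  /* sale/sales with at least one word character somewhere before and after it */
  mid = prxmatch('/\w+.*\bsale(s)?\b.*\w+/io',Keyword)>0;
  /* sale/sales as the last word, before an optional " or ] */
  end = prxmatch('/\w+.*\bsale(s)?\s*["\]]?$/io',trim(Keyword))>0;
run;

proc print data=want noobs; run;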

Whatever solution you choose, it should be tested with more examples.

PG
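
Following the advice to test with more examples, here is a small sketch that runs the same three patterns against a few extra keywords (the mid pattern is written with a single \b, which is equivalent to the doubled one). The test rows and the dataset names test and check are made up for illustration; they are not from the original data.

```sas
/* Hypothetical test rows, one keyword per line */
data test;
  input Keyword $40.;
  datalines;
sale
wholesale grills
salesperson wanted
[big SALE today]
"grills on sale"
;

data check;
  set test;
  /* same patterns as in the accepted solution */
  beg = prxmatch('/^[\["]?\s*sale(s)?\b/io', Keyword) > 0;
  mid = prxmatch('/\w+.*\bsale(s)?\b.*\w+/io', Keyword) > 0;
  end = prxmatch('/\w+.*\bsale(s)?\s*["\]]?$/io', trim(Keyword)) > 0;
run;

proc print data=check noobs; run;
```

With these rows I would expect the single word sale to set only beg, [big SALE today] to set only mid, "grills on sale" to set only end, and the wholesale and salesperson rows to set none of the flags, since \b rejects sale embedded in a longer word. Note that a keyword consisting of the single word sale counts as a beginning match only, which may or may not be what you want.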



All Replies
PROC Star
Posts: 7,487

Re: How to search for text within a string with leading characters

Do you mind if it also selects the string sale if that is part of another word (e.g., saleroom, salesperson or wholesale)?
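
To make the difference concrete, here is a minimal sketch contrasting a plain substring search (FIND) with a whole-word search (FINDW). The three test strings and the variable names substring_hit and word_hit are only for illustration.

```sas
data _null_;
  length Keyword $40;
  do Keyword = 'wholesale garbage', 'salesperson wanted', 'green egg sale grills';
    /* FIND matches sale anywhere, including inside longer words */
    substring_hit = find(Keyword, 'sale', 'i') > 0;
    /* FINDW only matches sale as a blank-delimited word */
    word_hit = findw(Keyword, 'sale', ' ', 'i') > 0;
    put Keyword= substring_hit= word_hit=;
  end;
run;
```

All three strings are substring hits, but only the last one is a whole-word hit. FINDW as written here uses blanks as the only delimiter, so brackets and quotes would have to be removed first (for example with COMPRESS) before it could recognize [sale or sales".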

PROC Star
Posts: 7,487

Re: How to search for text within a string with leading characters

Here is one way:

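data have;
  informat Keyword $80.;
  input Keyword & Impressions Clicks;
  cards;
[sale on gas grills]     10    2
"Gas grill sales"         0    0
green egg sale grills   420  336
wholesale garbage        23   32
;

data want;
  set have;
  /* _n_ is reused here as a word counter, starting at the first word */
  _n_=1;
  /* walk the words of Keyword with quotes and brackets removed */
  do until (scan(compress(keyword,'"[]'),_n_) eq "");
    /* whole-word comparison, so wholesale is not flagged */
    if scan(compress(upcase(keyword),'"[]'),_n_) in ('SALE','SALES') then contains=1;
    _n_+1;
  end;
run;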
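
Neither reply goes as far as the summary table the original question asks for. Building on the word-scanning idea, here is one possible sketch that records the term (sale or sales) and its position for every whole-word occurrence, then totals impressions and clicks. The dataset names flagged and summary, the variables term and position, and the rule that a one-word keyword counts as "beginning" are my assumptions, not something stated in the thread.

```sas
data have;
  input Keyword :&$80. Impressions Clicks;
  datalines;
[sale on gas grills]        10        2
"Gas grill sales"      0         0
green egg sale grills      420          336
;

/* One output row per whole-word occurrence of sale or sales */
data flagged;
  set have;
  length term $5 position $9;
  /* drop brackets and quotes, then collapse repeated blanks */
  clean = compbl(strip(compress(Keyword, '[]"')));
  nwords = countw(clean, ' ');
  do i = 1 to nwords;
    word = lowcase(scan(clean, i, ' '));
    if word in ('sale', 'sales') then do;
      term = word;
      if i = 1 then position = 'beginning';      /* assumption: a lone word counts as beginning */
      else if i = nwords then position = 'end';
      else position = 'middle';
      output;
    end;
  end;
  keep Keyword Impressions Clicks term position;
run;

/* Total impressions and clicks for each term and position */
proc sql;
  create table summary as
  select term, position,
         sum(Impressions) as Total_Impressions,
         sum(Clicks) as Total_Clicks
  from flagged
  group by term, position;
quit;

proc print data=summary noobs; run;
```

For the three sample rows this should give sale/beginning (10, 2), sale/middle (420, 336) and sales/end (0, 0). Because a row is written per occurrence, a keyword containing sale twice would contribute its impressions and clicks twice; whether that is the intended counting is worth checking against the real data. Only blanks are treated as word separators here, so a keyword such as "sale," with trailing punctuation would need extra characters added to the COMPRESS list.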
