CharlesC
Calcite | Level 5

Hello,

I'm having trouble with the following. Can anyone help?

The dataset has a keyword string that contains 'sale' or 'sales' at the beginning, middle, or end. The string may be bracketed [], quoted "", or neither. Each observation also has an impressions variable and a clicks variable. I need to flag whether 'sale' or 'sales' (treated separately) appears at the beginning, middle, or end of the string, and keep the impressions and clicks so they can be added up for each occurrence. The goal is a summary table of total impressions and total clicks for 'sale' at the beginning, middle, and end of the string, and the same for 'sales'.

Ex:

Keyword                  Impressions  Clicks
[sale on gas grills]     10           2
"Gas grill sales"        0            0
green egg sale grills    420          336

I wrote code last night that worked, but it's highly specific to this dataset.  Is there a better way?  Thanks very much for your help.

1 ACCEPTED SOLUTION

PGStats
Opal | Level 21

You could use pattern matching like this:

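data have;
input Keyword :&$80. Impressions Clicks;
datalines;
[sale on gas grills]        10        2
"Gas grill sales"      0         0
green egg sale grills      420          336
;

data want;
set have;
/* beg: optional opening [ or ", then sale/sales as the first word */
beg = prxmatch('/^[\["]?\s*sale(s)?\b/io', Keyword)>0;
/* mid: sale/sales as a whole word with word characters both before and after it */
mid = prxmatch('/\w+.*\bsale(s)?\b.*\w+/io',Keyword)>0;
/* end: sale/sales as the last word, before an optional closing " or ] */
end = prxmatch('/\w+.*\bsale(s)?\s*["\]]?$/io',trim(Keyword))>0;
run;

proc print data=want noobs; run;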

Whatever solution you choose, it should be tested with more examples.
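As a starting point for that testing, here is a small sketch that runs the same three patterns over a few edge cases: a keyword that is only one word, a bracketed single word, a keyword where the word appears twice, and a quoted keyword with the word in the middle. The dataset name more_tests and the test strings are made up for illustration; they are not from the original data.

```sas
/* Hypothetical edge cases for the beg/mid/end patterns above */
data more_tests;
  infile datalines truncover;
  input Keyword $80.;   /* read the whole line, embedded blanks included */
  beg = prxmatch('/^[\["]?\s*sale(s)?\b/io', Keyword) > 0;
  mid = prxmatch('/\w+.*\bsale(s)?\b.*\w+/io', Keyword) > 0;
  end = prxmatch('/\w+.*\bsale(s)?\s*["\]]?$/io', trim(Keyword)) > 0;
  datalines;
sale
[sales]
sale of the sale
"big sales event"
;

proc print data=more_tests noobs; run;
```

Inspecting the printed flags for cases like these should show whether a keyword can be flagged in more than one position, or in none, before the flags are used for any totals.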

PG


3 REPLIES 3
art297
Opal | Level 21

Do you mind if it also selects the string sale if that is part of another word (e.g., saleroom, salesperson or wholesale)?
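To see what is at stake in that question, here is a minimal sketch comparing a whole-word match (PRXMATCH with \b word boundaries) against a plain substring search (FIND with the 'i' modifier). The dataset name boundary_check, the variable names whole_word and substring, and the test strings are made up for illustration.

```sas
/* Hypothetical comparison: whole-word match vs. substring match */
data boundary_check;
  infile datalines truncover;
  input Keyword $40.;
  /* \b requires a word boundary on both sides of sale or sales */
  whole_word = prxmatch('/\bsales?\b/i', Keyword) > 0;
  /* FIND looks for the letters anywhere, ignoring case */
  substring  = find(Keyword, 'sale', 'i') > 0;
  datalines;
wholesale garbage
saleroom chairs
salesperson jobs
gas grill sale
;

proc print data=boundary_check noobs; run;
```

The substring search flags all four rows, while the word-boundary pattern should flag only the last one, so the choice depends on whether words such as wholesale are meant to count.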

art297
Opal | Level 21

Here is one way:

data have;
  informat Keyword $80.;
  input Keyword & Impressions Clicks;
  cards;
[sale on gas grills]     10    2
"Gas grill sales"         0    0
green egg sale grills   420  336
wholesale garbage        23   32
;

data want;
  set have;
  contains=0;  /* default to 0 so non-matches are not left missing */
  _n_=1;       /* reuse _n_ as the word counter */
  /* walk the words of the keyword, ignoring quotes and brackets */
  do until (scan(compress(keyword,'"[]'),_n_) eq "");
    if scan(compress(upcase(keyword),'"[]'),_n_) in ('SALE','SALES') then contains=1;
    _n_+1;
  end;
run;
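For the summary table the original question asks for, one possible sketch extends the word-by-word scan idea: record which word matched and where it sat, write one row per occurrence, then total impressions and clicks by word and position. This is not from either poster. The dataset names flagged and summary and the variables clean, w, n_words, i, sale_word and placement are made up for illustration, and the sketch assumes words are separated by blanks and that quotes and brackets can simply be stripped.

```sas
data have;
  informat Keyword $80.;
  input Keyword & Impressions Clicks;
  datalines;
[sale on gas grills]     10    2
"Gas grill sales"         0    0
green egg sale grills   420  336
;

/* One output row per occurrence of SALE or SALES (hypothetical names) */
data flagged;
  set have;
  length clean w $80 sale_word $5 placement $6;
  clean   = upcase(compress(Keyword, '"[]'));  /* drop quotes and brackets */
  n_words = countw(clean, ' ');
  do i = 1 to n_words;
    w = scan(clean, i, ' ');
    if w in ('SALE', 'SALES') then do;
      sale_word = w;
      if i = 1 then placement = 'begin';        /* a one-word keyword counts as begin */
      else if i = n_words then placement = 'end';
      else placement = 'middle';
      output;
    end;
  end;
  keep Keyword Impressions Clicks sale_word placement;
run;

/* Totals by word and position */
proc sql;
  create table summary as
  select sale_word,
         placement,
         sum(Impressions) as total_impressions,
         sum(Clicks)      as total_clicks
  from flagged
  group by sale_word, placement;
quit;

proc print data=summary noobs; run;
```

Because a row is written for every occurrence, a keyword that contains the word twice contributes its impressions and clicks twice, once per position. Whether that is the intended way to count is worth confirming against the real data.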

