Hello,
I'm having trouble with the following. Can anyone help?
The dataset has a keyword string in which the word sale or sales appears at the beginning, middle, or end. The string may be bracketed [], quoted "", or neither. Each observation also has an Impressions variable and a Clicks variable. I need to flag whether sale or sales appears at the beginning, middle, or end of the string (treating sale and sales separately), and keep the impressions and clicks so they can be summed for each position. The goal is a summary table of total impressions and total clicks for sale at the beginning, middle, and end of the string, and then the same for sales.
Ex:
Keyword Impressions Clicks
[sale on gas grills] 10 2
"Gas grill sales" 0 0
green egg sale grills 420 336
I wrote code last night that worked, but it's highly specific to this dataset. Is there a better way? Thanks very much for your help.
You could use pattern matching like this:
data have;
input Keyword :&$80. Impressions Clicks;
datalines;
[sale on gas grills]        10        2
"Gas grill sales"      0         0
green egg sale grills      420          336
;
data want;
set have;
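/* beg: sale/sales is the first word, after an optional [ or " */
beg = prxmatch('/^[\["]?\s*sale(s)?\b/io', Keyword)>0;
/* mid: sale/sales has at least one word character before and after it */
mid = prxmatch('/\w+.*\bsale(s)?\b.*\w+/io',Keyword)>0;
/* end: sale/sales is the last word, before an optional " or ] */
end = prxmatch('/\w+.*\bsale(s)?\s*["\]]?$/io',trim(Keyword))>0;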
run;
proc print data=want noobs; run;
Whatever solution you choose, it should be tested with more examples.
PG
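The flags above treat sale and sales together and stop short of the summary table the question asks for. A rough sketch of one way to get there, assuming the HAVE dataset from the post above: loop over the two terms, reuse the same three patterns with the term substituted in, write one row per match, and sum. The names flagged, summary, Term, Position, Total_Impressions, and Total_Clicks are made up for illustration, and a keyword that matches in more than one position contributes to each of those positions.

```sas
/* Sketch only: one output row per (term, position) match */
data flagged;
   set have;
   length Term $5 Position $9;
   do Term = 'sale', 'sales';
      /* \b after the term keeps 'sale' from matching inside 'sales' */
      if prxmatch(cats('/^[\["]?\s*', Term, '\b/i'), Keyword) then do;
         Position = 'beginning';
         output;
      end;
      if prxmatch(cats('/\w+.*\b', Term, '\b.*\w+/i'), Keyword) then do;
         Position = 'middle';
         output;
      end;
      /* trim() removes trailing blanks so that $ can match */
      if prxmatch(cats('/\w+.*\b', Term, '\s*["\]]?$/i'), trim(Keyword)) then do;
         Position = 'end';
         output;
      end;
   end;
run;

/* Total impressions and clicks for each term and position */
proc sql;
   create table summary as
   select Term, Position,
          sum(Impressions) as Total_Impressions,
          sum(Clicks) as Total_Clicks
   from flagged
   group by Term, Position;
quit;

proc print data=summary noobs; run;
```

Because the pattern is built at run time, the o (compile once) option is left off here. As with the original, this should be checked against more examples before being trusted.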
Do you mind if it also selects the string sale if that is part of another word (e.g., saleroom, salesperson or wholesale)?
Here is one way:
data have;
informat Keyword $80.;
input Keyword & Impressions Clicks;
cards;
[sale on gas grills] 10 2
"Gas grill sales" 0 0
green egg sale grills 420 336
wholesale garbage 23 32
;
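data want;
set have;
/* _n_ is reused here as a word counter */
_n_=1;
/* strip quotes and brackets, then check each word until scan() returns blank */
do until (scan(compress(keyword,'"[]'),_n_) eq "");
/* whole-word comparison, so wholesale or salesperson is not flagged */
if scan(compress(upcase(keyword),'"[]'),_n_) in ('SALE','SALES') then contains=1;
_n_+1;
end;
run;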
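The step above only records whether sale or sales is present. If the position is also needed, the same word-by-word idea could be extended along these lines. This is a sketch, not tested beyond the sample rows: it assumes words are separated by blanks only (the scan call above uses the default delimiters), and want_pos, clean, word, Term, and Position are illustrative names.

```sas
/* Sketch only: classify each whole-word match by its word index */
data want_pos;
   set have;
   length clean word $80 Term $5 Position $9;
   clean = compress(Keyword, '"[]');   /* drop quotes and brackets */
   n = countw(clean, ' ');             /* number of blank-separated words */
   do i = 1 to n;
      word = lowcase(scan(clean, i, ' '));
      if word in ('sale', 'sales') then do;
         Term = word;
         if i = 1 then Position = 'beginning';
         else if i = n then Position = 'end';
         else Position = 'middle';
         output;                       /* one row per matching word */
      end;
   end;
   drop clean word n i;
run;
```

Under this rule a keyword consisting of the single word sale is counted as beginning, and rows without a match (such as wholesale garbage) produce no output. Impressions and Clicks are carried along on each row, so they can be summed by Term and Position afterwards.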
