Hi all. I have two DATA steps that loop over an array of URL variables, and both use regular expressions. One extracts the digits after categoryId= and the other extracts the digits after productId=. I want to combine them into a single step that looks for either categoryId= or productId=, so I don’t have to create two data sets. I’m having trouble getting the regular expression syntax correct.
data Camp1;
set Campaigns;
array url_array(*) $ url_:;
do i=1 to dim (url_array) ;
category_Id=input(prxchange('s/.+categoryId=(\d+).+/$1/o',-1,URL_array(i)), 12.);
output;
end;
drop i;
run;
data Camp2;
set Campaigns;
array url_array(*) $ url_:;
do i=1 to dim (url_array) ;
category_Id=input(prxchange('s/.+productId=(\d+).+/$1/o',-1,URL_array(i)), 12.);
output;
end;
drop i;
run;
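Assuming productId and categoryId are mutually exclusive in your strings, the following could work. The alternation (categoryId=|productId=) matches either key, [[:^alnum:]] requires a non-alphanumeric character in front of it, and $2 returns the digits that follow: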
data Camp1;
set Campaigns;
array url_array(*) $ url_:;
do i=1 to dim (url_array);
category_Id=input(prxchange('s/.+[[:^alnum:]](categoryId=|productId=)(\d+).+/$2/oi',1,URL_array(i)), 12.);
output;
end;
drop i;
run;
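One thing to watch with the PRXCHANGE approach above: when the pattern does not match, PRXCHANGE returns the source string unchanged, so INPUT() receives a non-numeric value and writes an "Invalid argument" note. As a rough sketch of an alternative (not tested against your data), PRXMATCH plus PRXPOSN leaves the result missing when nothing matches and also records which key was found. The data set Campaigns_demo, its two example.com URLs, and the variables id_type and rx are made up for illustration.

```sas
/* Hypothetical sample input standing in for Campaigns */
data Campaigns_demo;
  length url_1 url_2 $ 200;
  url_1 = 'http://example.com/shop?categoryId=5555&page=2';
  url_2 = 'http://example.com/item?productId=123';
run;

data Camp1;
  set Campaigns_demo;
  array url_array(*) $ url_:;
  length id_type $ 10;               /* hypothetical: which key matched */
  retain rx;
  /* Compile the pattern once; group 1 = key name, group 2 = digits */
  if _n_ = 1 then rx = prxparse('/\b(categoryId|productId)=(\d+)/i');
  do i = 1 to dim(url_array);
    if prxmatch(rx, url_array(i)) then do;
      id_type     = prxposn(rx, 1, url_array(i));
      category_Id = input(prxposn(rx, 2, url_array(i)), 12.);
    end;
    else call missing(id_type, category_Id);  /* no key found in this URL */
    output;
  end;
  drop i rx;
run;
```

Like the PRXCHANGE version, this only picks up the first key in each URL, so it relies on the same assumption that the two keys do not occur together.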
Please try this untested code, which checks each pattern with PRXMATCH before extracting the digits:
data Camp1;
set Campaigns;
array url_array(*) $ url_:;
do i=1 to dim (url_array) ;
if prxmatch('m/.+categoryId=(\d+).+/',URL_array(i)) >0 then category_Id=input(prxchange('s/.+categoryId=(\d+).+/$1/o',-1,URL_array(i)), 12.);
else if prxmatch('m/.+productId=(\d+).+/',URL_array(i)) >0 then category_Id=input(prxchange('s/.+productId=(\d+).+/$1/o',-1,URL_array(i)), 12.);
output;
end;
drop i;
run;
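I like to use CALL PRXNEXT() for this kind of extraction, since it finds every occurrence in the string rather than only the first. Here I also use lookbehind assertions (?<=...), so the matched text contains only the digits: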
data test;
url_id = 1;
url = "The productId=123 and the categoryId=5555 or categoryId=6666";
if not catID then catID + prxParse("/(?<=categoryId=)(\d+)/");
if not proID then proID + prxParse("/(?<=productId=)(\d+)/");
type = "category";
start = 1; stop = length(url);
call prxnext(catID, start, stop, url, pos, len);
do while (pos> 0);
category_id = input(substr(url, pos, len), best32.);
output;
call prxnext(catID, start, stop, url, pos, len);
end;
type = "product";
start = 1; stop = length(url);
call prxnext(proID, start, stop, url, pos, len);
do while (pos> 0);
category_id = input(substr(url, pos, len), best32.);
output;
call prxnext(proID, start, stop, url, pos, len);
end;
keep url_Id type category_id;
run;
proc print data=test noobs; run;
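If you would rather scan each string only once, the two patterns can probably be merged into a single PRXNEXT loop. This is a sketch under that assumption, not a drop-in replacement: it uses capture groups instead of lookbehind and reads them with CALL PRXPOSN after each match. The names test_combined, rx, p1, l1, p2 and l2 are made up for illustration.

```sas
data test_combined;
  url_id = 1;
  url = "The productId=123 and the categoryId=5555 or categoryId=6666";
  length type $ 10;
  /* Group 1 = "category" or "product", group 2 = the digits */
  rx = prxparse("/\b(category|product)Id=(\d+)/i");
  start = 1; stop = length(url);
  call prxnext(rx, start, stop, url, pos, len);
  do while (pos > 0);
    call prxposn(rx, 1, p1, l1);     /* position and length of group 1 */
    call prxposn(rx, 2, p2, l2);     /* position and length of group 2 */
    type = lowcase(substr(url, p1, l1));
    category_id = input(substr(url, p2, l2), best32.);
    output;
    call prxnext(rx, start, stop, url, pos, len);
  end;
  keep url_id type category_id;
run;

proc print data=test_combined noobs; run;
```

Rows come out in the order the keys appear in the string, not grouped by type as in the two-loop version, so sort afterwards if the grouping matters.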