Hi all. I have two DATA steps I'm trying to combine. Both loop over an array of URL variables and use regular expressions: one extracts the digits after categoryId= and the other extracts the digits after productId=. I want to combine them into a single step that looks for either categoryId= or productId=, so I don't have to create two data sets. I'm having trouble getting the regular expression syntax right.
data Camp1;
  set Campaigns;
  array url_array(*) $ url_:;
  do i=1 to dim(url_array);
    category_Id=input(prxchange('s/.+categoryId=(\d+).+/$1/o',-1,URL_array(i)), 12.);
    output;
  end;
  drop i;
run;

data Camp2;
  set Campaigns;
  array url_array(*) $ url_:;
  do i=1 to dim(url_array);
    category_Id=input(prxchange('s/.+productId=(\d+).+/$1/o',-1,URL_array(i)), 12.);
    output;
  end;
  drop i;
run;
Assuming "productid" and "categoryid" are mutually exclusive in your strings then the following could work:
data Camp1;
  set Campaigns;
  array url_array(*) $ url_:;
  do i=1 to dim(url_array);
    category_Id=input(prxchange('s/.+[[:^alnum:]](categoryId=|productId=)(\d+).+/$2/oi',1,URL_array(i)), 12.);
    output;
  end;
  drop i;
run;
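One thing to keep in mind with PRXCHANGE: when the pattern does not match, the source string comes back unchanged, so INPUT() ends up trying to read the whole URL as a number. Below is a minimal sketch of a variant that guards against that with PRXMATCH and PRXPOSN. The sample data set urls, its URLs, and the names rx and id_type are made up for illustration, and I have not run this against your data.

```sas
/* Hypothetical sample data standing in for Campaigns */
data urls;
  length url_1 url_2 $100;
  url_1 = "http://example.com/shop?categoryId=5555&page=2";
  url_2 = "http://example.com/item?productId=123";
run;

data Camp_sketch;
  set urls;
  array url_array(*) $ url_:;
  length id_type $10 category_Id 8;
  retain rx;
  /* Compile the pattern once; the i modifier makes it case-insensitive. */
  /* Buffer 1 holds the key name, buffer 2 holds the digits.             */
  if _n_ = 1 then rx = prxparse('/\b(category|product)Id=(\d+)/i');
  do i = 1 to dim(url_array);
    call missing(id_type, category_Id);
    /* Only convert when the pattern was actually found */
    if prxmatch(rx, url_array(i)) > 0 then do;
      id_type     = lowcase(prxposn(rx, 1, url_array(i)));
      category_Id = input(prxposn(rx, 2, url_array(i)), 12.);
    end;
    output;
  end;
  drop i rx;
run;
```

URLs with neither key simply get a missing category_Id, and id_type records which of the two keys was found.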
Please try this untested code:
data Camp1;
  set Campaigns;
  array url_array(*) $ url_:;
  do i=1 to dim(url_array);
    if prxmatch('m/.+categoryId=(\d+).+/',URL_array(i)) >0 then
      category_Id=input(prxchange('s/.+categoryId=(\d+).+/$1/o',-1,URL_array(i)), 12.);
    else if prxmatch('m/.+productId=(\d+).+/',URL_array(i)) >0 then
      category_Id=input(prxchange('s/.+productId=(\d+).+/$1/o',-1,URL_array(i)), 12.);
    output;
  end;
  drop i;
run;
I like to use CALL PRXNEXT for this kind of extraction. Here I also use lookbehind assertions, (?<=...), so that the match itself consists only of the digits:
data test;
  url_id = 1;
  url = "The productId=123 and the categoryId=5555 or categoryId=6666";

  if not catID then
    catID + prxParse("/(?<=categoryId=)(\d+)/");
  if not proID then
    proID + prxParse("/(?<=productId=)(\d+)/");

  type = "category";
  start = 1; stop = length(url);
  call prxnext(catID, start, stop, url, pos, len);
  do while (pos > 0);
    category_id = input(substr(url, pos, len), best32.);
    output;
    call prxnext(catID, start, stop, url, pos, len);
  end;

  type = "product";
  start = 1; stop = length(url);
  call prxnext(proID, start, stop, url, pos, len);
  do while (pos > 0);
    category_id = input(substr(url, pos, len), best32.);
    output;
    call prxnext(proID, start, stop, url, pos, len);
  end;

  keep url_Id type category_id;
run;

proc print data=test noobs; run;
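If a single pass over the string is preferred, the two patterns can probably be folded into one with alternation, using PRXPOSN after each CALL PRXNEXT to read the capture buffers. This is only a sketch under that assumption: the data set name test2 and the variable rx are made up here, and the lookbehinds are replaced by ordinary capture groups.

```sas
data test2;
  length type $10;
  url_id = 1;
  url = "The productId=123 and the categoryId=5555 or categoryId=6666";

  /* One pattern: buffer 1 = key name (category or product), buffer 2 = digits */
  rx = prxparse("/\b(category|product)Id=(\d+)/i");

  start = 1; stop = length(url);
  call prxnext(rx, start, stop, url, pos, len);
  do while (pos > 0);
    type        = lowcase(prxposn(rx, 1, url));
    category_id = input(prxposn(rx, 2, url), best32.);
    output;
    call prxnext(rx, start, stop, url, pos, len);
  end;

  keep url_id type category_id;
run;
```

The difference from the two-loop version is that rows come out in the order the IDs appear in the string rather than grouped by type.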