Hi all. I have two DATA steps that I'm trying to combine. Both loop over the same array of URL variables and use a regular expression: one extracts the digits after categoryId= and the other extracts the digits after productId=. I want a single step that looks for either categoryId= or productId= so I don't have to create two data sets. I'm having trouble getting the regular expression syntax right.
data Camp1;
set Campaigns;
array url_array(*) $ url_:;
do i=1 to dim (url_array) ;
category_Id=input(prxchange('s/.+categoryId=(\d+).+/$1/o',-1,URL_array(i)), 12.);
output;
end;
drop i;
run;
data Camp2;
set Campaigns;
array url_array(*) $ url_:;
do i=1 to dim (url_array) ;
category_Id=input(prxchange('s/.+productId=(\d+).+/$1/o',-1,URL_array(i)), 12.);
output;
end;
drop i;
run;
Assuming "productid" and "categoryid" are mutually exclusive in your strings then the following could work:
data Camp1;
set Campaigns;
array url_array(*) $ url_:;
do i=1 to dim (url_array);
category_Id=input(prxchange('s/.+[[:^alnum:]](categoryId=|productId=)(\d+).+/$2/oi',1,URL_array(i)), 12.);
output;
end;
drop i;
run;
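To try the combined pattern before running it against Campaigns, a small check like the one below may help. It is only a sketch: the three URLs and the names pattern_check and id are made up for illustration, and the ?? modifier on INPUT is added so that a URL with neither key gives a missing value without an invalid-data note in the log.

```sas
/* Sketch only: hypothetical URLs, not taken from the Campaigns data */
data pattern_check;
  length url $200;
  /* Single quotes keep the macro processor from trying to resolve &page and &ref */
  do url = 'http://example.com/list?categoryId=123&page=2',
           'http://example.com/item?productId=4567&ref=home',
           'http://example.com/about';
    /* Same pattern as above; when nothing matches, PRXCHANGE returns the   */
    /* string unchanged and ?? lets INPUT return missing without a log note */
    id = input(prxchange('s/.+[[:^alnum:]](categoryId=|productId=)(\d+).+/$2/oi', 1, url), ?? 12.);
    output;
  end;
run;

proc print data=pattern_check noobs; run;
```

If the pattern behaves as intended, the first two rows should carry the digits that follow the key and the third row should be missing.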
Please try this untested code:
data Camp1;
set Campaigns;
array url_array(*) $ url_:;
do i=1 to dim (url_array) ;
if prxmatch('m/.+categoryId=(\d+).+/',URL_array(i)) >0 then category_Id=input(prxchange('s/.+categoryId=(\d+).+/$1/o',-1,URL_array(i)), 12.);
else if prxmatch('m/.+productId=(\d+).+/',URL_array(i)) >0 then category_Id=input(prxchange('s/.+productId=(\d+).+/$1/o',-1,URL_array(i)), 12.);
/* Reset when neither key is present so the previous URL's value is not carried over */
else category_Id=.;
output;
end;
drop i;
run;
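A possible variation on the step above, offered as an untested sketch: compile one pattern with PRXPARSE and read the capture buffers with PRXPOSN, so each URL is scanned once. The data set name Camp_both and the variables re and id_type are assumptions introduced here for illustration; they are not in the original post.

```sas
/* Sketch only: Camp_both, re and id_type are hypothetical names */
data Camp_both;
  set Campaigns;
  array url_array(*) $ url_:;
  length id_type $10;
  retain re;
  /* Compile the pattern once; buffer 1 holds the key name, buffer 2 the digits */
  if _n_ = 1 then re = prxparse('/(categoryId|productId)=(\d+)/i');
  do i = 1 to dim(url_array);
    if prxmatch(re, url_array(i)) > 0 then do;
      id_type     = prxposn(re, 1, url_array(i));
      category_Id = input(prxposn(re, 2, url_array(i)), 12.);
    end;
    else do;
      /* No key found: clear both values so nothing carries over from the previous URL */
      id_type     = ' ';
      category_Id = .;
    end;
    output;
  end;
  drop i re;
run;
```

Keeping id_type makes it possible to tell category rows from product rows later without a second pass over the URLs.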
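I like to use CALL PRXNEXT() for this kind of extraction because it finds every match in the string, not just one. Here I also use lookbehind assertions, (?<=...), so that only the digits are returned as the match.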
data test;
url_id = 1;
url = "The productId=123 and the categoryId=5555 or categoryId=6666";
if not catID then
catID + prxParse("/(?<=categoryId=)(\d+)/");
if not proID then
proID + prxParse("/(?<=productId=)(\d+)/");
type = "category";
start = 1; stop = length(url);
call prxnext(catID, start, stop, url, pos, len);
do while (pos> 0);
category_id = input(substr(url, pos, len), best32.);
output;
call prxnext(catID, start, stop, url, pos, len);
end;
type = "product";
start = 1; stop = length(url);
call prxnext(proID, start, stop, url, pos, len);
do while (pos> 0);
category_id = input(substr(url, pos, len), best32.);
output;
call prxnext(proID, start, stop, url, pos, len);
end;
keep url_Id type category_id;
run;
proc print data=test noobs; run;
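If a single pass over the string is preferred, the two loops above could probably be folded into one by putting both keys in one pattern and reading the capture buffers with PRXPOSN after each CALL PRXNEXT. This is a sketch under that assumption; the names test_one_pass and re are made up here, and the rows come out in the order the matches appear in the string instead of grouped by type.

```sas
/* Sketch only: one pattern for both keys; test_one_pass and re are hypothetical names */
data test_one_pass;
  url_id = 1;
  url = "The productId=123 and the categoryId=5555 or categoryId=6666";
  length type $8;
  /* Buffer 1 captures "category" or "product", buffer 2 captures the digits */
  re = prxparse("/(category|product)Id=(\d+)/");
  start = 1; stop = length(url);
  call prxnext(re, start, stop, url, pos, len);
  do while (pos > 0);
    type = prxposn(re, 1, url);
    category_id = input(prxposn(re, 2, url), best32.);
    output;
    call prxnext(re, start, stop, url, pos, len);
  end;
  keep url_id type category_id;
run;

proc print data=test_one_pass noobs; run;
```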