Hi,
I need to extract the category ID from a URL string.
The ID always follows "categoryId=" in the URL string and it's always digits. I can't use SCAN because the ID can be in a different place in the URL and is not always the same length.
Again, it's always the digits that follow "categoryId=" in the string. There may be other IDs in the URL that don't follow "categoryId=", so I need to make sure I'm not picking those up. This is the output I'm looking for:
Any assistance is greatly appreciated. Thanks!
/* This one gives you the ID if your raw data needs to be read in */
data have;
  infile cards truncover dlm='&';
  /* @'categoryId=' moves the pointer just past that text; id is then read up to the next '&' */
  input @'categoryId=' id @1 url $200.;
cards;
http://www.mywebsite.com/family/index.jsp?categoryId=61765546&cp=1766205&ab=en_US_MLP_SLOT_1_S1_SHOP
http://www.mywebsite.com/shop/index.jsp?categoryId=62593996&AB=en_US_HP_S2_Men_slot_1_S2_ShopNow
http://www.mywebsite.com/family/index.jsp?categoryId=56906456
http://www.mywebsite.com/shop/index.jsp?categoryId=62593866&cp=1766205&ab=ln_men_cs_theshirt
http://www.mywebsite.com/shop/index.jsp?categoryId=57155616
http://www.mywebsite.com/shop/index.jsp?categoryId=1766618&ab=tn_women_golfandTennis&cp=17666
;
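/* This one gives you the ID if your data is already in a table */
data want;
  set have;
  /* Replace the whole URL with the digits captured after categoryId=; */
  /* a URL without categoryId= is returned unchanged */
  id_prx = prxchange('s/.*categoryId=(\d+).*/$1/o', -1, url);
run;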
data have;
  infile cards truncover;
  input url $200.;
cards;
http://www.mywebsite.com/family/index.jsp?categoryId=61765546&cp=1766205&ab=en_US_MLP_SLOT_1_S1_SHOP
http://www.mywebsite.com/shop/index.jsp?categoryId=62593996&AB=en_US_HP_S2_Men_slot_1_S2_ShopNow
http://www.mywebsite.com/family/index.jsp?categoryId=62593846&cp=1766205&ab=ln_men_cs_thetrend:tropi...
http://www.mywebsite.com/family/index.jsp?categoryId=62594006&cp=1766205&AB=en_US_MLP_P_slot_10_S1_S...
http://www.mywebsite.com/family/index.jsp?categoryId=56906456
http://www.mywebsite.com/shop/index.jsp?categoryId=62593866&cp=1766205&ab=ln_men_cs_theshirt
http://www.mywebsite.com/family/index.jsp?categoryId=62593876&cp=1795710&AB=en_US_MLP__slot_2_S1_Sho...
http://www.mywebsite.com/shop/index.jsp?categoryId=57155616
http://www.mywebsite.com/shop/index.jsp?categoryId=1766618&ab=tn_women_golfandTennis&cp=17666
;
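data want;
  set have;
  /* Lookbehind: match digits only when they immediately follow categoryId= */
  pid = prxparse('/(?<=categoryId=)\d+/i');
  call prxsubstr(pid, url, position, length);
  /* position is 0 when the URL has no categoryId= parameter */
  if position ne 0 then do;
    match = substr(url, position, length);
  end;
  drop pid position length;
run;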
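If you would rather have the ID as a numeric variable than as text, something along these lines might work. This is only a sketch: it assumes a data set named have with a character variable url, as in the steps above, and the names want_num, rx and category_id are made up for illustration. It uses a capture group with PRXPOSN instead of the lookbehind.

```sas
/* Sketch: capture the digits after categoryId= and store them as a number. */
/* want_num, rx and category_id are hypothetical names.                     */
data want_num;
  set have;
  retain rx;
  /* Compile the pattern once, on the first row */
  if _n_ = 1 then rx = prxparse('/categoryId=(\d+)/i');
  /* prxmatch returns 0 when the URL has no categoryId= parameter, */
  /* in which case category_id stays missing                       */
  if prxmatch(rx, url) then
    category_id = input(prxposn(rx, 1, url), 12.);
  drop rx;
run;
```

For the first sample URL, category_id should come out as 61765546. If your URLs could contain a parameter whose name merely ends in "categoryId=", starting the pattern with [?&] would restrict the match to a parameter named exactly categoryId.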