Hi,
I need to extract the category ID from a URL string.
The ID always follows “categoryId=” in the URL string and it's always numeric. I can’t use SCAN because the ID can appear at a different position in the URL and is not always the same length.
Again, it's always the digits that follow "categoryId=" in the string. There may be other IDs in the URL that don't follow "categoryId=", so I need to make sure I'm not picking those up. This is the output I'm looking for:
Any assistance is greatly appreciated. Thanks!
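/* This one gives you the ID if your raw data still needs to be read in */
data have;
infile cards truncover dlm='&';
/* @'categoryId=' moves the pointer just past that text; id is then read up to the next & or the end of the line */
input @'categoryId=' id @1 url $200.;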
cards;
http://www.mywebsite.com/family/index.jsp?categoryId=61765546&cp=1766205&ab=en_US_MLP_SLOT_1_S1_SHOP
http://www.mywebsite.com/shop/index.jsp?categoryId=62593996&AB=en_US_HP_S2_Men_slot_1_S2_ShopNow
http://www.mywebsite.com/family/index.jsp?categoryId=56906456
http://www.mywebsite.com/shop/index.jsp?categoryId=62593866&cp=1766205&ab=ln_men_cs_theshirt
http://www.mywebsite.com/shop/index.jsp?categoryId=57155616
http://www.mywebsite.com/shop/index.jsp?categoryId=1766618&ab=tn_women_golfandTennis&cp=17666
;
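/* This one gives you the ID if your data is already in a table */
data want;
set have;
/* .* instead of .+ so that no characters are required before or after the ID */
/* if categoryId= is not found, prxchange returns url unchanged */
id_prx=prxchange('s/.*categoryId=(\d+).*/$1/o',-1,url);
run;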
data have;
infile cards truncover ;
input url $200.;
cards;
http://www.mywebsite.com/family/index.jsp?categoryId=61765546&cp=1766205&ab=en_US_MLP_SLOT_1_S1_SHOP
http://www.mywebsite.com/shop/index.jsp?categoryId=62593996&AB=en_US_HP_S2_Men_slot_1_S2_ShopNow
http://www.mywebsite.com/family/index.jsp?categoryId=62593846&cp=1766205&ab=ln_men_cs_thetrend:tropi...
http://www.mywebsite.com/family/index.jsp?categoryId=62594006&cp=1766205&AB=en_US_MLP_P_slot_10_S1_S...
http://www.mywebsite.com/family/index.jsp?categoryId=56906456
http://www.mywebsite.com/shop/index.jsp?categoryId=62593866&cp=1766205&ab=ln_men_cs_theshirt
http://www.mywebsite.com/family/index.jsp?categoryId=62593876&cp=1795710&AB=en_US_MLP__slot_2_S1_Sho...
http://www.mywebsite.com/shop/index.jsp?categoryId=57155616
http://www.mywebsite.com/shop/index.jsp?categoryId=1766618&ab=tn_women_golfandTennis&cp=17666
;
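data want;
set have;
/* (?<=categoryId=) is a lookbehind: match only digits that come right after categoryId= */
pid= prxparse('/(?<=categoryId=)\d+/i');
/* position is 0 when there is no match */
call prxsubstr(pid, url, position, length);
if position ne 0 then do;
match = substr(url, position, length);
end;
drop pid position length;
run;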
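Since the question mentions other IDs in the URL, here is a rough sketch that checks two cases the sample data does not cover: a URL where another ID comes before categoryId=, and a URL with no categoryId= at all. The two test URLs, the data set name check, and the variable name category_id are made up for illustration. The [?&] in front of categoryId= assumes the parameter always starts right after a ? or &, and the 12. informat assumes the ID never has more than 12 digits.

```sas
/* Hypothetical test URLs: categoryId is not the first parameter, or is absent */
data check;
infile cards truncover;
input url $200.;
/* compile the pattern once and keep its id across rows */
retain pid;
if _n_ = 1 then pid = prxparse('/[?&]categoryId=(\d+)/i');
/* prxmatch returns 0 when the pattern is not found, so category_id stays missing */
if prxmatch(pid, url) then category_id = input(prxposn(pid, 1, url), 12.);
drop pid;
cards;
http://www.mywebsite.com/shop/index.jsp?cp=1766205&categoryId=62593866&ab=ln_men_cs_theshirt
http://www.mywebsite.com/shop/index.jsp?cp=1766205&ab=ln_men_cs_theshirt
;
```

The first row should give category_id = 62593866 (not the cp value), and the second row should leave category_id missing. Using a capture group with prxposn also returns the ID as a number instead of a character string.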