Hi,
suppose I have a data set with datalines in the following manner:
the first site is www.abc.com
www.123.com is the second website.
I am trying to figure out how to extract the websites, that is, the part of the string which is between (and including) "www." and ".com"
Thank you!
One way:
data have;
input x $37.;
cards;
the first site is www.abc.com
www.187878723.com is the second website.
www.computer.com is the second website.
;
run;
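data want;
set have;
/* position of "www." in x (0 if absent) */
www=index(x, "www.");
/* position of ".com", counted from the first character after "www." */
com=index(substr(x,www+4), ".com");
/* length = 4 for "www." + (com-1) for the name + 4 for ".com" = com+7 */
if www and com then website=substr(x,www,com+7);
drop www com;
run;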
updated to handle more cases
Hi Mohamed,
thank you for answering my question, everything works nicely!
Just as a side note: if I add a dataline " a pseudo site www. name .com", the code still puts "www. name .com" into want, but that isn't a real website because of the spaces after "www." and before ".com".
So is there a way to avoid this by requiring a non-blank character right after "www." and right before ".com"?
Thank you!
if www and com then website=compress(substr(x,www,com+7));
Do you still want to extract it with the spaces removed, as above, or to treat it as invalid and ignore it?
I ran the code with the new input and see that it corrects the value.
Could you please also show me how to ignore such a case?
Thank you!
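data want;
set have;
www=index(x, "www.");
com=index(substr(x,www+4), ".com");
/* keep only rows that contain both "www." and ".com"; */
/* testing first avoids calling SUBSTR with a start position of 0 */
if www and com;
website=substr(x,www,com+7);
/* blank out candidates with an embedded space, e.g. "www. name .com" */
if index(trim(website),' ')> 0 then website="";
drop www com;
run;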
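If you would rather state the rule directly (at least one non-blank character after "www." and no blanks anywhere before ".com"), a Perl regular expression is another option. The following is only a minimal sketch: it assumes the have data set from above, and the names want_prx and re are ones I made up for illustration.

```sas
data want_prx;
  set have;
  length website $37;
  /* compile the pattern once and keep its id across rows */
  retain re;
  if _n_ = 1 then re = prxparse('/www\.\S+?\.com/');
  /* prxmatch returns the start position of the match, or 0 if none */
  /* capture buffer 0 of prxposn is the whole matched text          */
  if prxmatch(re, x) then website = prxposn(re, 0, x);
  drop re;
run;
```

Here \S+? means "one or more non-blank characters, as few as possible", so a line such as " a pseudo site www. name .com" should not match and website stays blank. Unlike the subsetting IF version, this keeps every row from have. The pattern is case-sensitive, the same as INDEX.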