melassiri
Calcite | Level 5

How can I use a regular expression in SAS to extract the website name from a URL stored in a dataset and display it?

Examples:

A1 = google.com   the result should be:  google

A2 = http://twitter.com/Marko_met_een_K/status/1725797169897021653   the result should be:  twitter

A3 = https://regioonline.nl/regio-den-bosch/schade-aan-stuw-lith/   the result should be:  regioonline

A4 = https://www.aa.com/en/how-to-regex?id=123   the result should be:  aa

Accepted Solutions
Patrick
Opal | Level 21

The following code was generated by ChatGPT using the prompt: Using SAS code <copy/paste your question>

The code ChatGPT returned needed only one small fix to make it work.

data websites;
    input url :$100.;
    datalines;
google.com
http://twitter.com/Marko_met_een_K/status/1725797169897021653
https://regioonline.nl/regio-den-bosch/schade-aan-stuw-lith/
https://www.aa.com/en/how-to-regex?id=123
;
run;

data extracted_names;
    set websites;
    /* Use PRX to define a regex pattern to extract the website name */
    retain pattern;
    if _N_ = 1 then pattern = prxparse('/(?:https?:\/\/)?(?:www\.)?([^\/\.]+)\./');

    /* Apply the regex to the url and store capture group 1 in website_name.
       PRXPOSN reads the match found by PRXMATCH, so no CALL PRXSUBSTR is needed. */
    length website_name $100;
    if prxmatch(pattern, url) then website_name = prxposn(pattern, 1, url);

    /* Keep only the relevant columns */
    keep url website_name;
run;

proc print data=extracted_names noobs;
    title "Extracted Website Names";
run;
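
For readers who want the pattern spelled out, here is a minimal sketch of the same idea written with PRXCHANGE instead of PRXMATCH/PRXPOSN. It is not the code ChatGPT returned; the dataset name extracted_names2 and the anchors (^ and .*$) are my own additions for illustration. The pattern reads: an optional http:// or https://, an optional www., then one capture group that takes everything up to the first dot.

```sas
data extracted_names2;
    length url website_name $100;
    input url :$100.;
    /* Replace the whole URL with capture group 1 (the site name).       */
    /* ^(?:https?:\/\/)?  optional scheme                                */
    /* (?:www\.)?         optional www prefix                            */
    /* ([^\/.]+)\.        name = characters up to the first dot          */
    /* .*$                rest of the URL, discarded                     */
    /* Note: if the pattern does not match, PRXCHANGE returns the input  */
    /* unchanged, so website_name would then hold the full URL.          */
    website_name = prxchange('s/^(?:https?:\/\/)?(?:www\.)?([^\/.]+)\..*$/$1/', 1, strip(url));
    datalines;
google.com
http://twitter.com/Marko_met_een_K/status/1725797169897021653
https://regioonline.nl/regio-den-bosch/schade-aan-stuw-lith/
https://www.aa.com/en/how-to-regex?id=123
;
run;

proc print data=extracted_names2 noobs;
run;
```

Because the regex is a constant string, SAS compiles it once, so no RETAIN or _N_ = 1 logic is needed in this form.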

melassiri
Calcite | Level 5

Thank you for your quick response; I really appreciate it.

Ksharp
Super User

Why do you have to use PRX? Using classic SAS functions would be a lot easier.

data websites;
    input url :$100.;
    datalines;
google.com
http://twitter.com/Marko_met_een_K/status/1725797169897021653
https://regioonline.nl/regio-den-bosch/schade-aan-stuw-lith/
https://www.aa.com/en/how-to-regex?id=123
;
run;
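data want;
    set websites;
    /* Start at '//' when present (FIND returns 0 otherwise and SUBSTRN then
       keeps the whole string), then take the text before the next '/' */
    temp = scan(substrn(url, find(url, '//')), 1, '/');
    /* Skip a leading 'www' label; otherwise the first label is the name */
    if scan(temp, 1, '.') = 'www' then want = scan(temp, 2, '.');
    else want = scan(temp, 1, '.');
run;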
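
If you want to convince yourself that both approaches agree, a small check along these lines may help. It is only a sketch: the dataset name check and the variables expected, name_prx, name_scan and mismatch are made up for this example, and the expected values are the ones given in the question.

```sas
data check;
    length url expected name_prx name_scan temp $100;
    input url :$100. expected :$100.;

    /* PRX approach: pattern from the accepted solution, compiled once */
    retain pattern;
    if _N_ = 1 then pattern = prxparse('/(?:https?:\/\/)?(?:www\.)?([^\/\.]+)\./');
    if prxmatch(pattern, url) then name_prx = prxposn(pattern, 1, url);

    /* SCAN approach from the reply above */
    temp = scan(substrn(url, find(url, '//')), 1, '/');
    if scan(temp, 1, '.') = 'www' then name_scan = scan(temp, 2, '.');
    else name_scan = scan(temp, 1, '.');

    /* 1 when either method disagrees with the expected value, else 0 */
    mismatch = (name_prx ne expected) or (name_scan ne expected);

    drop pattern temp;
    datalines;
google.com google
http://twitter.com/Marko_met_een_K/status/1725797169897021653 twitter
https://regioonline.nl/regio-den-bosch/schade-aan-stuw-lith/ regioonline
https://www.aa.com/en/how-to-regex?id=123 aa
;
run;

proc print data=check noobs;
    var url expected name_prx name_scan mismatch;
run;
```

Rows with mismatch = 1 would point at URLs where the two methods, or the expectation, differ. URLs with extra subdomains are the kind of input worth adding to the datalines before relying on either method.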
melassiri
Calcite | Level 5

Thank you for your response.

That is very nice of you.

 
