BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sameer112217
Quartz | Level 8

I need help

 

I have a variable with text description name as description

 

 

Description

i am repairing the system with 1.11.1.123 

I am done with system 1.23.1.120 and not 192.1.21.1 that lives in apt 5

i am not doing for 1.22.3.130 for person living in apartment 4

I am working on system 1.21.1.119

 

I just want to extract only the ip address. the ip address starts with special character 1. or by 192.

 

so the output should be like this

 

1.11.1.123

1.23.1.120    192.1.21.1

122.3.130

1.21.1.119

 

I tried substring using index but it doesnt work out

 

Sameer

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

If you find regular expressions too confusing you could just use some simple SAS functions.  SCAN() and COMPRESS().  If you treat every non digit or period as a delimiter then SCAN() will find the candidates for you.  You will still need to do a little more validation, but you could just look for those that have exactly 3 periods.

So if your dataset is named HAVE and your variable is named DESCRIPTION your code might look like this:

data want;
  set have ;
  cnt=1;
  array ip[10] $15;
  do idx=1 by 1 ;
    ip[cnt]=scan(Description,idx,compress(Description,'.','d'));
    if ip[cnt]=' ' then leave;
    if countc(ip[cnt],'.')=3 then do;
      cnt+1;
      if cnt>dim(ip) then leave;
    end;
    else ip[cnt]=' ';
  end;
  cnt=cnt-1;
  drop idx;
run;

 

Obs   Description

 1    i am repairing the system with 1.11.1.123
 2    I am done with system 1.23.1.120 and not 192.1.21.1 that lives in apt 5
 3    i am not doing for 1.22.3.130 for person living in apartment 4
 4    I am working on system 1.21.1.119
 5    There are 0 ip addresses here

Obs   cnt       ip1           ip2        ip3    ip4    ip5    ip6    ip7    ip8    ip9    ip10

 1     1     1.11.1.123
 2     2     1.23.1.120    192.1.21.1
 3     1     1.22.3.130
 4     1     1.21.1.119
 5     0

View solution in original post

5 REPLIES 5
s_lassen
Meteorite | Level 14

The safest option is probably to use PRX, e.g.:

data want;
  set have;
  prx_id=prxparse('/\b\d{1,3}\.d{1,3}\.d{1,3}\.d{1,3}\.\b/');
  start=1;
  stop=-1;
  do while(1);
    call prxnext(prx_id,start,stop,text_variable,position,length);
    if position=0 then leave;
    ip_addr=substr(text_variable,position,length);
    output;
    end;
run;

Yo can find the documentation for the PRX string here: Tables of Perl Regular Expression (PRX) Metacharacters 

ed_sas_member
Meteorite | Level 14

Hi @sameer112217 

Here is an approach to achieve this:

data have;
	infile datalines truncover;
	input Description $500.;
	datalines;
i am repairing the system with 1.11.1.123 
I am done with system 1.23.1.120 and not 192.1.21.1 that lives in apt 5
i am not doing for 1.22.3.130 for person living in apartment 4
I am working on system 1.21.1.119
;
run;

proc sql noprint;
	select max(countw(Description,' ')), max(count(description,' 1.') + count(description,' 192.')) into:nb1,:nb2 from have;
quit;

data want;
	set have;
	array _IP (&nb1) $ 200;
	array IP (&nb2) $ 20;
	do i=1 to countw(Description,' ');
		if prxmatch('/(1\.)|(192\.)/',scan(Description,i,' ')) then _IP(i) = scan(Description,i,' ');
	end;
	IP_extract = catx(',',of _IP(*));
	do j=1 to dim(IP);
		IP(j) = scan(IP_extract,j,',');
	end;
	drop i j IP_extract _:;
run;

Capture d’écran 2020-02-29 à 09.47.43.png

 

sameer112217
Quartz | Level 8

Thank you this is what i have been looking... i thank you so much for the help.

Tom
Super User Tom
Super User

If you find regular expressions too confusing you could just use some simple SAS functions.  SCAN() and COMPRESS().  If you treat every non digit or period as a delimiter then SCAN() will find the candidates for you.  You will still need to do a little more validation, but you could just look for those that have exactly 3 periods.

So if your dataset is named HAVE and your variable is named DESCRIPTION your code might look like this:

data want;
  set have ;
  cnt=1;
  array ip[10] $15;
  do idx=1 by 1 ;
    ip[cnt]=scan(Description,idx,compress(Description,'.','d'));
    if ip[cnt]=' ' then leave;
    if countc(ip[cnt],'.')=3 then do;
      cnt+1;
      if cnt>dim(ip) then leave;
    end;
    else ip[cnt]=' ';
  end;
  cnt=cnt-1;
  drop idx;
run;

 

Obs   Description

 1    i am repairing the system with 1.11.1.123
 2    I am done with system 1.23.1.120 and not 192.1.21.1 that lives in apt 5
 3    i am not doing for 1.22.3.130 for person living in apartment 4
 4    I am working on system 1.21.1.119
 5    There are 0 ip addresses here

Obs   cnt       ip1           ip2        ip3    ip4    ip5    ip6    ip7    ip8    ip9    ip10

 1     1     1.11.1.123
 2     2     1.23.1.120    192.1.21.1
 3     1     1.22.3.130
 4     1     1.21.1.119
 5     0
sameer112217
Quartz | Level 8

thanks tom, this is what i was looking exactly....

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1274 views
  • 1 like
  • 4 in conversation