I need help
I have a variable with text description name as description
Description
i am repairing the system with 1.11.1.123
I am done with system 1.23.1.120 and not 192.1.21.1 that lives in apt 5
i am not doing for 1.22.3.130 for person living in apartment 4
I am working on system 1.21.1.119
I just want to extract only the ip address. the ip address starts with special character 1. or by 192.
so the output should be like this
1.11.1.123
1.23.1.120 192.1.21.1
122.3.130
1.21.1.119
I tried substring using index but it doesnt work out
Sameer
If you find regular expressions too confusing you could just use some simple SAS functions. SCAN() and COMPRESS(). If you treat every non digit or period as a delimiter then SCAN() will find the candidates for you. You will still need to do a little more validation, but you could just look for those that have exactly 3 periods.
So if your dataset is named HAVE and your variable is named DESCRIPTION your code might look like this:
data want;
set have ;
cnt=1;
array ip[10] $15;
do idx=1 by 1 ;
ip[cnt]=scan(Description,idx,compress(Description,'.','d'));
if ip[cnt]=' ' then leave;
if countc(ip[cnt],'.')=3 then do;
cnt+1;
if cnt>dim(ip) then leave;
end;
else ip[cnt]=' ';
end;
cnt=cnt-1;
drop idx;
run;
Obs Description 1 i am repairing the system with 1.11.1.123 2 I am done with system 1.23.1.120 and not 192.1.21.1 that lives in apt 5 3 i am not doing for 1.22.3.130 for person living in apartment 4 4 I am working on system 1.21.1.119 5 There are 0 ip addresses here Obs cnt ip1 ip2 ip3 ip4 ip5 ip6 ip7 ip8 ip9 ip10 1 1 1.11.1.123 2 2 1.23.1.120 192.1.21.1 3 1 1.22.3.130 4 1 1.21.1.119 5 0
The safest option is probably to use PRX, e.g.:
data want;
set have;
prx_id=prxparse('/\b\d{1,3}\.d{1,3}\.d{1,3}\.d{1,3}\.\b/');
start=1;
stop=-1;
do while(1);
call prxnext(prx_id,start,stop,text_variable,position,length);
if position=0 then leave;
ip_addr=substr(text_variable,position,length);
output;
end;
run;
Yo can find the documentation for the PRX string here: Tables of Perl Regular Expression (PRX) Metacharacters
Here is an approach to achieve this:
data have;
infile datalines truncover;
input Description $500.;
datalines;
i am repairing the system with 1.11.1.123
I am done with system 1.23.1.120 and not 192.1.21.1 that lives in apt 5
i am not doing for 1.22.3.130 for person living in apartment 4
I am working on system 1.21.1.119
;
run;
proc sql noprint;
select max(countw(Description,' ')), max(count(description,' 1.') + count(description,' 192.')) into:nb1,:nb2 from have;
quit;
data want;
set have;
array _IP (&nb1) $ 200;
array IP (&nb2) $ 20;
do i=1 to countw(Description,' ');
if prxmatch('/(1\.)|(192\.)/',scan(Description,i,' ')) then _IP(i) = scan(Description,i,' ');
end;
IP_extract = catx(',',of _IP(*));
do j=1 to dim(IP);
IP(j) = scan(IP_extract,j,',');
end;
drop i j IP_extract _:;
run;
Thank you this is what i have been looking... i thank you so much for the help.
If you find regular expressions too confusing you could just use some simple SAS functions. SCAN() and COMPRESS(). If you treat every non digit or period as a delimiter then SCAN() will find the candidates for you. You will still need to do a little more validation, but you could just look for those that have exactly 3 periods.
So if your dataset is named HAVE and your variable is named DESCRIPTION your code might look like this:
data want;
set have ;
cnt=1;
array ip[10] $15;
do idx=1 by 1 ;
ip[cnt]=scan(Description,idx,compress(Description,'.','d'));
if ip[cnt]=' ' then leave;
if countc(ip[cnt],'.')=3 then do;
cnt+1;
if cnt>dim(ip) then leave;
end;
else ip[cnt]=' ';
end;
cnt=cnt-1;
drop idx;
run;
Obs Description 1 i am repairing the system with 1.11.1.123 2 I am done with system 1.23.1.120 and not 192.1.21.1 that lives in apt 5 3 i am not doing for 1.22.3.130 for person living in apartment 4 4 I am working on system 1.21.1.119 5 There are 0 ip addresses here Obs cnt ip1 ip2 ip3 ip4 ip5 ip6 ip7 ip8 ip9 ip10 1 1 1.11.1.123 2 2 1.23.1.120 192.1.21.1 3 1 1.22.3.130 4 1 1.21.1.119 5 0
thanks tom, this is what i was looking exactly....
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.