DATA Step, Macro, Functions and more

Extracting Certain Information from a String

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 17
Accepted Solution

Extracting Certain Information from a String

 

Hi All,

 

I am trying to easily extract specific distance information from a string.

 

I am interested in creating a new variable with the extraction of distance in meters (m).

 

Here is an example of my data and desired result. I think there will be quite a bit of string work to get the answer.

 

Any help would be greatly appreciated.

 

DataResultComments
RED 1500M STBL_2035 ZN042 MOND TER1500 
V 2km STL_28 ZN011 RT2000KM' or 'km' is 1000 time m (meters)
EEP 360M _47 Z012 BRAN360 
1400m 48047b TN0099 BR1400 
RET 117M CC S23_ITHACA S 1117 
ECOE 1711T DUP DE0no m / M or km / KM
RP HB/OB8 S4 P0no m / M or km / KM
TEP 200m 11P & 700m 5200Return the first m only
P 1110m HUP CN27151110 
RF FD 79 LE U5 TO L SU80no m / M or km / KM
TTP 39V Mea Ceder ge 8_1371m1371 

Accepted Solutions
Solution
‎07-18-2018 03:33 AM
Super User
Posts: 2,512

Re: Extracting Certain Information from a String

Like this?

 

data HAVE;
  input STR $80.;
  if prxmatch('/\d+k?m/i',STR) then
    STR1=prxchange('s/.*?(\d+k?m).*/$1/i',-1,STR);
  DIST=input(compress(STR1,,'dk'),8.)*ifn(find(STR1,'k','i'),1000,1);
cards;
RED 1500M STBL_2035 ZN042 MOND TER	1500	 
V 2km STL_28 ZN011 RT	2000	KM' or 'km' is 1000 time m (meters)
EEP 360M _47 Z012 BRAN	360	 
1400m 48047b TN0099 BR	1400	 
RET 117M CC S23_ITHACA S 1	117	 
ECOE 1711T DUP DE	0	no m / M or km / KM
RP HB/OB8 S4 P	0	no m / M or km / KM
TEP 200m 11P & 700m 5	200	Return the first m only
P 1110m HUP CN2715	1110	 
RF FD 79 LE U5 TO L SU8	0	no m / M or km / KM
TTP 39V Mea Ceder ge 8_1371m	1371	 
run;

 

STR STR1 DIST
RED 1500M STBL_2035 ZN042 MOND TER 1500 1500M 1500
V 2km STL_28 ZN011 RT 2000 KM' or 'km' is 1000 time m (meters) 2km 2000
EEP 360M _47 Z012 BRAN 360 360M 360
1400m 48047b TN0099 BR 1400 1400m 1400
RET 117M CC S23_ITHACA S 1 117 117M 117
ECOE 1711T DUP DE 0 no m / M or km / KM   .
RP HB/OB8 S4 P 0 no m / M or km / KM   .
TEP 200m 11P & 700m 5 200 Return the first m only 200m 200
P 1110m HUP CN2715 1110 1110m 1110
RF FD 79 LE U5 TO L SU8 0 no m / M or km / KM   .
TTP 39V Mea Ceder ge 8_1371m 1371 1371m 1371

View solution in original post


All Replies
Solution
‎07-18-2018 03:33 AM
Super User
Posts: 2,512

Re: Extracting Certain Information from a String

Like this?

 

data HAVE;
  input STR $80.;
  if prxmatch('/\d+k?m/i',STR) then
    STR1=prxchange('s/.*?(\d+k?m).*/$1/i',-1,STR);
  DIST=input(compress(STR1,,'dk'),8.)*ifn(find(STR1,'k','i'),1000,1);
cards;
RED 1500M STBL_2035 ZN042 MOND TER	1500	 
V 2km STL_28 ZN011 RT	2000	KM' or 'km' is 1000 time m (meters)
EEP 360M _47 Z012 BRAN	360	 
1400m 48047b TN0099 BR	1400	 
RET 117M CC S23_ITHACA S 1	117	 
ECOE 1711T DUP DE	0	no m / M or km / KM
RP HB/OB8 S4 P	0	no m / M or km / KM
TEP 200m 11P & 700m 5	200	Return the first m only
P 1110m HUP CN2715	1110	 
RF FD 79 LE U5 TO L SU8	0	no m / M or km / KM
TTP 39V Mea Ceder ge 8_1371m	1371	 
run;

 

STR STR1 DIST
RED 1500M STBL_2035 ZN042 MOND TER 1500 1500M 1500
V 2km STL_28 ZN011 RT 2000 KM' or 'km' is 1000 time m (meters) 2km 2000
EEP 360M _47 Z012 BRAN 360 360M 360
1400m 48047b TN0099 BR 1400 1400m 1400
RET 117M CC S23_ITHACA S 1 117 117M 117
ECOE 1711T DUP DE 0 no m / M or km / KM   .
RP HB/OB8 S4 P 0 no m / M or km / KM   .
TEP 200m 11P & 700m 5 200 Return the first m only 200m 200
P 1110m HUP CN2715 1110 1110m 1110
RF FD 79 LE U5 TO L SU8 0 no m / M or km / KM   .
TTP 39V Mea Ceder ge 8_1371m 1371 1371m 1371
Occasional Contributor
Posts: 17

Re: Extracting Certain Information from a String

That's awesome Chris. I had a number of string functions that were strarting to do my head in. Thanks for your help.
Occasional Contributor
Posts: 17

Re: Extracting Certain Information from a String

I haven't seen prxmatch function before. Cheers
Super User
Posts: 2,512

Re: Extracting Certain Information from a String

Regular expressions are off-putting at first, but they are a must if you manipulate strings.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 81 views
  • 3 likes
  • 2 in conversation