I am trying to extract a specific word from a character string, and it may be written differently. Everything I need to extract contains or starts with "MSP" I need not only the work MSP but also the numbers following (there's over 100 number combinations and I do not have a master list):
Examples of text:
Valve, SQUARE, CAST IRON, 3 IN, THD, BLACK, MSP 54-37, FITTING (need to return "MSP 54-37)
Pipe, SQUARE HEAD, 3 IN, THD, 175 PSIG, BLACK, MSP54-37, FITTING (need to return "MSP 54-37)
PLUG, SQUARE HEAD, CAST IRON, 3 IN, THD, 175 PSIG, BLACK, MSP 54-37.1, FITTING (need to return "MSP 54-37.1)
PLUG, SQUARE HEAD, CAST IRON, 3 IN, STEEL, THD, 175 PSIG, BLACK, MSP 54-37.1, MSP 11-1.1, FITTING (need to return "MSP 54-37.1 & MSP 11-1.1 in separate columns)
does anyone know how to do this? I am using Enterprise Guide but I know I can cut & paste code directly into the window as well.
Thank you!!!!
Here is another approach to achieve this:
data have;
infile datalines truncover;
input text $200.;
datalines;
Valve, SQUARE, CAST IRON, 3 IN, THD, BLACK, MSP 54-37, FITTING
Pipe, SQUARE HEAD, 3 IN, THD, 175 PSIG, BLACK, MSP54-37, FITTING
PLUG, SQUARE HEAD, CAST IRON, 3 IN, THD, 175 PSIG, BLACK, MSP 54-37.1, FITTING
PLUG, SQUARE HEAD, CAST IRON, 3 IN, STEEL, THD, 175 PSIG, BLACK, MSP 54-37.1, MSP 11-1.1, FITTING
;
proc sql noprint;
select max(countw(text,',')), max(count(text,"MSP")) into:nb1,:nb2 from have;
quit;
data want;
set have;
array _MSP (&nb1) $ 200;
array MSP (&nb2) $ 20;
do i=1 to countw(text,',');
if prxmatch('/MSP\s*\d+-*\d+\.*\d*/',scan(text,i,',')) then _MSP(i) = scan(text,i,',');
end;
MSP_extract = catx(',',of _MSP(*));
do j=1 to count(text,"MSP");
MSP(j) = scan(MSP_extract,j,',');
end;
drop i j MSP_extract _:;
run;
I need to understand your input data better. Show it in the form of a data step using CARDS.
Also, show what you WANT.
Test this:
data have;
infile datalines truncover;
input line $200.;
datalines;
Valve, SQUARE, CAST IRON, 3 IN, THD, BLACK, MSP 54-37, FITTING
Pipe, SQUARE HEAD, 3 IN, THD, 175 PSIG, BLACK, MSP54-37, FITTING
PLUG, SQUARE HEAD, CAST IRON, 3 IN, THD, 175 PSIG, BLACK, MSP 54-37.1, FITTING
PLUG, SQUARE HEAD, CAST IRON, 3 IN, STEEL, THD, 175 PSIG, BLACK, MSP 54-37.1, MSP 11-1.1, FITTING
;
data want;
if not prxID then prxID + prxparse("/\b(\w*MSP\w*)\s*([-.0-9 ]+)/i");
set have;
length found $20;
start = 1;
stop = length(line);
call prxnext(prxID, start, stop, line, position, length);
do while (position > 0);
found = catx(" ", prxposn(prxId, 1, line), prxposn(prxId, 2, line));
output;
call prxnext(prxID, start, stop, line, position, length);
end;
drop prxID start stop position length;
run;
proc print data=want noobs; run;
line found Valve, SQUARE, CAST IRON, 3 IN, THD, BLACK, MSP 54-37, FITTING MSP 54-37 Pipe, SQUARE HEAD, 3 IN, THD, 175 PSIG, BLACK, MSP54-37, FITTING MSP54 -37 PLUG, SQUARE HEAD, CAST IRON, 3 IN, THD, 175 PSIG, BLACK, MSP 54-37.1, FITTING MSP 54-37.1 PLUG, SQUARE HEAD, CAST IRON, 3 IN, STEEL, THD, 175 PSIG, BLACK, MSP 54-37.1, MSP 11-1.1, FITTING MSP 54-37.1 PLUG, SQUARE HEAD, CAST IRON, 3 IN, STEEL, THD, 175 PSIG, BLACK, MSP 54-37.1, MSP 11-1.1, FITTING MSP 11-1.1
Thank you so much!!!!
Here is another approach to achieve this:
data have;
infile datalines truncover;
input text $200.;
datalines;
Valve, SQUARE, CAST IRON, 3 IN, THD, BLACK, MSP 54-37, FITTING
Pipe, SQUARE HEAD, 3 IN, THD, 175 PSIG, BLACK, MSP54-37, FITTING
PLUG, SQUARE HEAD, CAST IRON, 3 IN, THD, 175 PSIG, BLACK, MSP 54-37.1, FITTING
PLUG, SQUARE HEAD, CAST IRON, 3 IN, STEEL, THD, 175 PSIG, BLACK, MSP 54-37.1, MSP 11-1.1, FITTING
;
proc sql noprint;
select max(countw(text,',')), max(count(text,"MSP")) into:nb1,:nb2 from have;
quit;
data want;
set have;
array _MSP (&nb1) $ 200;
array MSP (&nb2) $ 20;
do i=1 to countw(text,',');
if prxmatch('/MSP\s*\d+-*\d+\.*\d*/',scan(text,i,',')) then _MSP(i) = scan(text,i,',');
end;
MSP_extract = catx(',',of _MSP(*));
do j=1 to count(text,"MSP");
MSP(j) = scan(MSP_extract,j,',');
end;
drop i j MSP_extract _:;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.