BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
maliksmom2000
Obsidian | Level 7

I am trying to extract a specific word from a character string, and it may be written differently.  Everything I need to extract contains or starts with "MSP" I need not only the work MSP but also the numbers following (there's over 100 number combinations and I do not have a master list):

 

Examples of text:

Valve, SQUARE, CAST IRON, 3 IN, THD, BLACK, MSP 54-37, FITTING (need to return "MSP 54-37)

Pipe, SQUARE HEAD, 3 IN, THD, 175 PSIG, BLACK, MSP54-37, FITTING (need to return "MSP 54-37)

PLUG, SQUARE HEAD, CAST IRON, 3 IN, THD, 175 PSIG, BLACK, MSP 54-37.1, FITTING (need to return "MSP 54-37.1)

PLUG, SQUARE HEAD, CAST IRON, 3 IN, STEEL, THD, 175 PSIG, BLACK, MSP 54-37.1, MSP 11-1.1, FITTING (need to return "MSP 54-37.1 & MSP 11-1.1 in separate columns)

 

does anyone know how to do this?  I am using Enterprise Guide but I know I can cut & paste code directly into the window as well.

 

Thank you!!!!

1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @maliksmom2000 

 

Here is another approach to achieve this:

data have;
infile datalines truncover;
input text $200.;
datalines;
Valve, SQUARE, CAST IRON, 3 IN, THD, BLACK, MSP 54-37, FITTING
Pipe, SQUARE HEAD, 3 IN, THD, 175 PSIG, BLACK, MSP54-37, FITTING
PLUG, SQUARE HEAD, CAST IRON, 3 IN, THD, 175 PSIG, BLACK, MSP 54-37.1, FITTING
PLUG, SQUARE HEAD, CAST IRON, 3 IN, STEEL, THD, 175 PSIG, BLACK, MSP 54-37.1, MSP 11-1.1, FITTING
;

proc sql noprint;
	select max(countw(text,',')), max(count(text,"MSP")) into:nb1,:nb2 from have;
quit;

data want;
	set have;
	array _MSP (&nb1) $ 200;
	array MSP (&nb2) $ 20;
	do i=1 to countw(text,',');
		if prxmatch('/MSP\s*\d+-*\d+\.*\d*/',scan(text,i,',')) then _MSP(i) = scan(text,i,',');
	end;
	MSP_extract = catx(',',of _MSP(*));
	do j=1 to count(text,"MSP");
		MSP(j) = scan(MSP_extract,j,',');
	end;
	drop i j MSP_extract _:;
run;

Capture d’écran 2020-02-25 à 12.01.59.png

 

View solution in original post

4 REPLIES 4
data_null__
Jade | Level 19

I need to understand your input data better.  Show it in the form of a data step using CARDS.

 

Also, show what you WANT.

PGStats
Opal | Level 21

Test this:

 

data have;
infile datalines truncover;
input line $200.;
datalines;
Valve, SQUARE, CAST IRON, 3 IN, THD, BLACK, MSP 54-37, FITTING
Pipe, SQUARE HEAD, 3 IN, THD, 175 PSIG, BLACK, MSP54-37, FITTING
PLUG, SQUARE HEAD, CAST IRON, 3 IN, THD, 175 PSIG, BLACK, MSP 54-37.1, FITTING
PLUG, SQUARE HEAD, CAST IRON, 3 IN, STEEL, THD, 175 PSIG, BLACK, MSP 54-37.1, MSP 11-1.1, FITTING
;

data want;
if not prxID then prxID + prxparse("/\b(\w*MSP\w*)\s*([-.0-9 ]+)/i");
set have;
length found $20;
start = 1;
stop = length(line);
call prxnext(prxID, start, stop, line, position, length);
do while (position > 0);
   found = catx(" ", prxposn(prxId, 1, line), prxposn(prxId, 2, line));
   output;
   call prxnext(prxID, start, stop, line, position, length);
   end;
drop prxID start stop position length;
run;

proc print data=want noobs; run;
line 	found
Valve, SQUARE, CAST IRON, 3 IN, THD, BLACK, MSP 54-37, FITTING 	MSP 54-37
Pipe, SQUARE HEAD, 3 IN, THD, 175 PSIG, BLACK, MSP54-37, FITTING 	MSP54 -37
PLUG, SQUARE HEAD, CAST IRON, 3 IN, THD, 175 PSIG, BLACK, MSP 54-37.1, FITTING 	MSP 54-37.1
PLUG, SQUARE HEAD, CAST IRON, 3 IN, STEEL, THD, 175 PSIG, BLACK, MSP 54-37.1, MSP 11-1.1, FITTING 	MSP 54-37.1
PLUG, SQUARE HEAD, CAST IRON, 3 IN, STEEL, THD, 175 PSIG, BLACK, MSP 54-37.1, MSP 11-1.1, FITTING 	MSP 11-1.1
PG
maliksmom2000
Obsidian | Level 7

Thank you so much!!!!

ed_sas_member
Meteorite | Level 14

Hi @maliksmom2000 

 

Here is another approach to achieve this:

data have;
infile datalines truncover;
input text $200.;
datalines;
Valve, SQUARE, CAST IRON, 3 IN, THD, BLACK, MSP 54-37, FITTING
Pipe, SQUARE HEAD, 3 IN, THD, 175 PSIG, BLACK, MSP54-37, FITTING
PLUG, SQUARE HEAD, CAST IRON, 3 IN, THD, 175 PSIG, BLACK, MSP 54-37.1, FITTING
PLUG, SQUARE HEAD, CAST IRON, 3 IN, STEEL, THD, 175 PSIG, BLACK, MSP 54-37.1, MSP 11-1.1, FITTING
;

proc sql noprint;
	select max(countw(text,',')), max(count(text,"MSP")) into:nb1,:nb2 from have;
quit;

data want;
	set have;
	array _MSP (&nb1) $ 200;
	array MSP (&nb2) $ 20;
	do i=1 to countw(text,',');
		if prxmatch('/MSP\s*\d+-*\d+\.*\d*/',scan(text,i,',')) then _MSP(i) = scan(text,i,',');
	end;
	MSP_extract = catx(',',of _MSP(*));
	do j=1 to count(text,"MSP");
		MSP(j) = scan(MSP_extract,j,',');
	end;
	drop i j MSP_extract _:;
run;

Capture d’écran 2020-02-25 à 12.01.59.png

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 841 views
  • 0 likes
  • 4 in conversation