BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
keen_sas
Quartz | Level 8

Hi All, 

i am having a text embedded in square brackets with multiple delimiters in the text . Text in square brackets are considered as variable names. I want to extract these variable names and store in another variable. In some cases the square brackets itself  act as delimiter.I have tried this using SCAN and COUNTW, but countw is counting each '[' separately but not combinely as single '[]' into one,hence the count is becoming many and loop is executing multiple times. Any suggestions ,may be i am missing the basic logic here.


data have;
length have $100. want $50.;
 have='[ONE] "/" [TWO] "/(" [THREE] "-" [FOUR] ")\" [FIVE]';
 want="ONE,TWO,THREE,FOUR,FIVE";OUTPUT;
 have='[SIX] "(" [SEVEN] ")" [EIGHT] "-" [NINE] "--" ';
 want="SIX,SEVEN,EIGHT,NINE";OUTPUT;
 have='[ONE] "#" [TWO] "have" [THREE] "$" [NINE] "with" ';
 want="ONE,TWO,THREE,NINE";output;
 have=' "combine" [TWO] "with" [THREE] "[" [NINE] "]" ';
 want="TWO,THREE,NINE";OUTPUT;
run ;
1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

Another way:

  WANT=compress(transtrn(prxchange('s/( *|"[^"]*")//' ,-1,HAVE),'][',','),'[]');
  put WANT=;

WANT=ONE,TWO,THREE,FOUR,FIVE
WANT=SIX,SEVEN,EIGHT,NINE
WANT=ONE,TWO,THREE,NINE
WANT=TWO,THREE,NINE

 

View solution in original post

5 REPLIES 5
jimbarbour
Meteorite | Level 14

@keen_sas

 

That's a bit tricky, but it's just text, and it can be parsed.  I added a little sub-routine to your DATA step.  It puts out two lines for each "have" data that you start with.  The first of the two is "Want" -- this is what you want to end up with.  The second is "Got" which is what the sub-routine is actually producing.  Want and Got should be identical if the sub-routine is working properly.

 

Here's the code:

data have;
	DROP	have;
	LENGTH	Status	$6;
	length have $100. want $50.;

	have='[ONE] "/" [TWO] "/(" [THREE] "-" [FOUR] ")\" [FIVE]';
	want="ONE,TWO,THREE,FOUR,FIVE";	Status='Want'; OUTPUT;
	LINK	Transform_Data;

	have='[SIX] "(" [SEVEN] ")" [EIGHT] "-" [NINE] "--" ';
	want="SIX,SEVEN,EIGHT,NINE";Status='Want'; OUTPUT;
	LINK	Transform_Data;

	have='[ONE] "#" [TWO] "have" [THREE] "$" [NINE] "with" ';
	want="ONE,TWO,THREE,NINE";Status='Want'; output;
	LINK	Transform_Data;

	have=' "combine" [TWO] "with" [THREE] "[" [NINE] "]" ';
	want="TWO,THREE,NINE";Status='Want'; OUTPUT;
	LINK	Transform_Data;

	RETURN;

	Transform_Data:
		Want=COMPRESS(TRANWRD(TRANWRD(TRANWRD(TRANSLATE(TRANWRD(COMPRESS(TRANWRD(TRANWRD(TRANWRD(have, '/(', '/'),')\','\'),'--',''),']" '),'[[',','),',,,,,,,','/\-()#$'),'with',','),'have',','),'combine',''),'[ ');
		IF	SUBSTR(Want, LENGTH(Want), 1)	=	','	THEN
			WANT							=	SUBSTR(Want, 1, (LENGTH(Want) - 1));
		Status='Got';
		OUTPUT;
	RETURN;
run;

And below are the results.  The "Want" and the "Got" are in fact identical, so the sub-routine is confirmed to be working correctly.

Obs    Status    want

 1      Want     ONE,TWO,THREE,FOUR,FIVE
 2      Got      ONE,TWO,THREE,FOUR,FIVE
 3      Want     SIX,SEVEN,EIGHT,NINE   
 4      Got      SIX,SEVEN,EIGHT,NINE   
 5      Want     ONE,TWO,THREE,NINE     
 6      Got      ONE,TWO,THREE,NINE     
 7      Want     TWO,THREE,NINE         
 8      Got      TWO,THREE,NINE       

Hope that helps,

 

Jim

 

 

 

ChrisNZ
Tourmaline | Level 20

Another way:

  WANT=compress(transtrn(prxchange('s/( *|"[^"]*")//' ,-1,HAVE),'][',','),'[]');
  put WANT=;

WANT=ONE,TWO,THREE,FOUR,FIVE
WANT=SIX,SEVEN,EIGHT,NINE
WANT=ONE,TWO,THREE,NINE
WANT=TWO,THREE,NINE

 

ChrisNZ
Tourmaline | Level 20

A slightly better one-liner:

WANT=compress(transtrn(prxchange('s/.*?(\[\w*\])[^\[]*/\1/' ,-1,HAVE),'][',','),'[]');

ONE,TWO,THREE,FOUR,FIVE
SIX,SEVEN,EIGHT,NINE
ONE,TWO,THREE,NINE
TWO,THREE,NINE

Ksharp
Super User
data have;
length have $100.;
 have='[ONE] "/" [TWO] "/(" [THREE] "-" [FOUR] ")\" [FIVE]'; output;
 have='[SIX] "(" [SEVEN] ")" [EIGHT] "-" [NINE] "--" ';	output;
 have='[ONE] "#" [TWO] "have" [THREE] "$" [NINE] "with" '; output;
 have=' "combine" [TWO] "with" [THREE] "[" [NINE] "]" '; output;
run ;

data want;
 set have;
 length want $ 100;
 pid=prxparse('/\[\w+?\]/');
 s=1;e=length(have);
 call prxnext(pid,s,e,have,p,l);
 do while(p>0);
   want=catx(',',want,compress(substr(have,p,l),'[]'));
   call prxnext(pid,s,e,have,p,l);
 end;
drop pid s e p l;
run;
Patrick
Opal | Level 21

Here a variation to what @Ksharp proposed.

data have;
  length have $100.;
  have='[ONE] "/" [TWO] "/(" [THREE] "-" [FOUR] ")\" [FIVE]';
  output;
  have='[SIX] "(" [SEVEN] ")" [EIGHT] "-" [NINE] "--" ';
  output;
  have='[ONE] "#" [TWO] "have" [THREE] "$" [NINE] "with" ';
  output;
  have=' "combine" [TWO] "with" [THREE] "[" [NINE] "]" ';
  output;
  have='"[" [] "]"  "combine" [TWO] "with" [THREE] "[" [x] "]"  "[" [NINE] "]" ';
  output;
run;

data want(drop=_:);
  set have;
  length want $ 100;
  _pid=prxparse('/\[([^\[]*?)\]/');
  _start=1;
  _stop=length(have);
  do until(_pos<=0);
    call prxnext(_pid,_start,_stop,have,_pos,_len);
    want=catx(',',want,prxposn(_pid, 1, have));
  end;
run;

proc print data=want;
run;

Patrick_0-1596329470466.png

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1414 views
  • 2 likes
  • 5 in conversation