BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
keen_sas
Quartz | Level 8

Hi All, 

i am having a text embedded in square brackets with multiple delimiters in the text . Text in square brackets are considered as variable names. I want to extract these variable names and store in another variable. In some cases the square brackets itself  act as delimiter.I have tried this using SCAN and COUNTW, but countw is counting each '[' separately but not combinely as single '[]' into one,hence the count is becoming many and loop is executing multiple times. Any suggestions ,may be i am missing the basic logic here.


data have;
length have $100. want $50.;
 have='[ONE] "/" [TWO] "/(" [THREE] "-" [FOUR] ")\" [FIVE]';
 want="ONE,TWO,THREE,FOUR,FIVE";OUTPUT;
 have='[SIX] "(" [SEVEN] ")" [EIGHT] "-" [NINE] "--" ';
 want="SIX,SEVEN,EIGHT,NINE";OUTPUT;
 have='[ONE] "#" [TWO] "have" [THREE] "$" [NINE] "with" ';
 want="ONE,TWO,THREE,NINE";output;
 have=' "combine" [TWO] "with" [THREE] "[" [NINE] "]" ';
 want="TWO,THREE,NINE";OUTPUT;
run ;
1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

Another way:

  WANT=compress(transtrn(prxchange('s/( *|"[^"]*")//' ,-1,HAVE),'][',','),'[]');
  put WANT=;

WANT=ONE,TWO,THREE,FOUR,FIVE
WANT=SIX,SEVEN,EIGHT,NINE
WANT=ONE,TWO,THREE,NINE
WANT=TWO,THREE,NINE

 

View solution in original post

5 REPLIES 5
jimbarbour
Meteorite | Level 14

@keen_sas

 

That's a bit tricky, but it's just text, and it can be parsed.  I added a little sub-routine to your DATA step.  It puts out two lines for each "have" data that you start with.  The first of the two is "Want" -- this is what you want to end up with.  The second is "Got" which is what the sub-routine is actually producing.  Want and Got should be identical if the sub-routine is working properly.

 

Here's the code:

data have;
	DROP	have;
	LENGTH	Status	$6;
	length have $100. want $50.;

	have='[ONE] "/" [TWO] "/(" [THREE] "-" [FOUR] ")\" [FIVE]';
	want="ONE,TWO,THREE,FOUR,FIVE";	Status='Want'; OUTPUT;
	LINK	Transform_Data;

	have='[SIX] "(" [SEVEN] ")" [EIGHT] "-" [NINE] "--" ';
	want="SIX,SEVEN,EIGHT,NINE";Status='Want'; OUTPUT;
	LINK	Transform_Data;

	have='[ONE] "#" [TWO] "have" [THREE] "$" [NINE] "with" ';
	want="ONE,TWO,THREE,NINE";Status='Want'; output;
	LINK	Transform_Data;

	have=' "combine" [TWO] "with" [THREE] "[" [NINE] "]" ';
	want="TWO,THREE,NINE";Status='Want'; OUTPUT;
	LINK	Transform_Data;

	RETURN;

	Transform_Data:
		Want=COMPRESS(TRANWRD(TRANWRD(TRANWRD(TRANSLATE(TRANWRD(COMPRESS(TRANWRD(TRANWRD(TRANWRD(have, '/(', '/'),')\','\'),'--',''),']" '),'[[',','),',,,,,,,','/\-()#$'),'with',','),'have',','),'combine',''),'[ ');
		IF	SUBSTR(Want, LENGTH(Want), 1)	=	','	THEN
			WANT							=	SUBSTR(Want, 1, (LENGTH(Want) - 1));
		Status='Got';
		OUTPUT;
	RETURN;
run;

And below are the results.  The "Want" and the "Got" are in fact identical, so the sub-routine is confirmed to be working correctly.

Obs    Status    want

 1      Want     ONE,TWO,THREE,FOUR,FIVE
 2      Got      ONE,TWO,THREE,FOUR,FIVE
 3      Want     SIX,SEVEN,EIGHT,NINE   
 4      Got      SIX,SEVEN,EIGHT,NINE   
 5      Want     ONE,TWO,THREE,NINE     
 6      Got      ONE,TWO,THREE,NINE     
 7      Want     TWO,THREE,NINE         
 8      Got      TWO,THREE,NINE       

Hope that helps,

 

Jim

 

 

 

ChrisNZ
Tourmaline | Level 20

Another way:

  WANT=compress(transtrn(prxchange('s/( *|"[^"]*")//' ,-1,HAVE),'][',','),'[]');
  put WANT=;

WANT=ONE,TWO,THREE,FOUR,FIVE
WANT=SIX,SEVEN,EIGHT,NINE
WANT=ONE,TWO,THREE,NINE
WANT=TWO,THREE,NINE

 

ChrisNZ
Tourmaline | Level 20

A slightly better one-liner:

WANT=compress(transtrn(prxchange('s/.*?(\[\w*\])[^\[]*/\1/' ,-1,HAVE),'][',','),'[]');

ONE,TWO,THREE,FOUR,FIVE
SIX,SEVEN,EIGHT,NINE
ONE,TWO,THREE,NINE
TWO,THREE,NINE

Ksharp
Super User
data have;
length have $100.;
 have='[ONE] "/" [TWO] "/(" [THREE] "-" [FOUR] ")\" [FIVE]'; output;
 have='[SIX] "(" [SEVEN] ")" [EIGHT] "-" [NINE] "--" ';	output;
 have='[ONE] "#" [TWO] "have" [THREE] "$" [NINE] "with" '; output;
 have=' "combine" [TWO] "with" [THREE] "[" [NINE] "]" '; output;
run ;

data want;
 set have;
 length want $ 100;
 pid=prxparse('/\[\w+?\]/');
 s=1;e=length(have);
 call prxnext(pid,s,e,have,p,l);
 do while(p>0);
   want=catx(',',want,compress(substr(have,p,l),'[]'));
   call prxnext(pid,s,e,have,p,l);
 end;
drop pid s e p l;
run;
Patrick
Opal | Level 21

Here a variation to what @Ksharp proposed.

data have;
  length have $100.;
  have='[ONE] "/" [TWO] "/(" [THREE] "-" [FOUR] ")\" [FIVE]';
  output;
  have='[SIX] "(" [SEVEN] ")" [EIGHT] "-" [NINE] "--" ';
  output;
  have='[ONE] "#" [TWO] "have" [THREE] "$" [NINE] "with" ';
  output;
  have=' "combine" [TWO] "with" [THREE] "[" [NINE] "]" ';
  output;
  have='"[" [] "]"  "combine" [TWO] "with" [THREE] "[" [x] "]"  "[" [NINE] "]" ';
  output;
run;

data want(drop=_:);
  set have;
  length want $ 100;
  _pid=prxparse('/\[([^\[]*?)\]/');
  _start=1;
  _stop=length(have);
  do until(_pos<=0);
    call prxnext(_pid,_start,_stop,have,_pos,_len);
    want=catx(',',want,prxposn(_pid, 1, have));
  end;
run;

proc print data=want;
run;

Patrick_0-1596329470466.png

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1270 views
  • 2 likes
  • 5 in conversation