mozty
Calcite | Level 5

Dear community,

 

I am trying to extract a specific group of words separated by dots.

The first word is always LIB, followed by a dataset name and a variable name. LIB can appear several times in the same description, but only one occurrence has both a dataset name and a variable name attached.

 

I've tried using PRX functions but without success.

 

data have;
length text $200;
infile datalines dsd;
input text;
datalines;
Random text LIB.XX_YYY.ZZZZZZZZ random text LIB.XX_YYY random text.
Random text LIB.AA random text LIB.AA.BB_BBB
LIB.XX3_X.YZYZYZ3
Random text LIB.AA1: random text LIB.AA1.ABCD, random text LIB. 
;
run;

So from the above dataset I would like to keep in a variable 'LIB.XX_YYY.ZZZZZZZZ' for the first row, 'LIB.AA.BB_BBB' for the second row, and so on.

 

Thank you.

1 ACCEPTED SOLUTION

PGStats
Opal | Level 21

Get possibly multiple matches per string with prxNext:

 

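data want;
/* Compile the pattern once; the sum statement retains prxId across rows */
if not prxId then prxId + prxParse("/LIB\.\w+\.\w+/");
set have;
start = 1; stop = length(text);
/* Find the first match, then keep scanning until prxNext reports none */
call prxnext(prxId, start, stop, text, position, length);
do while (position > 0);
   word = substr(text, position, length);
   output; /* one output row per match */
   call prxnext(prxId, start, stop, text, position, length);
end;
drop prxId start stop position length;
run;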

Add if position=0 then output; after the first call to prxNext if you also want an output row (with word left blank) for input rows that contain no match.
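As a minimal sketch of that variant, assuming the same have dataset as in the question: the extra output statement sits between the first call to prxNext and the loop. The dataset name want_all and the explicit length statement for word are illustrative additions, not part of the code above.

```sas
data want_all;
   /* compile the pattern once; the sum statement retains prxId */
   if not prxId then prxId + prxParse("/LIB\.\w+\.\w+/");
   set have;
   length word $200;   /* declared up front so it exists on no-match rows */
   start = 1; stop = length(text);
   call prxnext(prxId, start, stop, text, position, length);
   /* no match at all in this row: still write one row, word stays blank */
   if position = 0 then output;
   do while (position > 0);
      word = substr(text, position, length);
      output;
      call prxnext(prxId, start, stop, text, position, length);
   end;
   drop prxId start stop position length;
run;
```

Rows with at least one match are unaffected, because position is already greater than 0 when the extra statement is reached.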

PG

3 REPLIES
ballardw
Super User

Please show what the exact desired output for the given data would be.

 

And you should provide a general rule. We cannot tell what the "etc" results would be because there is no general rule to apply.

mozty
Calcite | Level 5

The general rule is: on each row, locate the group of three words that starts with LIB and uses '.' as the separator, and keep only that group in a new variable.

 

My desired output is:

keep
LIB.XX_YYY.ZZZZZZZZ
LIB.AA.BB_BBB
LIB.XX3_X.YZYZYZ3
LIB.AA1.ABCD
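
If each row really contains exactly one full three-part reference, a single call to prxSubstr per row may be enough to get one output row per input row. This is only a sketch under that assumption, using the have dataset from the question; the names want_single, lib_ref, pos and len are illustrative.

```sas
data want_single;
   /* compile the pattern on the first row only and retain its id */
   retain prxId;
   if _n_ = 1 then prxId = prxparse("/LIB\.\w+\.\w+/");
   set have;
   length lib_ref $200;   /* illustrative name for the extracted reference */
   /* pos = start of the first match (0 if none), len = its length */
   call prxsubstr(prxId, text, pos, len);
   if pos > 0 then lib_ref = substr(text, pos, len);
   drop prxId pos len;
run;
```

For the four sample rows this should leave the values listed above in lib_ref. A row without a full match keeps lib_ref blank, and a row with several full matches keeps only the first one.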