Jedrek369
Fluorite | Level 6

Hello!

I would like to derive "1514#0_2021Q3_2022Q1$" substring from the following string:

 

1514#0 Deep Dive$   1514#0_2021Q3_2022Q1$   1515#0 Deep Dive$   1515#0_2021Q1_2022Q1$   XYZ$  Dictionary$

The following conditions need to be met: 1) the substring contains at least one letter "Q", 2) the substrings in the list can appear in any order.

 

I have used the following pattern, but it fell short:

 

/1514.*?Q.*?\$?/i

 

Please see the code for your reference:

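%macro prxsubstr(pattern, string);
	/* Compile the pattern; PRXPARSE returns a numeric regex ID */
	%let regex_id = %sysfunc(prxparse(&pattern.));
	%let position = 0;
	%let length = 0;
	/* %SYSCALL takes macro variable names without ampersands and writes the results back to position and length */
	%syscall prxsubstr(regex_id, string, position, length);
	%syscall prxfree(regex_id);
	%global substring;
	%let substring = ;
	/* position stays 0 when nothing matched, so guard the %SUBSTR call */
	%if &position. > 0 %then %let substring = %substr(&string., &position., &length.);
	%put &substring.;
%mend;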

%let pattern = /1514.*?Q.*?\$?/i;
%let list = 1514#0 Deep Dive$   1514#0_2021Q3_2022Q1$   1515#0 Deep Dive$   1515#0_2021Q1_2022Q1$   ARS_CRS KPI Results$  Dictionary$;

%prxsubstr(&pattern., &list.);

Any help will be appreciated.

 

Accepted Solution
FreelanceReinh
Jade | Level 19

Hello @Jedrek369,

 

Is the "specific character" mentioned in the subject line the dollar sign? If so, why do you make it optional in your pattern?


@Jedrek369 wrote:
/1514.*?Q.*?\$?/i

Perhaps because a missing dollar sign would be acceptable at the end of the list? In this case I would put

(\$|$)

at the end of the pattern (where the first "$" is the literal dollar sign and the second is the end-of-string anchor).
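
As a minimal sketch of that ending, the DATA step below applies your original pattern with (\$|$) appended to a shortened, hypothetical list whose last entry has lost its dollar sign. The variable names (list, match, rx, pos, len) are only illustrative.

```sas
data _null_;
   length list $200 match $60;
   /* Hypothetical list: the last entry has no closing dollar sign */
   list = '1515#0 Deep Dive$   1514#0_2021Q3_2022Q1';
   /* Original pattern, ending in a dollar sign OR the end of the string */
   rx = prxparse('/1514.*?Q.*?(\$|$)/i');
   /* TRIM drops the padding blanks so that the end-of-string test
      applies right after the last entry */
   call prxsubstr(rx, trim(list), pos, len);
   if pos > 0 then match = substr(list, pos, len);
   put match=;   /* should print match=1514#0_2021Q3_2022Q1 */
run;
```

This only covers the missing dollar sign. On the full list from the question the leading .*? can still run across an earlier "$", which is what the next suggestion addresses.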

 

I think it would help to replace the periods (.) in your pattern with "negative character sets" (see Class Groupings) excluding "Q" or "$" or both:

/1514[^Q\$]*Q[^\$]*\$/i

(Also, would a leading word boundary (\b) help or would "161514..." be fine?)
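
For a quick check outside the macro, here is a small DATA step sketch that runs the suggested pattern against the list from the question and then tries the word-boundary variant on a made-up entry ("161514#0_2021Q3$" is purely hypothetical).

```sas
data _null_;
   length list $200 match $60;
   /* List from the question */
   list = '1514#0 Deep Dive$   1514#0_2021Q3_2022Q1$   1515#0 Deep Dive$   1515#0_2021Q1_2022Q1$   XYZ$  Dictionary$';
   rx = prxparse('/1514[^Q\$]*Q[^\$]*\$/i');
   call prxsubstr(rx, list, pos, len);
   if pos > 0 then match = substr(list, pos, len);
   put match=;   /* should print match=1514#0_2021Q3_2022Q1$ */

   /* Hypothetical entry in which "1514" is only the tail of a longer number */
   rx_bound = prxparse('/\b1514[^Q\$]*Q[^\$]*\$/i');
   plain = prxmatch(rx, '161514#0_2021Q3$');         /* 3: matches inside "161514" */
   bound = prxmatch(rx_bound, '161514#0_2021Q3$');   /* 0: no word boundary before "1514" */
   put plain= bound=;
run;
```

Whether the \b is wanted depends on whether numbers such as "161514" can occur in your lists at all.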


Jedrek369
Fluorite | Level 6

Thank you! The pattern "/1514[^Q\$]*Q[^\$]*\$/i" worked.

 

Would you mind explaining why replacing the dots with "[^Q\$]" and "[^\$]" worked?

 

Sorry for the late response. I had so many work assignments I did not find the time for the code.

FreelanceReinh
Jade | Level 19

You're welcome. Glad to hear that it worked.

 

There are three fixed parts in your pattern: the "1514", at least one "Q", and the "$" sign, in this order, with the dollar sign ending the pattern. Between the "1514" and the first "Q" almost any characters are allowed. A "Q" is logically impossible there, because it would then itself be the first "Q". A "$" sign is not acceptable either, as it would mark the end of an entry that contains no "Q". This explains why "Q" and "$" had to be excluded in that place. The greedy "*" repetition factor could be used with the [^Q\$] class because the search would stop at the "Q" (but not earlier) anyway.
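
To see this for the first part of the pattern in isolation, the following sketch compares the lazy dot with the negated class between "1514" and the first "Q". The list is a shortened version of yours and the variable names are only illustrative.

```sas
data _null_;
   length list $200 pattern $40 match $60;
   list = '1514#0 Deep Dive$   1514#0_2021Q3_2022Q1$   1515#0 Deep Dive$';
   /* First the lazy dot, then the class that excludes Q and the dollar sign */
   do pattern = '/1514.*?Q/i', '/1514[^Q\$]*Q/i';
      /* STRIP removes the blanks that pad the pattern variable */
      rx = prxparse(strip(pattern));
      call prxsubstr(rx, list, pos, len);
      if pos > 0 then match = substr(list, pos, len);
      else match = ' ';
      put pattern= / match=;
      call prxfree(rx);
   end;
run;
```

The first pattern should return "1514#0 Deep Dive$   1514#0_2021Q", because the dot is free to run across the "$" of the first entry. The second should return only "1514#0_2021Q".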

 

Between the first "Q" and the closing "$", again, almost any characters are allowed (this time including "Q"), but logically not a (prematurely closing) "$" sign. This explains why I used [^\$] there. As above, the greedy "*" repetition factor is possible because of the explicit "$" closing the pattern. Alternatively, you could leave your original .*? in that place: as long as the closing "$" is no longer optional, the lazy "*?" repetition factor would prevent the pattern from including another "$" sign.
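
The same kind of sketch for the second part of the pattern, this time on the full list from the question. It compares three endings: the greedy negated class, the lazy dot with an explicit dollar sign, and the lazy dot with the optional dollar sign from your original pattern. Again, the variable names are only illustrative.

```sas
data _null_;
   length list $200 pattern $40 match $60;
   list = '1514#0 Deep Dive$   1514#0_2021Q3_2022Q1$   1515#0 Deep Dive$   1515#0_2021Q1_2022Q1$   XYZ$  Dictionary$';
   /* 1: greedy negated class, 2: lazy dot, 3: lazy dot with optional dollar sign */
   do pattern = '/1514[^Q\$]*Q[^\$]*\$/i',
                '/1514[^Q\$]*Q.*?\$/i',
                '/1514[^Q\$]*Q.*?\$?/i';
      /* STRIP removes the blanks that pad the pattern variable */
      rx = prxparse(strip(pattern));
      call prxsubstr(rx, list, pos, len);
      if pos > 0 then match = substr(list, pos, len);
      else match = ' ';
      put pattern= / match=;
      call prxfree(rx);
   end;
run;
```

The first two patterns should both return "1514#0_2021Q3_2022Q1$". The third should stop at "1514#0_2021Q", because a lazy .*? followed by an optional "$" is satisfied by matching nothing at all.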
