jsberger
Obsidian | Level 7

Hi All,

 

I have a somewhat large text field that I need to parse out by comma, but only when the comma is followed by a space AND an uppercase letter.  So in the following example, the single variable containing the following text:

 

"Laws, regulations and policies, Parental monitoring and supervision, Positive contributions to peer group"

 

Would result in 3 separate variables as:

Laws, regulations and policies

Parental monitoring and supervision

Positive contributions to peer group

 

I've parsed fields by commas before, and I suspect this has to involve ANYUPPER, but I'm unsure how to go about it. Here's how I've approached the simpler version:

 

**identify the maximum number of comma-separated values in the text field and store it in the max_elem macro variable;
proc sql noprint;
      select max(count(textvar,','))+1 into :max_elem
      from have;
quit;

data want;
      set have;
      **create one character variable per comma-separated value;
      array tsub_vars $ 50 targsub1-targsub%eval(&max_elem);
      do i = 1 to &max_elem;
           tsub_vars{i} = strip(scan(textvar,i,','));
      end;
run;

 

Any guidance is greatly appreciated!

 

J

5 REPLIES 5
ChrisNZ
Tourmaline | Level 20

Try this:

data T;
  STR="Laws, regulations and policies, Parental monitoring and supervision, Positive contributions to peer group";
  POS=prxmatch('/, [A-Z]/',STR);
run;

This looks for a comma, then a space, then an uppercase letter. PRXMATCH returns the position where the first such match starts, or 0 if there is none.
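
PRXMATCH only reports the first match. As a minimal sketch of how every delimiter position could be listed, the step below uses PRXPARSE with CALL PRXNEXT on the same sample string. The variable names (re, start, stop, pos, len) are illustrative, not from the thread.

```sas
data _null_;
  str = "Laws, regulations and policies, Parental monitoring and supervision, Positive contributions to peer group";
  re = prxparse('/, [A-Z]/');   /* compile the pattern once */
  start = 1;                    /* where the next search begins */
  stop = length(str);           /* where the search ends */
  call prxnext(re, start, stop, str, pos, len);
  do while (pos > 0);
    put pos=;                   /* writes pos=31, then pos=68, to the log */
    call prxnext(re, start, stop, str, pos, len);
  end;
run;
```

Each reported position is the comma itself, so the text before it is one value and the text two characters after it starts the next.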
 

jsberger
Obsidian | Level 7

Awesome ChrisNZ....thanks much!

Ksharp
Super User

Based on @ChrisNZ's idea.

 

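data x;
x="Laws, regulations and policies, Parental monitoring and supervision, Positive contributions to peer group";
do i=1 to 99;
 /* position of the next comma followed by whitespace and an uppercase letter; 0 if none */
 p=prxmatch('/,\s+[A-Z]/',x);
 if p=0 then temp=x;            /* no delimiter left: the remainder is the last value */
 else temp=substr(x,1,p-1);     /* text before the delimiter comma */
 output;
 if p=0 then leave;
 x=left(substr(x,p+1));         /* drop the value just written, the comma and leading blanks */
end;
run;
proc print noobs;run;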
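
If the values are wanted as separate variables on the same row (as in the original question) rather than as separate rows, one possible sketch is to replace each delimiter comma with a marker character and then reuse the SCAN approach. This assumes the pipe character never occurs in the data and that 500 characters is long enough for the marked copy. The `have` data set built here is a made-up sample; `textvar`, `max_elem` and `targsub` follow the names in the question.

```sas
/* Hypothetical sample data for illustration */
data have;
  length textvar $ 200;
  textvar = "Laws, regulations and policies, Parental monitoring and supervision, Positive contributions to peer group";
  output;
  textvar = "Single value, with a lowercase continuation";
  output;
run;

/* Find the largest number of values in any row */
data _null_;
  set have end=last;
  length marked $ 500;
  retain max_n 1;
  /* turn ", X" into "|X" wherever X is an uppercase letter */
  marked = prxchange('s/,\s+([A-Z])/|$1/', -1, textvar);
  max_n = max(max_n, countw(marked, '|'));
  if last then call symputx('max_elem', max_n);
run;

/* Split each row into targsub1-targsubN */
data want;
  set have;
  length marked $ 500;
  array tsub_vars {*} $ 50 targsub1-targsub&max_elem;
  marked = prxchange('s/,\s+([A-Z])/|$1/', -1, textvar);
  do i = 1 to countw(marked, '|');
    tsub_vars{i} = strip(scan(marked, i, '|'));
  end;
  drop i marked;
run;
```

With the sample above, the first row fills three targsub variables and the second row fills only targsub1, because its comma is followed by a lowercase letter.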
jsberger
Obsidian | Level 7

Hi Kevin,

 

This is even better... I was working through this when you posted. Thanks both!

 

Jason
