BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mariko5797
Pyrite | Level 9

I want to make the days listed in the "Comments" column as it's own column. I would prefer each day to be it's own row.

Example: If COMMENTS = "No assessment made on Day 16, Day 17, Day 19" then I would extract DAYS = 16, 17, 19

 

1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

Something like this may work:

data want;
  set have;
  prxid=prxparse('/\bDay\s+\d+\b/i');
  start=1;
  end=length(comments);
  do until(0);
    call prxnext(prxid,start,end,comments,pos,len);
    if pos=0 then leave;
    day=input(scan(substr(comments,pos,len),2),8.0);
    output;
    end;
  drop start end pos len prxid;
run;

The PRX expression looks for a word boundary (\b), followed by "Day" and 1 or more whitespaces (blanks or tab characters) (\s+), 1 or more digits (\d+) and a word boundary again. The "i" at the end means to look for e.g. both "day", "DAY" and "Day".

 

If you want the DAY as a character variable, just drop the INPUT function (and perhaps declare the DAY variable with e.g "length day $8;" as it will otherwise get the same length as the COMMENTS variable).

View solution in original post

3 REPLIES 3
PaigeMiller
Diamond | Level 26

@mariko5797 wrote:

I want to make the days listed in the "Comments" column as it's own column. I would prefer each day to be it's own row.

Example: If COMMENTS = "No assessment made on Day 16, Day 17, Day 19" then I would extract DAYS = 16, 17, 19

 


Do you mean:

Example: If COMMENTS = "No assessment made on Day 16, Day 17, Day 19" then I would extract DAYS = 16 on one row, DAYS = 17 on another row, and DAYS= 19 on another row

 

 

--
Paige Miller
tarheel13
Rhodochrosite | Level 12

so you want days to be a character variable with the days from comment variable? could you post some sample data as datelines and desired output? 

s_lassen
Meteorite | Level 14

Something like this may work:

data want;
  set have;
  prxid=prxparse('/\bDay\s+\d+\b/i');
  start=1;
  end=length(comments);
  do until(0);
    call prxnext(prxid,start,end,comments,pos,len);
    if pos=0 then leave;
    day=input(scan(substr(comments,pos,len),2),8.0);
    output;
    end;
  drop start end pos len prxid;
run;

The PRX expression looks for a word boundary (\b), followed by "Day" and 1 or more whitespaces (blanks or tab characters) (\s+), 1 or more digits (\d+) and a word boundary again. The "i" at the end means to look for e.g. both "day", "DAY" and "Day".

 

If you want the DAY as a character variable, just drop the INPUT function (and perhaps declare the DAY variable with e.g "length day $8;" as it will otherwise get the same length as the COMMENTS variable).

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1096 views
  • 0 likes
  • 4 in conversation