CynthiaWei
Obsidian | Level 7

Hi SAS Pro,

 

I have a question: I want to extract "14 days", "10 days", or any other "<number> days" text from sentences. The tricky part is that the position of the substring varies from sentence to sentence.

 

data have;
input X $80.;
datalines;
I spent 14 days in Boston. I love it.
Exercise twice a day for 10 days until May.
I left there. It was about 7 days ago.
;
run;

 

What I want is to have

1. a new character variable Y containing 14 days, 10 days, and 7 days.

2. a numeric variable Z with the values of 14, 10, and 7.

 

Any help would be highly appreciated!!

Best regards,

C

 

6 REPLIES
Astounding
PROC Star

What result would you like for this input?

 

Exercise twice per day for 14 days, then reduce to once per day for 5 days.
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @CynthiaWei 

Updated solution

 

My previous attempt worked with your data; this one is more robust:

 

data have;
input X $80.;
datalines;
I spent 14 days in Boston. I love it.
Exercise twice a day for 10 days until May.
I left there. It was about 7 days ago.
I had to wait 7 days
8 days is more than I care to wait
I have waited 12 days 3 times
;
run;

data want;
  set have;
  length Y $8;
  /* Lazy .*? plus \b keeps multi-digit numbers whole; a greedy .* would leave only the last digit */
  y = prxchange('s/^.*?\b(\d+)\s+(days)\b.*$/$1 $2/',1,X);
  Z = input(scan(Y,1,' '),8.);
run;
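
For anyone wondering what the pieces of the pattern do, here is a minimal sketch of the same idea written with PRXPARSE, PRXMATCH, and PRXPOSN instead of PRXCHANGE. It assumes the HAVE data set above; the data set name want_first and the variable re are just names picked for illustration, and the match is case-sensitive.

```sas
data want_first;
  set have;
  length Y $12;
  retain re;
  /* Compile the pattern once, on the first iteration.
     \b      word boundary, so "14" is not split into "1" and "4"
     (\d+)   capture group 1: one or more digits
     \s+     one or more blanks
     days\b  the literal word "days", ending at a word boundary */
  if _n_ = 1 then re = prxparse('/\b(\d+)\s+days\b/');
  if prxmatch(re, X) then do;
    Y = prxposn(re, 0, X);              /* buffer 0 = the whole match, e.g. "14 days" */
    Z = input(prxposn(re, 1, X), 8.);   /* buffer 1 = the digits, read as numeric */
  end;
  /* Rows without a match keep Y blank and Z missing. */
  drop re;
run;
```

PRXMATCH stops at the first match in the string, so this version only ever returns the first "N days" phrase of each sentence.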
CynthiaWei
Obsidian | Level 7
Thank you so much for the code. Just out of curiosity, what do those statements mean within the PRX function?
CynthiaWei
Obsidian | Level 7
Thank you for asking this. What would the syntax be if (a) I only want the first match, and (b) I want every piece of text that matches the target pattern?

Thanks a lot!
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @CynthiaWei 

 

SAS PRX functions are great for this kind of work.

 

data have;
input X $80.;
datalines;
I spent 14 days in Boston. I love it.
Exercise twice a day for 10 days until May.
I left there. It was about 7 days ago.
;
run;

data want;
  set have;
  length Y $8;
  y = prxchange('s/(.*)(\s\d+\s\D+\s) (.*)/$2/',-1,X);
  Z = input(scan(Y,1,' '),8.);
run;
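
If a sentence can hold more than one "N days" phrase, as in the example raised earlier in the thread, one possible approach is a CALL PRXNEXT loop that writes one row per match. This is only a sketch under the same assumptions as the HAVE data set above; want_all, re, start, stop, pos, and len are names chosen for illustration.

```sas
data want_all;
  set have;
  length Y $12;
  retain re;
  if _n_ = 1 then re = prxparse('/\b(\d+)\s+days\b/');
  start = 1;
  stop  = length(X);
  /* PRXNEXT updates start, pos, and len on every call;
     pos is 0 once no further match is found. */
  call prxnext(re, start, stop, X, pos, len);
  do while (pos > 0);
    Y = substr(X, pos, len);            /* matched text, e.g. "14 days" */
    Z = input(scan(Y, 1, ' '), 8.);     /* leading number as numeric */
    output;                             /* one row per match */
    call prxnext(re, start, stop, X, pos, len);
  end;
  drop re start stop pos len;
run;
```

Because of the explicit OUTPUT, sentences with no match produce no row at all, and a sentence with two phrases produces two rows.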


PaigeMiller
Diamond | Level 26

You want to find the word "days" somewhere in this text string, and then extract the previous "word" which is probably going to be the number you want.

 

data want;
    set have;
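    do i=2 to countw(x,' '); /* Start at 2 so there is always a previous word to read */
        if scan(x,i,' ')='days' then do;
           number_of_days=input(scan(x,i-1,' '),?? 4.); /* Input turns this number of days into a numeric value; ?? suppresses the error when the previous word is not a number */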
           output;
        end;
    end;
    drop i;
run; 

 

--
Paige Miller
