CynthiaWei
Obsidian | Level 7

Hi SAS Pro,

 

I want to extract "14 days", "10 days", or any other "<digits> days" text from sentences. The tricky part is that the position of the substring varies from sentence to sentence.

 

data have;
input X $80.;
datalines;
I spent 14 days in Boston. I love it.
Exercise twice a day for 10 days until May.
I left there. It was about 7 days ago.
;
run;

 

What I want is to have

1. a new character variable Y containing the values 14 days, 10 days, and 7 days.

2. a numeric variable Z with the values of 14, 10, and 7.

 

Any help would be highly appreciated!!

Best regards,

C

 

6 REPLIES
Astounding
PROC Star

What result would you like for this input?

 

Exercise twice per day for 14 days, then reduce to once per day for 5 days.
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @CynthiaWei 

Updated solution

 

My previous attempt worked with your data, but this one is more robust:

 

data have;
input X $80.;
datalines;
I spent 14 days in Boston. I love it.
Exercise twice a day for 10 days until May.
I left there. It was about 7 days ago.
I had to wait 7 days
8 days is more than I care to wait
I have waited 12 days 3 times
;
run;

data want;
  set have;
  length Y $8;
  /* \b stops the greedy .* from swallowing leading digits (e.g. the 1 in 14) */
  y = prxchange('s/(.*\b)(\d+\s)(days\s)(.*)/$2$3/',-1,X);
  Z = input(scan(Y,1,' '),8.);
run;
CynthiaWei
Obsidian | Level 7
Thank you so much for the code. Just out of curiosity, what do those statements mean within the PRX function?
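
As a rough guide to reading that kind of call: the first argument to PRXCHANGE is a Perl-style substitution, s/pattern/replacement/. Each pair of parentheses in the pattern is a capture group, and $2$3 in the replacement keeps only the second and third groups. The sketch below annotates a close variant of the pattern; the \b (word boundary) and the data set name explain are assumptions added here for illustration, not part of the original reply.

```sas
data explain;
  set have;
  length Y $8;
  /* s/pattern/replacement/ : replace what the pattern matches           */
  /*   (.*)      group 1: any leading text (greedy, as much as possible) */
  /*   \b        word boundary, so the whole number is captured          */
  /*   (\d+\s)   group 2: one or more digits, then one whitespace char   */
  /*   (days\s)  group 3: the literal word "days", then whitespace       */
  /*   (.*)      group 4: the rest of the line                           */
  /* $2$3 keeps groups 2 and 3 only; -1 means "replace every match".     */
  Y = prxchange('s/(.*)\b(\d+\s)(days\s)(.*)/$2$3/', -1, X);
  /* SCAN takes the first blank-delimited word of Y; INPUT converts it.  */
  Z = input(scan(Y, 1, ' '), 8.);
run;
```

Note that when nothing matches, PRXCHANGE returns the source string unchanged, so Y would then hold the first 8 characters of X rather than a blank.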
CynthiaWei
Obsidian | Level 7
Thank you for asking this. What would the syntax be if (a) I only want the first match, and (b) I want every match of the target text?

Thanks a lot!
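
One possible way to handle both cases, sketched under the assumption that a match means "digits, whitespace, then the word days": PRXMATCH finds the first match, and CALL PRXNEXT can be looped to walk through every match. The data set names first_match and all_matches, the variables re, start, stop, pos and len, and the pattern itself are illustrative choices, not something given in the thread.

```sas
/* (a) first match only: one output row per input row */
data first_match;
  set have;
  length Y $12;
  retain re;
  if _n_ = 1 then re = prxparse('/\b(\d+)\s+days\b/');
  if prxmatch(re, X) then do;
    Y = prxposn(re, 0, X);             /* buffer 0 = the whole match */
    Z = input(prxposn(re, 1, X), 8.);  /* buffer 1 = the digits      */
  end;
  drop re;
run;

/* (b) every match: one output row per match found */
data all_matches;
  set have;
  length Y $12;
  retain re;
  if _n_ = 1 then re = prxparse('/\b(\d+)\s+days\b/');
  start = 1;
  stop = length(X);
  call prxnext(re, start, stop, X, pos, len);
  do while (pos > 0);
    Y = substr(X, pos, len);           /* text of the current match  */
    Z = input(scan(Y, 1, ' '), 8.);    /* leading number as numeric  */
    output;
    call prxnext(re, start, stop, X, pos, len);
  end;
  drop re start stop pos len;
run;
```

With an input such as "Exercise twice per day for 14 days, then reduce to once per day for 5 days.", the second step should write two rows, one per match, while the first step keeps only the leftmost one. Rows with no match keep missing Y and Z in first_match and are dropped from all_matches.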
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @CynthiaWei 

 

SAS PRX functions are great for this kind of work.

 

data have;
input X $80.;
datalines;
I spent 14 days in Boston. I love it.
Exercise twice a day for 10 days until May.
I left there. It was about 7 days ago.
;
run;

data want;
  set have;
  length Y $8;
  y = prxchange('s/(.*)(\s\d+\s\D+\s) (.*)/$2/',-1,X);
  Z = input(scan(Y,1,' '),8.);
run;


PaigeMiller
Diamond | Level 26

You want to find the word "days" somewhere in this text string, and then extract the previous "word" which is probably going to be the number you want.

 

data want;
    set have;
    do i=1 to countw(x,' ');
        if scan(x,i,' ')='days' then do;
           number_of_days=input(scan(x,i-1,' '),4.); /* Input turns this number of days into a numeric value */
           output;
        end;
    end;
    drop i;
run; 

 

--
Paige Miller
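
A possible refinement of the word-scanning approach, offered as a sketch rather than a tested drop-in: treating punctuation as delimiters lets "days," or "days." match, starting the loop at 2 avoids looking for a word before the first one, and the ?? modifier suppresses invalid-data notes when the preceding word is not a number. The data set name want_words and the delimiter list ' .,;' are assumptions; treating the period as a delimiter also means a value such as 1.5 would be split.

```sas
data want_words;
    set have;
    do i = 2 to countw(x, ' .,;');
        if lowcase(scan(x, i, ' .,;')) = 'days' then do;
            /* ?? keeps the log quiet if the previous word is not numeric */
            number_of_days = input(scan(x, i-1, ' .,;'), ?? 4.);
            if not missing(number_of_days) then output;
        end;
    end;
    drop i;
run;
```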
