CynthiaWei
Obsidian | Level 7

Hi SAS Pro,

 

I want to extract "14 days", "10 days", or any other "<digits> days" text from sentences. The tricky part is that the substring can appear at any position in the sentence.

 

data have;
input X $80.;
datalines;
I spent 14 days in Boston. I love it.
Exercise twice a day for 10 days until May.
I left there. It was about 7 days ago.
;
run;

 

What I want is to have

1. a new character variable Y including 14 days, 10 days, and 7 days. 

2. a numeric variable Z with the values of 14, 10, and 7.

 

Any help would be highly appreciated!!

Best regards,

C

 

6 REPLIES
Astounding
Opal | Level 21

What result would you like for this input?

 

Exercise twice per day for 14 days, then reduce to once per day for 5 days.
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @CynthiaWei 

Updated solution

 

My previous attempt worked with your data, but this one is more robust:

 

data have;
input X $80.;
datalines;
I spent 14 days in Boston. I love it.
Exercise twice a day for 10 days until May.
I left there. It was about 7 days ago.
I had to wait 7 days
8 days is more than I care to wait
I have waited 12 days 3 times
;
run;

data want;
  set have;
  length Y $8;
  /* Lazy .*? lets \d+ capture the whole number; a greedy .* would leave only its last digit */
  y = prxchange('s/(.*?)(\d+\s)(days\s)(.*)/$2$3/',-1,X);
  Z = input(scan(Y,1,' '),8.);
run;
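
For readers unfamiliar with the PRX syntax: in the `s/pattern/replacement/` form, each pair of parentheses is a capture group, `\d+` means one or more digits, `\s` means one whitespace character, and `$2$3` in the replacement refers back to the second and third groups. The `-1` argument tells PRXCHANGE to replace as many times as it can. A minimal alternative sketch, assuming only the first occurrence per sentence is wanted, compiles the pattern with PRXPARSE and reads the match back with PRXPOSN instead of rewriting the whole string. The data set name `want_first`, the variable `re`, and the pattern `\b(\d+)\s+days\b` are illustrative choices, not part of the original answer.

```sas
data want_first;
  set have;
  length Y $12;
  retain re;
  /* Compile the pattern once, on the first iteration only */
  if _n_ = 1 then re = prxparse('/\b(\d+)\s+days\b/');
  /* PRXMATCH returns the position of the first match, or 0 if there is none */
  if prxmatch(re, X) > 0 then do;
    Y = prxposn(re, 0, X);             /* buffer 0: the whole match */
    Z = input(prxposn(re, 1, X), 8.);  /* buffer 1: the digits only */
  end;
  drop re;
run;
```

Rows without any match keep Y blank and Z missing, rather than receiving a copy of the original sentence.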
CynthiaWei
Obsidian | Level 7
Thank you so much for the code. Just out of curiosity, what do the expressions inside the PRX function mean?
CynthiaWei
Obsidian | Level 7
Thank you for asking this. What would the syntax be if (a) I only want the first match, or (b) I want every match of the target text?

Thanks a lot!
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @CynthiaWei 

 

SAS PRX functions are great for this kind of work.

 

data have;
input X $80.;
datalines;
I spent 14 days in Boston. I love it.
Exercise twice a day for 10 days until May.
I left there. It was about 7 days ago.
;
run;

data want;
  set have;
  length Y $8;
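  /* Group 2 captures from the number onward; LENGTH Y $8 then truncates the result to the number plus "days" */
  y = prxchange('s/(.*)(\s\d+\s\D+\s) (.*)/$2/',-1,X);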
  Z = input(scan(Y,1,' '),8.);
run;


PaigeMiller
Diamond | Level 26

You want to find the word "days" somewhere in this text string, and then extract the previous "word" which is probably going to be the number you want.

 

data want;
    set have;
    do i=1 to countw(x,' ');
        if scan(x,i,' ')='days' then do;
           number_of_days=input(scan(x,i-1,' '),4.); /* Input turns this number of days into a numeric value */
           output;
        end;
    end;
    drop i;
run; 
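
For sentences with more than one match, such as the "for 14 days, then reduce to once per day for 5 days" example raised earlier in the thread, a regex-based variant of the same one-row-per-match idea might look like the sketch below. It assumes CALL PRXNEXT is acceptable for the job. The names `want_all`, `re`, `start`, `stop`, `pos`, and `len`, and the pattern itself, are illustrative. Unlike the word-by-word comparison, the `\b` word boundary also accepts "days," and "days." with trailing punctuation.

```sas
data want_all;
  set have;
  length Y $12;
  retain re;
  /* Compile the pattern once, on the first iteration only */
  if _n_ = 1 then re = prxparse('/\b(\d+)\s+days\b/');
  start = 1;
  stop  = length(X);
  /* CALL PRXNEXT finds the next match and moves START past it */
  call prxnext(re, start, stop, X, pos, len);
  do while (pos > 0);
    Y = substr(X, pos, len);           /* whole match, e.g. "14 days" */
    Z = input(scan(Y, 1, ' '), 8.);    /* leading number as numeric */
    output;                            /* one row per match */
    call prxnext(re, start, stop, X, pos, len);
  end;
  drop re start stop pos len;
run;
```

As with the SCAN loop, sentences without a match produce no output row. Keeping only the first match per sentence would mean replacing the loop with a single call and a single OUTPUT.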

 

--
Paige Miller
