SarahW13
Obsidian | Level 7

I have a dataset in which each row is a long block of text, and I need to extract specific information from each row.

 

Below is an example:

 

id       Text

1        Physical Exam. Vital Signs. BP: 134/93

2        Patient's physical exam was notable for BP of 142/100

3        Physical Exam. BP: 100/80

 

Below is what I need:

 

id       Text

1        BP: 134/93

2        BP of 142/100

3        BP: 100/80

 

Any advice?

7 REPLIES
PeterClemmensen
Tourmaline | Level 20

Do something like this:

 

data have;
infile datalines dlm=',';
input id Text :$200.;
datalines;
1,Physical Exam. Vital Signs. BP: 134/93
2,Patient's physical exam was notable for BP of 142/100
3,Physical Exam. BP: 100/80
;

data want;
    set have;
    NewString=substr(Text, index(Text, 'BP'));
run;
alexpat
Fluorite | Level 6

data cardiac;
infile datalines dlm=',';
input id Text :$200.;
datalines;
1,Physical Exam. Vital Signs. BP: 134/93
2,Patient's physical exam was notable for BP of 142/100
3,Physical Exam. BP: 100/80
;

data strepto;
set cardiac;
NewString=substr(Text, index(Text, 'BP'));
run;

Jagadishkatam
Amethyst | Level 16

Alternatively, with the PRXCHANGE function:

 

data have;
input id Text&$100.;
text=prxchange('s/(.*)(bp.*)/$2/i',-1,text);
datalines4;
1 Physical Exam. Vital Signs. BP: 134/93
2 Patient's physical exam was notable for BP of 142/100
3 Physical Exam. BP: 100/80
;;;;
Thanks,
Jag
alexpat
Fluorite | Level 6
id   Text                                                     NewString
1    Physical Exam. Vital Signs. BP: 134/93                   BP: 134/93
2    Patient's physical exam was notable for BP of 142/100    BP of 142/100
3    Physical Exam. BP: 100/80                                BP: 100/80
Ksharp
Super User
data have;
infile datalines dlm=',';
input id Text :$200.;
datalines;
1,Physical Exam. Vital Signs. BP: 134/93
2,Patient's physical exam was notable for BP of 142/100
3,Physical Exam. BP: 100/80
;

data want;
    set have;
    p=prxmatch('/\bBP\b/i',text);
    if p then NewString=substr(Text, p);
run;

proc print;run;
alexpat
Fluorite | Level 6

data sle;
set autoimmun;
p=prxmatch('/\bBP\b/i',text);
if p then NewString=substr(Text, p);
run;

 

Thanks Ksharp, that is another way to extract it.

ballardw
Super User

Is BP always the last item entered? Is it always recorded with the / dividing the measurements?
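
If the answer to either question is no, taking everything from 'BP' to the end of the string would also pick up whatever follows the reading. A minimal sketch of one way to handle that, assuming the reading always looks like two 2- or 3-digit numbers separated by a slash within about ten characters of the word BP. The pattern and the names want_bp, BP_Text, Systolic and Diastolic are made up for illustration and are not from the posts above:

```sas
/* Sketch only: keep just the BP reading, wherever it sits in Text.  */
/* Assumes dataset HAVE with variable Text, as in the earlier posts. */
data want_bp;
    set have;
    length BP_Text $40;
    retain re;
    /* Compile the pattern once, on the first observation */
    if _n_ = 1 then
        re = prxparse('/\bBP\b\D{0,10}?(\d{2,3})\s*\/\s*(\d{2,3})/i');
    if prxmatch(re, Text) then do;
        BP_Text   = prxposn(re, 0, Text);           /* whole match, e.g. BP: 134/93 */
        Systolic  = input(prxposn(re, 1, Text), 8.); /* first captured number  */
        Diastolic = input(prxposn(re, 2, Text), 8.); /* second captured number */
    end;
    drop re;
run;
```

Rows with no recognisable reading are left with missing values instead of raising an invalid-argument note from SUBSTR. If readings can be written some other way (for example "134 over 93"), the pattern would need to be adjusted.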
