SarahW13
Obsidian | Level 7

I have a dataset in which each row is a long block of text, and I need to extract specific information from each row.

 

Below is an example:

 

id       Text

1        Physical Exam. Vital Signs. BP: 134/93

2        Patient's physical exam was notable for BP of 142/100

3        Physical Exam. BP: 100/80

 

Below is what I need:

 

id       Text

1        BP: 134/93

2        BP of 142/100

3        BP: 100/80

 

Any advice?

7 REPLIES
PeterClemmensen
Tourmaline | Level 20

Try something like this:

 

data have;
    /* comma-delimited in-stream data; INFILE placed ahead of INPUT */
    infile datalines dlm=',';
    input id Text :$200.;
datalines;
1,Physical Exam. Vital Signs. BP: 134/93
2,Patient's physical exam was notable for BP of 142/100
3,Physical Exam. BP: 100/80
;

data want;
    set have;
    NewString=substr(Text, index(Text, 'BP'));
run;
alexpat
Fluorite | Level 6

data cardiac;
    infile datalines dlm=',';
    input id Text :$200.;
datalines;
1,Physical Exam. Vital Signs. BP: 134/93
2,Patient's physical exam was notable for BP of 142/100
3,Physical Exam. BP: 100/80
;

data strepto;
set cardiac;
NewString=substr(Text, index(Text, 'BP'));
run;

Jagadishkatam
Amethyst | Level 16

Alternatively, use the PRXCHANGE function:

 

data have;
input id Text&$100.;
text=prxchange('s/(.*)(bp.*)/$2/i',-1,text);
datalines4;
1 Physical Exam. Vital Signs. BP: 134/93
2 Patient's physical exam was notable for BP of 142/100
3 Physical Exam. BP: 100/80
;;;;
Thanks,
Jag
alexpat
Fluorite | Level 6
id       Text                                                      NewString
1        Physical Exam. Vital Signs. BP: 134/93                    BP: 134/93
2        Patient's physical exam was notable for BP of 142/100     BP of 142/100
3        Physical Exam. BP: 100/80                                 BP: 100/80
Ksharp
Super User
data have;
    infile datalines dlm=',';
    input id Text :$200.;
datalines;
1,Physical Exam. Vital Signs. BP: 134/93
2,Patient's physical exam was notable for BP of 142/100
3,Physical Exam. BP: 100/80
;

data want;
    set have;
    p=prxmatch('/\bBP\b/i',text);
    if p then NewString=substr(Text, p);
run;

proc print;run;
alexpat
Fluorite | Level 6

data sle;
set autoimmun;
p=prxmatch('/\bBP\b/i',text);
if p then NewString=substr(Text, p);
run;

 

Thanks Ksharp, that is another way to extract it.

ballardw
Super User

Is BP always the last item entered? Is it always recorded with the / dividing the measurements?
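
If BP is not always the last item, taking everything from 'BP' to the end of the string would also pick up whatever follows it. Below is a minimal sketch of one way to capture only the reading, assuming it is always written as two 2-3 digit numbers separated by a slash, and that at most about ten non-digit characters (such as ": " or " of ") sit between "BP" and the first number. The sample rows with trailing text and with no BP value are made up for illustration and are not from the original data.

```sas
data have;
    infile datalines dlm=',';
    input id Text :$200.;
datalines;
1,Physical Exam. Vital Signs. BP: 134/93. Pulse: 72
2,Patient's physical exam was notable for BP of 142/100
3,Physical Exam. No vitals recorded
;

data want;
    set have;
    length NewString $40;
    retain rx;
    /* Compile the pattern once: "BP", a short separator, then systolic/diastolic */
    if _n_ = 1 then rx = prxparse('/\bBP\b[^0-9]{0,10}\d{2,3}\s*\/\s*\d{2,3}/i');
    /* pos and len receive the start and length of the match (pos = 0 if none) */
    call prxsubstr(rx, Text, pos, len);
    if pos > 0 then NewString = substr(Text, pos, len);
    drop rx pos len;
run;

proc print data=want; run;
```

With these assumptions, rows without a recognizable reading are left with a blank NewString instead of raising an invalid-argument note from SUBSTR. If the readings are not always recorded with a slash, the pattern would need to be adjusted.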
