SarahW13
Obsidian | Level 7

I have a dataset in which each row is a long block of text, and I need to extract specific information from each row.

 

Below is an example:

 

id       Text

1        Physical Exam. Vital Signs. BP: 134/93

2        Patient's physical exam was notable for BP of 142/100

3        Physical Exam. BP: 100/80

 

Below is what I need:

 

id       Text

1        BP: 134/93

2        BP of 142/100

3        BP: 100/80

 

Any advice?

PeterClemmensen
Tourmaline | Level 20

Do something like this:

 

data have;
    /* Read comma-delimited in-stream data */
    infile datalines dlm=',';
    input id Text :$200.;
    datalines;
1,Physical Exam. Vital Signs. BP: 134/93
2,Patient's physical exam was notable for BP of 142/100
3,Physical Exam. BP: 100/80
;

data want;
    set have;
    /* Keep everything from the first 'BP' onward; INDEX is case-sensitive */
    NewString=substr(Text, index(Text, 'BP'));
run;
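
One caveat with the SUBSTR/INDEX step above: INDEX returns 0 when a row has no 'BP', and SUBSTR does not accept 0 as a start position. A minimal sketch of a guarded variant, assuming the same `have` dataset; the helper variable `pos` and the choice of FIND with the 'i' (ignore case) modifier are illustrative assumptions, not part of the original answer:

```sas
data want;
    set have;
    length NewString $200;
    /* FIND with the 'i' modifier ignores case; it returns 0 when 'BP' is absent */
    pos = find(Text, 'BP', 'i');
    /* Only call SUBSTR when a match exists; otherwise NewString stays blank */
    if pos > 0 then NewString = substr(Text, pos);
    drop pos;
run;
```

Rows without any BP mention then end up with a blank NewString rather than a log note about an invalid SUBSTR argument.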
alexpat
Fluorite | Level 6

data cardiac;
input id Text :$200.;
infile datalines4 dlm=',';
datalines;
1,Physical Exam. Vital Signs. BP: 134/93
2,Patient's physical exam was notable for BP of 142/100
3,Physical Exam. BP: 100/80
;

data strepto;
set cardiac;
NewString=substr(Text, index(Text, 'BP'));
run;

Jagadishkatam
Amethyst | Level 16

Alternatively, with the PRXCHANGE function:

 

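data have;
input id Text&$100.;
/* Greedy (.*) consumes everything before the last 'bp' (case-insensitive); only group 2 is kept */
text=prxchange('s/(.*)(bp.*)/$2/i',-1,text);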
datalines4;
1 Physical Exam. Vital Signs. BP: 134/93
2 Patient's physical exam was notable for BP of 142/100
3 Physical Exam. BP: 100/80
;;;;
Thanks,
Jag
alexpat
Fluorite | Level 6
id       Text                                                     NewString

1        Physical Exam. Vital Signs. BP: 134/93                   BP: 134/93

2        Patient's physical exam was notable for BP of 142/100    BP of 142/100

3        Physical Exam. BP: 100/80                                BP: 100/80
Ksharp
Super User
data have;
    /* Read comma-delimited in-stream data */
    infile datalines dlm=',';
    input id Text :$200.;
    datalines;
1,Physical Exam. Vital Signs. BP: 134/93
2,Patient's physical exam was notable for BP of 142/100
3,Physical Exam. BP: 100/80
;

data want;
    set have;
    /* Position of the first standalone 'BP' (any case); 0 if there is none */
    p=prxmatch('/\bBP\b/i',text);
    if p then NewString=substr(Text, p);
run;

proc print data=want;run;
alexpat
Fluorite | Level 6

data sle;
set autoimmun;
p=prxmatch('/\bBP\b/i',text);
if p then NewString=substr(Text, p);
run;

 

Thanks Ksharp, that's another way to extract it.

ballardw
Super User

Is BP always the last item entered? Is it always recorded with a / dividing the two measurements?
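
If the answer to either question is no, a pattern that captures only the reading itself may be safer than taking everything after 'BP'. The following is a rough sketch, assuming the same `have` dataset with a `Text` variable and a reading that always contains a slash. The variable names `systolic` and `diastolic`, the 15-character limit between 'BP' and the first number, and the length of `NewString` are illustrative assumptions to adjust to the real data:

```sas
data want;
    set have;
    length NewString $40;
    /* Compile the pattern once and reuse it for every row */
    retain re;
    if _n_ = 1 then
        re = prxparse('/\bBP\b\D{0,15}(\d{2,3})\s*\/\s*(\d{2,3})/i');
    if prxmatch(re, Text) then do;
        /* Buffers 1 and 2 hold the numbers on either side of the slash */
        systolic  = input(prxposn(re, 1, Text), 8.);
        diastolic = input(prxposn(re, 2, Text), 8.);
        /* Buffer 0 is the whole match, e.g. 'BP: 134/93' */
        NewString = prxposn(re, 0, Text);
    end;
    drop re;
run;
```

Because the match stops after the second number, any text that follows the reading is left out, and rows with no recognizable reading keep missing values in all three new variables.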
