BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
rrevans
Calcite | Level 5

I have a fairly large data set of roughly 350,000 medical record numbers that were sent from a hospital database in the format of "X-#########" where x is the initial of the hospital and the digits are the actual MRN numbers. How do i completely eliminate the initial and hyphen for each observation in that column, leaving only the MRN?

e.g.  H-364803055 --> 364803055

      

Thank You

1 ACCEPTED SOLUTION

Accepted Solutions
DBailey
Lapis Lazuli | Level 10

data have;

length full_mrn $20;

input full_mrn $;

datalines;

H-364803055

run;

/*if you want character*/

data want;

set have;

mrn=substr(full_mrn,indexc(full_mrn,"-")+1);

run;

/*if you want numeric*/

data want2;

set have;

mrn=input(substr(full_mrn,indexc(full_mrn,"-")+1),20.);

run;

View solution in original post

7 REPLIES 7
DBailey
Lapis Lazuli | Level 10

data have;

length full_mrn $20;

input full_mrn $;

datalines;

H-364803055

run;

/*if you want character*/

data want;

set have;

mrn=substr(full_mrn,indexc(full_mrn,"-")+1);

run;

/*if you want numeric*/

data want2;

set have;

mrn=input(substr(full_mrn,indexc(full_mrn,"-")+1),20.);

run;

rrevans
Calcite | Level 5

Thank you, this works:

data want2;

set have;

mrn=input(substr(mrn, indexc(mrn, "-") +1), 20.);

run;

One issue though, var does not convert to numeric. I get this kind of note:

NOTE: Numeric values have been converted to character values at the places given by:

           (LINE):(COLUMN).

           224:19

Any quick fix? I am going to be concatenating this list with another where the mrn is numeric.

DBailey
Lapis Lazuli | Level 10

You can't redefine mrn.  Create a different variable to hold the numeric value:

mrn_num=input(substr(mrn, indexc(mrn, "-") +1), 20.);

Haikuo
Onyx | Level 15

There are many ways to get what you want, following are just a few examples:

data test;

  var='H-364803055';

  var1=compress(var,,'kd');

  var2=substr(var,3);

  var3=scan(var,2);

  var4=prxchange("s/^.*-(\d+)/$1/o",-1,var);

run;

Haikuo

Patrick
Opal | Level 21

@Hai.kuo

If this is only about replacing characters then why formulate a RegEx for the whole string with a capture buffer instead of simply formulate a RegEx for the stuff you don't want and then replace it with a NULL string? ...Being a bit on a mission here because I've seen PGStats doing the same as well.

var5=prxchange("s/[^\d]//o",-1,var);

Haikuo
Onyx | Level 15

Good Point, Patrick. Can't speak for PG, but for me is nothing more than a habit. Always good to have another way to think.

Haikuo

SandyH
Calcite | Level 5

Why not scan(mm,1,'-')?

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 2014 views
  • 3 likes
  • 5 in conversation