Calcite | Level 5

## Conversion of MRN numbers in a data set

I have a fairly large data set of roughly 350,000 medical record numbers that were sent from a hospital database in the format of "X-#########" where x is the initial of the hospital and the digits are the actual MRN numbers. How do i completely eliminate the initial and hyphen for each observation in that column, leaving only the MRN?

e.g.  H-364803055 --> 364803055

Thank You

1 ACCEPTED SOLUTION

Accepted Solutions
Lapis Lazuli | Level 10

## Re: Conversion of MRN numbers in a data set

data have;

length full_mrn \$20;

input full_mrn \$;

datalines;

H-364803055

run;

/*if you want character*/

data want;

set have;

mrn=substr(full_mrn,indexc(full_mrn,"-")+1);

run;

/*if you want numeric*/

data want2;

set have;

mrn=input(substr(full_mrn,indexc(full_mrn,"-")+1),20.);

run;

7 REPLIES 7
Lapis Lazuli | Level 10

## Re: Conversion of MRN numbers in a data set

data have;

length full_mrn \$20;

input full_mrn \$;

datalines;

H-364803055

run;

/*if you want character*/

data want;

set have;

mrn=substr(full_mrn,indexc(full_mrn,"-")+1);

run;

/*if you want numeric*/

data want2;

set have;

mrn=input(substr(full_mrn,indexc(full_mrn,"-")+1),20.);

run;

Calcite | Level 5

## Re: Conversion of MRN numbers in a data set

Thank you, this works:

data want2;

set have;

mrn=input(substr(mrn, indexc(mrn, "-") +1), 20.);

run;

One issue though, var does not convert to numeric. I get this kind of note:

NOTE: Numeric values have been converted to character values at the places given by:

(LINE):(COLUMN).

224:19

Any quick fix? I am going to be concatenating this list with another where the mrn is numeric.

Lapis Lazuli | Level 10

## Re: Conversion of MRN numbers in a data set

You can't redefine mrn.  Create a different variable to hold the numeric value:

mrn_num=input(substr(mrn, indexc(mrn, "-") +1), 20.);

Onyx | Level 15

## Re: Conversion of MRN numbers in a data set

There are many ways to get what you want, following are just a few examples:

data test;

var='H-364803055';

var1=compress(var,,'kd');

var2=substr(var,3);

var3=scan(var,2);

var4=prxchange("s/^.*-(\d+)/\$1/o",-1,var);

run;

Haikuo

Opal | Level 21

## Re: Conversion of MRN numbers in a data set

@Hai.kuo

If this is only about replacing characters then why formulate a RegEx for the whole string with a capture buffer instead of simply formulate a RegEx for the stuff you don't want and then replace it with a NULL string? ...Being a bit on a mission here because I've seen PGStats doing the same as well.

var5=prxchange("s/[^\d]//o",-1,var);

Onyx | Level 15

## Re: Conversion of MRN numbers in a data set

Good Point, Patrick. Can't speak for PG, but for me is nothing more than a habit. Always good to have another way to think.

Haikuo

Calcite | Level 5

## Re: Conversion of MRN numbers in a data set

Why not scan(mm,1,'-')?

Discussion stats
• 7 replies
• 1955 views
• 3 likes
• 5 in conversation