BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Cruise
Ammonite | Level 13

I have two datasets, "codes" and "lookup_ranges". The first one has mixture of medical procedure codes including CPT billing codes. My goal is to define and label CPT codes in the crowd. Second dataset has ranges of values for CPT procedure codes. Look up range code file doesn't have individual values for each codes consistently but expressed in ranges mostly, i.e., if codes fall in the range of 10030-10499 then Muskuloskeletal Procedure, so forth so on. I tried proc format. SAS gave no errors in log but codes apparently defined in the proc format was not taken up in the proc print output. I also had no idea how to make permanent dataset with code labels as formatted in proc format. For ICD-10 procedure code I had look-up table with values for each procedure codes instead in ranges like this for CPT codes. Thus, I was able to simply merge look-up table to codes dataset on the codes with no problem. 

 

How to use look up range codes to define continuous codes listed in the first table? I really appreciate your help guys.

Thanks in advance.

 

/*say, you had time to look at my data attached and import*/
proc import datafile="......lookup_ranges.csv"
out=lookup_ranges
dbms=csv 
replace;
getnames=yes;
guessingrows=max;
run;

proc import datafile="......codes.txt"
out=codes
dbms=tab
replace;
getnames=yes;
guessingrows=max;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

See this code for converting your lookup_ranges dataset to a feed for proc format, and then applying the format to your dataset codes:

data cntlin (keep=fmtname start label type hlo);
set lookup_ranges (rename=(description=label)) end=eof;
length
  trailchar $1
  start $5
  hlo $1
;
fmtname = 'range_lookup';
type = 'C';
if missing(var4) then var4 = code_ranges;
if anyalpha(code_ranges) = 5
then do;
  trailchar = substr(code_ranges,5,1);
  fromval = input(substr(code_ranges,1,4),4.);
  toval = input(substr(var4,1,4),4.);
end;
else do;
  trailchar = '';
  fromval = input(code_ranges,5.);
  toval = input(var4,5.);
end;
do i = fromval to toval;
  if trailchar > ''
  then start = put(i,z4.) !! trailchar;
  else start = put(i,z5.);
  output;
end;
if eof
then do;
  start = '';
  label = '**OTHER**';
  hlo = 'O';
  output;
end;
run;

proc format lib=work cntlin=cntlin;
run;

data test;
set codes;
format proc_cd $range_lookup.;
run;

Note:

  • there is an option for ranges (usin end in addition to start) in a cntlin file, but that won't work because of the trailing characters. So I opted to write out single values all the way through.
  • you will have to normalize your codes dataset, as there are all kinds of lengths in there:
if anyalpha(proc_cd) ne 0
then proc_cd = substr(proc_cd,length(proc_cd - 4));
else proc_cd = put(input(proc_cd,8.),z5.);

When scanning through the codes dataset, I saw that there are codes with 2 trailing letters, which will necessitate a slightly expanded code for the cntlin file.

View solution in original post

9 REPLIES 9
ChrisNZ
Tourmaline | Level 20

Formats should work.

proc format ;
  value cpt 10030-10499 ='Muskuloskeletal Procedure' ;
data _null_;
  A=10050; put A= cpt.; 
run;

A=Muskuloskeletal Procedure

 

Cruise
Ammonite | Level 13
Hi Chris, Thanks. Is there any way to output the result to permanent dataset? I need a separate in actual dataset as a column which defines the codes not only for presenting as visual.
Cruise
Ammonite | Level 13
I just updated the file attached. Thanks Kurt.
Kurt_Bremser
Super User

I see at least one logical problem that needs to be checked design-wise.

See these two lines in lookup_ranges.csv:

Path_lab,81105,-,81479  ,Molecular_Pathology
Path_lab,81410,-,81471  ,Genomicequencing_Other_Molecular_MultianalyteAssays

or these:

Eval_Mgmt,99217,-,99226  ,HospitalObservation
Eval_Mgmt,99221,-,99239  ,HospitalInpatient_

From any programmer's view, these ranges overlap, and useful SAS tools like proc format don't allow that.

 

For the rest, I'd create a piece of code that expands ranges into sequences of single values (ie dealing with the special cases where the character at the end seems to act as the most significant digit), and then feeds that into proc format.

Cruise
Ammonite | Level 13
Good catch! I'll look into the overlapped ends right now. However, how to output proc format feed to a dataset? I need values defined in separate column in actual dataset. I eventually will conduct survival analysis where category of these variables needed to be controlled in modelling.
Kurt_Bremser
Super User

See this code for converting your lookup_ranges dataset to a feed for proc format, and then applying the format to your dataset codes:

data cntlin (keep=fmtname start label type hlo);
set lookup_ranges (rename=(description=label)) end=eof;
length
  trailchar $1
  start $5
  hlo $1
;
fmtname = 'range_lookup';
type = 'C';
if missing(var4) then var4 = code_ranges;
if anyalpha(code_ranges) = 5
then do;
  trailchar = substr(code_ranges,5,1);
  fromval = input(substr(code_ranges,1,4),4.);
  toval = input(substr(var4,1,4),4.);
end;
else do;
  trailchar = '';
  fromval = input(code_ranges,5.);
  toval = input(var4,5.);
end;
do i = fromval to toval;
  if trailchar > ''
  then start = put(i,z4.) !! trailchar;
  else start = put(i,z5.);
  output;
end;
if eof
then do;
  start = '';
  label = '**OTHER**';
  hlo = 'O';
  output;
end;
run;

proc format lib=work cntlin=cntlin;
run;

data test;
set codes;
format proc_cd $range_lookup.;
run;

Note:

  • there is an option for ranges (usin end in addition to start) in a cntlin file, but that won't work because of the trailing characters. So I opted to write out single values all the way through.
  • you will have to normalize your codes dataset, as there are all kinds of lengths in there:
if anyalpha(proc_cd) ne 0
then proc_cd = substr(proc_cd,length(proc_cd - 4));
else proc_cd = put(input(proc_cd,8.),z5.);

When scanning through the codes dataset, I saw that there are codes with 2 trailing letters, which will necessitate a slightly expanded code for the cntlin file.

Cruise
Ammonite | Level 13

@Kurt_Bremser

This is awesome! I can use "Cntlin" data with codes laid out to single values all the way through as a reference data. I would merge cntlin data to my original file with patients associated with codes and use the label column from Cntlin data you created for code categories. I lost the data with code list by single values when I lose CD-ROM came with our CPT codebook.

Kurt_Bremser
Super User

When you create and use the format, you can avoid the usual hassles of merging (sort needed, creation of SQL utility file etc.). Just one sequential pass through the main dataset is needed.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 2032 views
  • 3 likes
  • 3 in conversation