Help using Base SAS procedures

Format Variable based on description in csv file

Accepted Solution Solved
Reply
Contributor
Posts: 21
Accepted Solution

Format Variable based on description in csv file

I have diagnosis variable in my dataset which is ICD Code. I want to format Diagnosis variable in  CCS Category which available in csv format (attached).

Attached csv file has 3 columns: 1. ICD-Code ( Diagnosis in my data-set), 2. CCS Category (I want  Diagnosis formatted  in CCS Category), 3. CCS Category Description (I want CCS Category to be labeled with this information)

Below is how csv file column looks like. 15,000 ICD Codes are grouped in clinically meaningful  200 (approx) categories and I am trying to format Diagnosis Variable in my data set in those Categories. How can I do it?

Thanks.

'ICD-9-CM CODE''CCS CATEGORY''CCS CATEGORY DESCRIPTION'
'3220 ''76   ''Meningitis'
'3221 ''76   ''Meningitis'
'3222 ''76   ''Meningitis'
'3229 ''76   ''Meningitis'
'0361 ''77   ''Encephalitis'
'0462 ''77   ''Encephalitis'
'0498 ''77   ''Encephalitis'
'0499 ''77   ''Encephalitis'
'0520 ''77   ''Encephalitis'
'0543 ''77   ''Encephalitis'
'0550 ''77   ''Encephalitis'
'05601''77   ''Encephalitis'
'05821''77   ''Encephalitis'
'05829''77   ''Encephalitis'
'0620 ''77   ''Encephalitis'
Attachment

Accepted Solutions
Solution
‎02-12-2015 04:36 PM
Super User
Super User
Posts: 7,079

Re: Format Variable based on description in csv file

You could make more than one format from the data.

So one could map ICD9 -> CCS code

Another could map ICD9 -> CCS Description

Another could map ICD9 -> CCS code + CCS Description.

filename zipfile zip 'C:\Downloads\$dxref 2015.csv.zip';

data ccs ;

  infile zipfile('$dxref 2015.csv') dsd firstobs=2 truncover ;

  length diag_cd $5 ccs_cd $4 ccs_desc $70 ;

  input diag_cd ccs_cd ccs_desc ;

  if ccs_cd ne ' ';

run;

data cntlin ;

  length fmtname $32 type $1 start $80 label $80 ;

  keep fmtname type start label  ;

  type='C';

  set ccs ;

  fmtname='CCS';

  start=diag_cd;

  label=ccs_cd ;

  output;

  fmtname='CCSD';

  start=diag_cd;

  label=ccs_desc ;

  output;

  fmtname='CCSB';

  start=diag_cd;

  label=catx(' - ',ccs_cd,ccs_desc) ;

  output;

run;

proc sort nodupkey ; by fmtname start label ; run;

proc format cntlin=cntlin; run;

data test;

  input diag_cd $5. ;

  ccs=put(diag_cd,$ccs.);

  ccsd=put(diag_cd,$ccsd.);

  ccsb=put(diag_cd,$ccsb.);

  put (_all_) (=);

cards;

01080

7141

;;;;

View solution in original post


All Replies
Super User
Super User
Posts: 7,997

Re: Format Variable based on description in csv file

Well, firstly you will need to import the CSV file into a dataset.  Once you have it in a dataset you can do a few things.  First one most people will recommend is to create format library from it, look up proc format in the SAS help.  From the above ID column is value, decode is in the last column.  You can then apply this format to you data item.  This method doesn't change the original data only the way it is displayed.

Another way is to use the data as a lookup:

proc sql;

     create table WANT as

     select     A.*,

                   (select CCS_CAT from CODE_LIST where ICD9=A.ICD9) as FORMATTED_VALUE

     from       HAVE;

quit;

In the above statement I keep all the original data and then lookup the result value based on the data value into a new variable formatted_value.  The benefit of this is that you gain a column which has the formatted value as the data, nothing hidden.  So really depends on what you want to do with the data/where its going to .

Contributor
Posts: 21

Re: Format Variable based on description in csv file

Thanks RW9,

By "..firstly you will need to import the CSV file into a dataset";  you mean should I make csv file separate SAS dataset or combine csv columns in existing dataset?

Super User
Super User
Posts: 7,997

Re: Format Variable based on description in csv file

Just import the CSV to a new dataset:

data code_list;

     infile "thecsv.csv" dlm=",";

     input icd9 $ css $ csscat $;

run;

Contributor
Posts: 21

Re: Format Variable based on description in csv file

OK, I created SAS data-set of csv file.  How do I create format library from it. I see proc format in the SAS help only has way to create library based on value not by SAS data-set. How do I format with SAS data-set? Could you suggest codes too.

Thanks.

Super User
Posts: 19,877

Re: Format Variable based on description in csv file

Contributor
Posts: 21

Re: Format Variable based on description in csv file

I don't know what this is but it seems to work with modification.It has used cntlin function. I found this on Agency for Health Care Research website. Still I am not able to label CCS group.Thanks to all.


Filename inraw "c:\tools\ccs\$DXREF 2013.csv" ;
/*****************************************/
/* Make a SAS format*/
/*****************************************/ data new;
      infile inraw dsd dlm=',' end = eof firstobs=3 missover;
         input start : $char5.
              Oldccs : $char4.
              Value1 : $char70.
              Value2 : $char70.
              Label : $char4.
              Value3 : $char70.
              ;
         retain hlo "   ";
      fmtname = "$ccs" ;
      type = "   " ;
      output;

      if eof then do ;
         start = "   " ;
         vlabel = "   " ;
         hlo = "o";
         output ;
      end ;
run;
proc format cntlin = new ;
run;
/*****************************************/
/* Reassign CCS  */
/*****************************************/
data test;
      Set data;
      Ccs = put(ecode Diagnosis,$ccs.)
run;

Super User
Posts: 19,877

Re: Format Variable based on description in csv file

bikash wrote:

Still I am not able to label CCS group.

You'll need to provide more details than that.

Contributor
Posts: 21

Re: Format Variable based on description in csv file

Reeza,

If you see attached csv file, it has 3 columns: Icd9 Code, CCS Group and CCS Description. With above syntax, I can group ICD Code in CCS group but cannot label it with CCS Description..

e.g. ICD code '3220',  or '3221' is formatted as  76 (CCS Group) in proc freq result but  I wanted it as 'Meningitis' in result.

Solution
‎02-12-2015 04:36 PM
Super User
Super User
Posts: 7,079

Re: Format Variable based on description in csv file

You could make more than one format from the data.

So one could map ICD9 -> CCS code

Another could map ICD9 -> CCS Description

Another could map ICD9 -> CCS code + CCS Description.

filename zipfile zip 'C:\Downloads\$dxref 2015.csv.zip';

data ccs ;

  infile zipfile('$dxref 2015.csv') dsd firstobs=2 truncover ;

  length diag_cd $5 ccs_cd $4 ccs_desc $70 ;

  input diag_cd ccs_cd ccs_desc ;

  if ccs_cd ne ' ';

run;

data cntlin ;

  length fmtname $32 type $1 start $80 label $80 ;

  keep fmtname type start label  ;

  type='C';

  set ccs ;

  fmtname='CCS';

  start=diag_cd;

  label=ccs_cd ;

  output;

  fmtname='CCSD';

  start=diag_cd;

  label=ccs_desc ;

  output;

  fmtname='CCSB';

  start=diag_cd;

  label=catx(' - ',ccs_cd,ccs_desc) ;

  output;

run;

proc sort nodupkey ; by fmtname start label ; run;

proc format cntlin=cntlin; run;

data test;

  input diag_cd $5. ;

  ccs=put(diag_cd,$ccs.);

  ccsd=put(diag_cd,$ccsd.);

  ccsb=put(diag_cd,$ccsb.);

  put (_all_) (=);

cards;

01080

7141

;;;;

Super User
Posts: 11,343

Re: Format Variable based on description in csv file

For example I have formats that map my state's county codes to:

The county name

a variety of geographical regions such as Health Districts

translate to a different coding scheme used by some applications

A multilabel format that for the applications that support multilabel formats such as Proc Tabulate and Report, geographic region and county name

The best way to I found to learn to make CNTLIN data sets to make format is to examine a CNTLOUT data set.

Create a format (or multiple formats) with a few values similar to what you need. Add the option CNTLOUT= <some dataset name>;

When you run proc format that data set will be created with the information you need. Make a set that looks like that an use it as a CNTLIN dataset.

Important: Proc formats expect specific variable names for the bits it needs AND the type. Also the CNTLIN dataset must have all of the FMTNAME values together, best is to sort by FMTNAME.

The minimum set of variables you will need are FMTNAME, START, END (especially if a range is needed, Label (the result to display) and Type (C or N for character or numeric).

Contributor
Posts: 21

Re: Format Variable based on description in csv file

Thanks to every one for help. SAS Code provided by Tom solves all my problem.

Bikash

Super User
Posts: 10,046

Re: Format Variable based on description in csv file

Another way is Hash Table .

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 872 views
  • 3 likes
  • 6 in conversation