BookmarkSubscribeRSS Feed
Wolverine
Quartz | Level 8

I have files that include a variable that lists diagnosis codes associated with various medical specialties.  I need to compare values from another file to determine if a given record included a diagnosis that matched the medical specialty.  I've been copying and pasting the list directly into the syntax:

IF DX in ("03282","03640"....) THEN cardiology_flag = 1;

This works but is unwieldy and difficult to update.  There are 10 specialties with 800-2500 codes each!

The list is stored in a file named as such: DXcodes_cardiology.sas7bdat.  There are 2 variables in this file that contain valid codes, named DX_5A and DX_5B. 

So how do I get SAS to look at a record in a file called DXcardiology._spec_cat.sas7bdat, and compare the the value stored in the variable DX to the list of codes in the variables DX_5A and DX_5B in DXcodes_cardiology.sas7bdat, and then give the variable cardiology_flag a value 1 if the code is present in the list, and a value of 0 if it's not present in the list?

13 REPLIES 13
Steelers_In_DC
Barite | Level 11

You can use proc format, or proc sql;

proc sql noprint;

select 'variable'

into :prefixlist

separated by ','

from table;

This will give you a macro variable &prefixlist

Wolverine
Quartz | Level 8

Mark, I took what you provided and rewrote it a bit to work with my variable names and file names.

Proc SQL;

    Select DX_5A, DX_5B

        Into :DX_5Alist, :DX_5Blist

        Separated by ","

    From medicaid.DXcodes_cardiology;

Quit;

This works and gives me an output that looks like a crosstab frequency for the 2 variables.  So as I'm trying to understand this, it looks like this is the 1st step of the process -- the values present in the variables assigned into a pair of macro variables.  But then how do I compare the values in the macro variables to the values in the medical records in DXcardiology._spec_cat?

Steelers_In_DC
Barite | Level 11

If this isn't helpful provide an example, two datasets, of what you have and what you want.

Wolverine
Quartz | Level 8

Mark, I certainly like your approach, but I'm not sure what you mean by providing an example of the datasets.  I don't see any way to upload them here, and even if I could, they have sensitive info and cannot be shared publicly.  I tried to post some screenshots (with the sensitive variables removed) but the forum won't allow me to upload them.  They're well below the 2MB filesize limit.

Ok, I was able to upload some screenshots.  The "card list" file shows the list of diagnosis codes (variables DX_5A and DX_5B) that are associated with Cardiology.  The "card claims" file shows actual patient records, including the diagnosis codes (variable DX) associated with each record.

The output file should show each record as it exists in the "card claims" file, except that there should be a flag variable (cardiology_flag) indicating whether the diagnosis code for that record is considered to be cardiology-related in the "card list" file.


card list.jpgcard claims.jpg
ballardw
Super User

Dummy data that mimics the behavior of your data is sufficient and only needs enough of the variables involved to test all the basic rules you need. A data step with an input and datalines is preferred.

For example in the post I had previously to make and use an informat from data I had to start with a dummy data as you hadn't really provided much detail. That should provide a skeleton of how to post code to create a dataset.

Tom
Super User Tom
Super User

So let's assume that you want to assign the variable C_CODE from CARD_LIST to every record in CARD_CLAIMS.

(Let's also assume that you have cleaned up C_CODE so that is does not contain both "7" and "'7'" as two different codes for the same thing.)

proc sql ;

  create table want as

    select a.*,b.c_code

      from card_claim a

      left join card_list b

      on (a.dx is not null)

      and ((a.dx = b.dx_5b) or  (a.dx = b.dx_5a))

  ;

quit;

Wolverine
Quartz | Level 8

That's not quite what I'm trying to get it to do, but I think it's close.  Let me try to explain it again, just so we're all on the same page...

The variables DX_5A and DX_5B in "card_list" contain a list of all cardiology-related diagnosis codes.

So what I want SAS to do is look at the first record in "card_claims", find the value that is stored in the DX variable for that record, and then determine is that value is present ANYWHERE in the DX_5A and DX_5B variables in "card_list".  If so, the record in "card_claims" should be retained, and if not, then it should be deleted.  And then I need to repeat that process for all records in "card_claims".

So I think what you have above would work, but don't I want to use RIGHT JOIN?  I only want to retain records that have a DX value that matches any of the values stored in the DX_5A and DX_5B variables in "card_list".

Tom
Super User Tom
Super User

Yes. So you have a special case of the general problem of putting the DX codes into categories.

Wolverine
Quartz | Level 8

I just realized there's a complication.  One of the things we want to know is what diagnosis codes cardiologists frequently use that AREN'T cardiology diagnoses.  Using the above approach, those non-cardiology DX codes would be lost.

Tom
Super User Tom
Super User

So you DO want the LEFT JOIN .

Sounds like CARD_CLAIM is the list of DX codes actually used and CARD_LIST is the list of "card" DX codes.

Reeza
Super User

I highly recommend formats, especially if you already have it in a table.  You can use the data set to create the format and then apply it to the dx field. 

ballardw
Super User

This is how I understand your issue and a possibly solution:

/* dummy data set as I think your look up table looks like*/
data DXcodes_cardiology;
      informat DX_5A DX_5B $10.;
      input DX_5A DX_5B;
datalines;
10023 00456
156   34567
abc   def
      ;
run;

/* reshape and prepare for use to create an informat*/

data cariocntl;
   set DXcodes_cardiology end=last;
   fmtname='CariologyFlag';
   type='I';
   label='1';
   start= DX_5A; output;
   start= DX_5B; output;
   if last then do;
      start='**OTHER**';  /* special indicator that values not listed treated this way*/
      label='0';    /* this is zero, value for informat not listed valeus*/
      HLO = 'O';    /* this is capital o and is instruction on how to treat others */
      output;
   end;
run;

/* create the format, best would be to place it in a permanent library and point the format search path to include that library*/

proc format library=work cntlin=cariocntl;
run;

/* and a quick example of how to use*/

data test;
   input code $;
   CariologyFlag = input(code,CariologyFlag.);
datalines;
10023
156 
abc 
00456
34567
def
211111
19045
pdq
      ;
run;

Tom
Super User Tom
Super User

Let's make up some data a try it.

data specialties ;

   input dx $5. specialty $32. ;

cards;

03282Cardiology

03640Cardiology

run;

data cases;

input id dx $5. ;

cards;

1 03282

2 04567

;;;;

proc sql ;

create table want as

  select a.*,b.specialty

  from cases a left join specialties b

  on a.dx = b.dx

  ;

quit;


sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 3061 views
  • 0 likes
  • 5 in conversation