BookmarkSubscribeRSS Feed
raddad34
Fluorite | Level 6

Hello, so I have a dataset that looks like this 

 

Patient ICD10 YEAR COST 
1 i003 2013 345
1 i002 2014 544
1 i001 2015 786
1 v01.6 2014 2009
1 v01.5 2015 4003
2 i001 2011 345
2 i002 2011 544
3 v01.6 2009 3154
3 v01.5 2010 4003
3 v01.3 2010 9009
3 v01.1 2013 10168 

 

 

 ICD10 codes are made up for simplicities sake.

I also have a “grouping” list of ICD 10 codes that looks like this:

 

i001 > i002 > i003
v01.1 > v01.2 > v01.3 > v01.4 > v01.5 > v01.6

 

The left of the list will always be more expensive than the right and thus should.

 

I want to calculate the difference in cost of the ICD10 codes that are grouped together only if in the previous year they had an icd10 code in the group series. So, for example I want my output to look like this

Patient ICD10 YEAR COST COSTDIFF
1 i003 2013 345 
1 i002 2014 544 199
1 i001 2015 786 242
1 v01.6 2014 2009 0
1 v01.5 2015 4003 3202
2 i001 2011 345 0
2 i002 2011 544 199
3 v01.6 2009 3154 858
3 v01.5 2010 4003 5006
3 v01.3 2010 9009 5006
3 v01.1 2013 10168 0

 

I think I have an idea of how to code this but one of my main problems is that I have no clue how I would check if the previous year had a icd10 code that matched the current year icd10 code, where I would set the difference to 0, or was on the right side of the series(ie less expensive), where I would take the difference. 

 

Does anyone have any ideas? thanks.

 



 

 

7 REPLIES 7
ballardw
Super User

@raddad34 wrote:

Hello, so I have a dataset that looks like this 

Patient ICD10 YEAR COST
1 i001 2013 345
1 i002 2014 544
1 i003 2015 786
2 i001 2011 345
2 i002 2011 544
3 v01.1 2009 4003
3 v01.5 2010 9214
3 v01.6 2010 10168
3 v01.3 2011 10168

 ICD10 codes are made up for simplicities sake.

I also have a “grouping” list of ICD 10 codes that looks like this:

 

i001 > i002 > i003
v01.1 > v01.2 > v01.3 > v01.4 > v01.5 > v01.6

 

The left of the list will always be more expensive than the right and thus should.

 

I want to calculate the difference in cost of the ICD10 codes that are grouped together only if in the previous year they had an icd10 code in the group series. So, for example I want my output to look like this

 

Patient ICD10 YEAR COST COSTDIFF
1 i003 2013 345 0
1 i002 2014 544 199
1 i001 2015 786 242
1 i001 2015 786 242 2 i002 2011 345 0 2 i001 2011 544 0 3 v01.6 2009 4003 0 3 v01.5 2010 9214 5211 3 v01.3 2010 10168 954 3 v01.3 2011 10168 0 3 v01.1 2012 12771 2603

I think I have an idea of how to code this but one of my main problems is that I have no clue how I would check if a previous year had a icd10 code that matched the current year icd10 code, where I would set the difference to 0, or was on the right side of the series(ie less expensive), where I would take the difference. 

 

Does anyone have any ideas? thanks.

 



 

 


One way:

data have;
   input Patient $ ICD10 $ YEAR COST;
datalines;
1 i001 2013 345
1 i002 2014 544
1 i003 2015 786
2 i001 2011 345
2 i002 2011 544
3 v01.1 2009 4003
3 v01.5 2010 9214
3 v01.6 2010 10168
3 v01.3 2011 10168
;

data want;
   set have;
   by patient;
   costdiff=dif(cost);
   if first.patient then costdiff=0;
run;

Note the data step to provide actual data that code can be written against.

If your data is not sorted by patient and year do so before the Want step.

The SAS function DIF (or DIF1) returns the difference of a variable from the previous record. You can actually look back multiple records with Dif2, Dif3 ...Dif99 (or more, haven't tested this for awhile).

The function however is a queued value and if you attempt to use it in an IF type construct the result compares to the last time the IF was true.

When you use a BY statement SAS provides automatic variables First. and Last. for each that are 1/0 (true/false) if the record is the first or last with that value of the By variable. SAS treats 1 as true and 0 as false so can use that to conditionally treat values differently for the first record or reset values.

raddad34
Fluorite | Level 6
Hey, some information I foolishly forgot to add in my example dataset is that a patient will probably have more than 1 icd10 code, I edited the main post to reflect that.
Another thing is that the code you provided doesn't look at the previous year, just the first year. ie if a patient had codes in 2014, none in 2015 and in 2016 had codes again, i would like 2016 to show a difference of 0 since I'm just looking for a year to year change. Thanks.

ballardw
Super User

@raddad34 wrote:
Hey, some information I foolishly forgot to add in my example dataset is that a patient will probably have more than 1 icd10 code, I edited the main post to reflect that.
Another thing is that the code you provided doesn't look at the previous year, just the first year. ie if a patient had codes in 2014, none in 2015 and in 2016 had codes again, i would like 2016 to show a difference of 0 since I'm just looking for a year to year change. Thanks.


Untested because none of your example data actually has the same ICD10 code in multiple years for the same patient.

 

data have;
   input Patient $ ICD10 $ YEAR COST;
datalines;
1 i001 2013 345
1 i002 2014 544
1 i003 2015 786
2 i001 2011 345
2 i002 2011 544
3 v01.1 2009 4003
3 v01.5 2010 9214
3 v01.6 2010 10168
3 v01.3 2011 10168
;

proc sort data=have;
   by patient icd10 year;
run;

data want;
   set have;
   by patient icd10;
   costdiff=dif(cost);
   yeardiff=dif(year);
   if first.icd10 then do;
      costdiff=0;
      yeardiff=0;
   end;
   if yeardiff ne 1 then costdiff=0;
run;
raddad34
Fluorite | Level 6

I updated my dataset to include what im talking about here

 

Patient ICD10 YEAR COST 
1 i003 2013 345
1 i002 2014 544
1 i001 2015 786
1 v01.6 2014 2009
1 v01.5 2015 4003
2 i001 2011 345
2 i002 2011 544
3 v01.6 2009 3154
3 v01.5 2010 4003
3 v01.3 2010 9009
3 v01.1 2013 10168 

On the last line of patient 3 i'm trying to get the difference to be 0. Is there a way to check if say, year-1 contains one of the codes in the series?

ballardw
Super User

@raddad34 wrote:

I updated my dataset to include what im talking about here

 

Patient ICD10 YEAR COST 
1 i003 2013 345
1 i002 2014 544
1 i001 2015 786
1 v01.6 2014 2009
1 v01.5 2015 4003
2 i001 2011 345
2 i002 2011 544
3 v01.6 2009 3154
3 v01.5 2010 4003
3 v01.3 2010 9009
3 v01.1 2013 10168 

On the last line of patient 3 i'm trying to get the difference to be 0. Is there a way to check if say, year-1 contains one of the codes in the series?


"Codes in a series" is going to require a definition, possibly different, for each "series". Probably requires adding a variable to indicate it is the same "series".

raddad34
Fluorite | Level 6

So by series I mean "icd code grouping" which is essentially a list I have of codes and which group they belong to, so for example

i001,i002, i003 - "Heart conditions"
v01.1,v01.2,v01.3,v01.4, v01.5,v01.6 - "Health Services"

 

By assigning a variable, do you mean setting these codes equal to a descriptor for example? I could do something like if i001, i002, i003 then descriptor="heart conditions". 

 

 

ballardw
Super User

@raddad34 wrote:

So by series I mean "icd code grouping" which is essentially a list I have of codes and which group they belong to, so for example

i001,i002, i003 - "Heart conditions"
v01.1,v01.2,v01.3,v01.4, v01.5,v01.6 - "Health Services"

 

By assigning a variable, do you mean setting these codes equal to a descriptor for example? I could do something like if i001, i002, i003 then descriptor="heart conditions". 

 

 


Basically. Then you would sort by Id, descriptor, year .

However you might now have two or more "descriptors" the same if there are multiple ICD10 codes from the same group in given year. Which leads to another round of how do you want to handle that.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 757 views
  • 0 likes
  • 2 in conversation