Programming the statistical procedures from SAS

How do I impute missing values based on two grouping variables?

Occasional Contributor
Posts: 11


Hi, 

 

I have missing values in the Cost variable. I would like to impute them within groups defined by two variables. My data structure looks like this:

 

DrugName   RX_Yr   Cost
Drug 1     2010    $50
Drug 1     2010    .
Drug 1     2011    $60
Drug 2     2010    $30
Drug 2     2010    .
Drug 2     2011    $20

 

I tried using Proc MI like this:

proc mi data=Have out=have_impute;
by DrugName Rx_yr;
var Cost;
run;

 

But it returns an error message of "Fewer than two analysis variables".  

 

Does anyone know what I am doing wrong, or a better way to do this?

 

Thanks in advance!

Chris

 

New Contributor
Posts: 2

Re: How do I impute missing values based on two grouping variables?

Maybe you should avoid 'drugname' as the analysis variable, since it is a character variable.

Grand Advisor
Posts: 16,880

Re: How do I impute missing values based on two grouping variables?

I'm not sure that's a good way to do it.

 

I'd consider some rules that are probably true 90% of the time.

 

For example, for Drug 1 in the same year, I would assume the same price.

If I don't have data for that year, I would consider an interpolation method, probably something as simple as the average of the years before and after. You probably have some more complex scenarios, such as two missing years in a row or different prices in the same year. Regardless, I don't think a straightforward imputation method would be the best way to go in your case. This assumes you're actually working with drug data and not some other data.

Occasional Contributor
Posts: 11

Re: How do I impute missing values based on two grouping variables?

I am working with insurance claims data. I have over 900K records, and 5% are missing the cost data. The costs vary a little depending on the patient's insurance type, but not much. I definitely have data for a given year; I'm just not sure how to make the updates systematically, other than hardcoding a value like

if DrugName="Drug 1" and Rx_yr=2010 and Cost=. then Cost=X;

Thanks!
Chris

Grand Advisor
Posts: 16,880

Re: How do I impute missing values based on two grouping variables?

If you have data for those records, make a 'master table' that has the values for each drug/year, and then merge the tables on drug/year.
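
A rough sketch of the master-table approach might look like the following. It assumes Cost is stored as a numeric variable (possibly with a dollar format) and that the mean of the observed costs is an acceptable value for each drug/year. Have, DrugName, Rx_yr, and Cost come from the original post; drug_year_cost, mean_cost, have_sorted, and want are made-up names for illustration.

```sas
/* Sketch only: drug_year_cost, mean_cost, have_sorted, and want are
   hypothetical names. Cost is assumed to be numeric. */

/* 1. Master table: one row per drug/year with the mean of the observed costs. */
proc sql;
    create table drug_year_cost as
    select DrugName, Rx_yr, mean(Cost) as mean_cost
    from Have
    where Cost is not missing
    group by DrugName, Rx_yr
    order by DrugName, Rx_yr;
quit;

/* 2. Sort the claims so they can be merged by drug/year. */
proc sort data=Have out=have_sorted;
    by DrugName Rx_yr;
run;

/* 3. Merge the master table onto the claims and fill in only the missing costs. */
data want;
    merge have_sorted (in=in_claims) drug_year_cost;
    by DrugName Rx_yr;
    if in_claims;                             /* keep only rows from the claims data */
    if missing(Cost) then Cost = mean_cost;   /* observed costs are left untouched */
    drop mean_cost;
run;
```

If a drug/year has no observed cost at all, it has no row in the master table, so Cost stays missing for those records and they would need a separate rule.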

 

You can probably use PROC STANDARD for this to replace missing values, as long as you're okay with using the mean value of the drugs per year. If not you'll need another method.

 

http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473725.htm
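
As a minimal sketch of the PROC STANDARD suggestion, assuming the group mean is acceptable and Cost is numeric: the REPLACE option fills missing values with the variable mean, and the BY statement makes that mean specific to each drug/year. The output data set names have_sorted and have_impute are placeholders.

```sas
/* Sketch only: have_sorted and have_impute are placeholder names. */

/* BY-group processing requires the data to be sorted by the BY variables. */
proc sort data=Have out=have_sorted;
    by DrugName Rx_yr;
run;

/* REPLACE without MEAN= or STD= only replaces missing values of Cost
   with the mean of Cost within each DrugName/Rx_yr group. */
proc standard data=have_sorted out=have_impute replace;
    by DrugName Rx_yr;
    var Cost;
run;
```

Groups in which every Cost is missing have no mean to use, so those values remain missing.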
