How to predict a missing value in SAS

Reply
New Contributor
Posts: 3

How to predict a missing value in SAS

I'm trying to predict the value of a variable using either SAS EG or Enterprise Miner. I have some missing data for the StackHeight Variable shown below. Using the Sku and Supplier I can predict what the stack height might be. How would I do this?

 

data test;
input sku $ 1-3 Supplier $ 5-12 StackHeight 13 ;
datalines;
123 Sony 5
123 Sony 5
123 Sony
124 LG 4
124 LG 3
124 LG
124 LG 4
125 Toshiba 5
125 Toshiba 5
125 Toshiba
;
run;

Trusted Advisor
Posts: 1,683

Re: How to predict a missing value in SAS

Do you mean:

data test;
input  StackHeight 13  @;
if  StackHeight not(missing) then 
input sku $ 1-3 Supplier $ 5-12 StackHeight 13 ;
else delete;  /* ignore lines when missing */
datalines;
123 Sony 5
123 Sony 5
123 Sony
124 LG 4
124 LG 3
124 LG
124 LG 4
125 Toshiba 5
125 Toshiba 5
125 Toshiba
;
run;
Trusted Advisor
Posts: 1,683

Re: How to predict a missing value in SAS

[ Edited ]

You will get same result with:
data test;
input ... ; /* your code */
if missing(StackHeight ) then delete;

datalines;

  .... your data ...
run;

New Contributor
Posts: 3

Re: How to predict a missing value in SAS

No, I want to keep the rows with the missing stack height values.  What I want to do is predict what they might be using the values that are populated for the other rows. 

Respected Advisor
Posts: 2,170

Re: How to predict a missing value in SAS

[ Edited ]

SAS666 wrote:

I'm trying to predict the value of a variable using either SAS EG or Enterprise Miner. I have some missing data for the StackHeight Variable shown below. Using the Sku and Supplier I can predict what the stack height might be. How would I do this?

 

data test;
input sku $ 1-3 Supplier $ 5-12 StackHeight 13 ;
datalines;
123 Sony 5
123 Sony 5
123 Sony
124 LG 4
124 LG 3
124 LG
124 LG 4
125 Toshiba 5
125 Toshiba 5
125 Toshiba
;
run;


I don't think this is a SAS issue at all. I think the problem is there no way to "predict" the value of a missing in this situation if all you have is the SKU and Supplier (at least as far as I understand the meaning of the word "predict"). I mean, forget SAS for a second, let's just talk concepts, how would you, @SAS666, do such a prediction if all you had was pencil and paper???\

 

I would also like to state that it is easy to "fill in" a number when StackHeight is missing, but that isn't the same as a "prediction". So what do you mean?

--
Paige Miller
New Contributor
Posts: 3

Re: How to predict a missing value in SAS

Posted in reply to PaigeMiller

Well, in this case Suppliers usually use the same stack height across their products.  There are actually a few more variables I would use to make the prediction. In this case it is for TV size.  So if a TV has a sku of 123 and supplier of LG and is 50-60 inches tall, I'd look to see what LG usually stacks for TVs in that size range and then input that value. That's how i've been doing it manually.  I'm trying to figure out how to do it in SAS so I can do it on a larger scale. 

 

So in my example data I'd say the missing Sony value should be 5 and the missing LG value should be 3.75 and Toshiba would be 5. 

 

Does that make sense?

Respected Advisor
Posts: 2,170

Re: How to predict a missing value in SAS

[ Edited ]

So in my example data I'd say the missing Sony value should be 5 and the missing LG value should be 3.75 and Toshiba would be 5. 

 

It would have been very helpful to both of us if you stated this in your first post. If you say LG Value should be 3 2/3 (3.667), then I understand what you are doing and the following code ought to work. If you really meant LG value is 3.75 then I don't know how you got that.

 

UNTESTED CODE

 

proc glm data=test;
    class sku supplier;  
    model stackheight=sku*supplier;
    output out=pred predicted=predicted;
run;
quit;

 

--
Paige Miller
Super User
Posts: 21,530

Re: How to predict a missing value in SAS


SAS666 wrote:

Well, in this case Suppliers usually use the same stack height across their products.  There are actually a few more variables I would use to make the prediction. In this case it is for TV size.  So if a TV has a sku of 123 and supplier of LG and is 50-60 inches tall,


 

There's nothing in your sample data about height....

 

Given what you've shown, you could consider PROC STDIZE to replace the missing with the mean for the group, or some other metric (median/max).  See an example here and check the documentation for the other options.

https://gist.github.com/statgeek/cbd6bd253b4ab4828dd71d0ce3d17199

Esteemed Advisor
Posts: 5,124

Re: How to predict a missing value in SAS

For numerical values, fill in with the most popular value, i.e. the mode

 

data test;
input sku $ Supplier $ StackHeight;
datalines;
123 Sony 5
123 Sony 5
123 Sony .
124 LG 4
124 LG 3
124 LG .
124 LG 4
125 Toshiba 5
125 Toshiba 5
125 Toshiba .
;

proc means data=test noprint;
where StackHeight is not missing;
by sku supplier notsorted;
var StackHeight;
output out=modes mode=mode;
run;

proc sql;
update test
set StackHeight = 
    (select mode 
     from modes 
     where sku=test.sku and supplier=test.supplier)
where StackHeight is missing;
quit; 
PG
Esteemed Advisor
Posts: 5,124

Re: How to predict a missing value in SAS

For categorical values, you have to find the most popular value yourself:

 

data test;
input sku $ Supplier $ StackHeight $;
datalines;
123 Sony 5
123 Sony 5
123 Sony .
124 LG 4
124 LG 3
124 LG .
124 LG 4
125 Toshiba 5
125 Toshiba 5
125 Toshiba .
;

proc sort data=test; by sku supplier StackHeight; run;

proc sql;
create table freqs as
select sku, supplier, StackHeight, count(*) as n
from test where StackHeight is not missing
group by sku, supplier, StackHeight
order by sku, supplier, n desc;
quit;

data modes;
set freqs; by sku supplier;
if first.supplier;
drop n;
run;

proc sql;
update test
set StackHeight = 
    (select StackHeight 
     from modes 
     where sku=test.sku and supplier=test.supplier)
where StackHeight is missing;
quit; 

PG
Ask a Question
Discussion stats
  • 9 replies
  • 110 views
  • 0 likes
  • 5 in conversation