
using proc standardize to put the missing value to zero

Frequent Contributor
Posts: 111
Hello,

 

I am using PROC STDIZE to set missing values to zero without having to name every numeric variable. Nice proc...

However, I am receiving this warning:

 

WARNING: At least one of the scale and location estimators of variable var12 can not be computed. Variable var12 will
not be standardized.

 

I believe this warning appears only when all the values of a particular variable are missing. Is it possible to keep using this procedure to set missing values to zero without getting this warning?

 

Regards,

 

Alain

 

proc stdize data=Dataset1 out=dataset2 reponly missing=0;
run;



All Replies
Trusted Advisor
Posts: 1,270

Re: using proc standardize to put the missing value to zero

Hi, 

 

Please try this

 

proc stdize data=Dataset1(drop=var12) reponly out=dataset2 missing=0;
run;

Frequent Contributor
Posts: 111

Re: using proc standardize to put the missing value to zero

Good evening,

 

I have tested two versions of the code, and the warning message appears when all the values of a variable are missing.

 

data have;
infile datalines delimiter=',';
input A B C;
datalines;
1, ,3
3, ,4
3, ,5
., ,7
2, ,3
6, ,.
.,1 ,2
;
run;
proc stdize data=have reponly out=want missing=0;
var _numeric_;
run;

 

The following code produces the warning because all the values of B are missing:

data have;
infile datalines delimiter=',';
input A B C;
datalines;
1, ,3
3, ,4
3, ,5
., ,7
2, ,3
6, ,.
.,.,2
;
run;
proc stdize data=have reponly out=want missing=0;
var _numeric_;
run;

Frequent Contributor
Posts: 111

Re: using proc standardize to put the missing value to zero

Thanks for responding to my question.

 

I have tested your code and it works, but if I drop a variable I will not be able to set its missing values to zero.

 

Regards,

Alain

Solution
‎03-12-2018 06:04 AM
Super User
Posts: 13,084

Re: using proc standardize to put the missing value to zero

First, ask yourself: does it make sense to "standardize" a variable with all missing values to anything? Why all of the values were missing might be a critical question before attempting to use values of 0 in further steps.

 

 

You will likely have to add the value of 0 in a separate data step, either before or after the Proc Stdize call.
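A minimal sketch of what such a separate data step could look like, assuming the goal is simply to replace every numeric missing value with 0. The dataset names have and want, the array name nums, and the index _i are placeholders chosen for illustration; the small have dataset is made up so the example runs on its own.

```sas
/* Made-up sample data: B is missing on every row */
data have;
  input A B C;
  datalines;
1 . 3
3 . 4
. . 7
6 . .
;
run;

/* Replace every numeric missing value with 0 */
data want;
  set have;
  /* _numeric_ picks up all numeric variables defined so far, */
  /* so none of them has to be named explicitly               */
  array nums {*} _numeric_;
  do _i = 1 to dim(nums);
    if missing(nums{_i}) then nums{_i} = 0;
  end;
  drop _i;  /* the loop index is not part of the data */
run;
```

Because this step does not estimate any location or scale, a variable whose values are all missing is handled like any other. Whether 0 is a sensible value for such a variable is still the question to answer first.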

Frequent Contributor
Posts: 111

Re: using proc standardize to put the missing value to zero

Good evening,

 

You make a very good point regarding the use of a variable whose values are all missing. My client is asking me to develop algorithms either to extract data or to perform calculations.

Our datasets are relatively large, and we don't know at the beginning that, in a certain dataset, all the values of a particular variable are missing.

I recently experimented with this procedure to initialize missing values to zero without having to name all the variables, but it sounds like it is not appropriate when all the values are missing.

 

Regards,

 

Alain

 

Super User
Posts: 13,084

Re: using proc standardize to put the missing value to zero


@alepage wrote:

Good evening,

 

Our datasets are relatively large, and we don't know at the beginning that, in a certain dataset, all the values of a particular variable are missing.

 

 


The first step with almost any project is to get some familiarity with the data. For numeric variables, a quick

proc means data=have;
   var _numeric_;
run;

would show you the variables with all missing values (N=0), and possibly other problematic variables through their mean, min, or max values.

 

For instance, for a numeric variable with all missing values, I might trace back to the documentation to see 1) whether it is supposed to be numeric and 2) whether the data actually contained values that were not acceptable as numeric, such as values containing special characters or currency read with the incorrect informat. A "large" data set with any variable missing for all records often points to some issue with data extraction by the client, incomplete specifications of the data received (including changes without notification), or read specification errors on my part.
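With many variables, the PROC MEANS check can also be captured in a dataset instead of being read from the listing. The following is a rough sketch, assuming SAS 9.3 or later for the STACKODSOUTPUT option. The dataset names stats and all_missing are placeholders, and the sample have dataset is made up so the example is self-contained.

```sas
/* Made-up sample data: B is missing on every row */
data have;
  input A B C;
  datalines;
1 . 3
3 . 4
. . 7
6 . .
;
run;

/* Capture N and NMISS for every numeric variable, one row per variable */
ods output summary=stats;
proc means data=have n nmiss stackodsoutput;
  var _numeric_;
run;

/* Keep only the variables that have no non-missing values */
data all_missing;
  set stats;
  where N = 0;
  keep Variable NMiss;
run;

proc print data=all_missing noobs;
run;
```

The resulting list can then be reviewed before deciding whether filling those variables with 0 makes sense at all.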

☑ This topic is solved.
