Axsta
Calcite | Level 5
I calculated the median using PROC MEANS, and I would like to know how to use that median in a DATA step to impute the missing values.
Using the output option creates a data set, and reading it with SET in the DATA step is not what I intend. Is there any way that the single value can be used directly?
7 REPLIES
japelin
Rhodochrosite | Level 12

PROC MEANS is a procedure for calculating basic statistics, so I don't think it can modify the original data set.

Tom
Super User

If you want to manipulate data, having it in data sets is the way to go.

 

But if the goal is just to replace the missing values with the MEDIAN then just skip PROC MEANS and use PROC STDIZE with METHOD=MEDIAN and the REPONLY option, which replaces missing values without standardizing the nonmissing ones.
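
A minimal sketch of the PROC STDIZE approach, assuming the same sashelp.heart data and Cholesterol variable used elsewhere in this thread. The output data set name Impute is arbitrary. REPONLY asks the procedure to replace missing values with the location measure (here the median) and leave nonmissing values unchanged.

```sas
/* Sketch: impute missing Cholesterol with the median in one step. */
/* Impute is an illustrative output name, not from the original post. */
proc stdize data=sashelp.heart out=Impute method=median reponly;
   var Cholesterol;
run;

/* Check the result: NMISS should be 0 after imputation. */
proc means data=Impute N NMISS MEDIAN;
   var Cholesterol;
run;
```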

andreas_lds
Jade | Level 19

If you don't want to use PROC STDIZE, merging the result of PROC MEANS with the original data set could be an option.
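
One way this could look, as a sketch rather than the only form of the merge: PROC MEANS writes a one-row data set, and the DATA step reads that row once so its value is carried along for every observation. The names MedianOut and Chol_Median are made up for illustration.

```sas
/* Write the median to a one-row data set. */
/* MedianOut and Chol_Median are illustrative names. */
proc means data=sashelp.heart noprint;
   var Cholesterol;
   output out=MedianOut(keep=Chol_Median) median=Chol_Median;
run;

/* Read the single row on the first iteration only. Variables read */
/* with SET are retained, so Chol_Median is available on every row. */
data Impute;
   if _n_ = 1 then set MedianOut;
   set sashelp.heart;
   if missing(Cholesterol) then Cholesterol = Chol_Median;
   drop Chol_Median;
run;
```

This does use a SET statement for the statistics data set, so it may not match what the original question had in mind, but it avoids macro variables entirely.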

Aku
Obsidian | Level 7

With PROC SQL, using a self join and the COALESCE function, it's pretty easy to do this.
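
The exact query was not posted, so the following is only a sketch of the COALESCE idea. As an assumption, it joins the data to a one-row PROC MEANS output rather than performing a literal self join. MedianOut, Chol_Median, and Cholesterol_Imputed are illustrative names.

```sas
/* One-row data set holding the median (illustrative names). */
proc means data=sashelp.heart noprint;
   var Cholesterol;
   output out=MedianOut median=Chol_Median;
run;

/* MedianOut has exactly one row, so the join keeps one output row */
/* per input row. COALESCE returns the first nonmissing argument.  */
proc sql;
   create table Impute as
   select a.*,
          coalesce(a.Cholesterol, b.Chol_Median) as Cholesterol_Imputed
   from sashelp.heart as a, MedianOut as b;
quit;
```

PROC SQL will likely write a note about a Cartesian product to the log; with a one-row table on one side that is expected.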

 

Rick_SAS
SAS Super FREQ

For an example of using PROC STDIZE to impute by using the median, see the article "Mean imputation in SAS," which mentions the METHOD=MEDIAN option.

 

You can also get the value from PROC MEANS by reading it into a macro variable, but it requires more effort and more steps:

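/* Write N, NMISS, and MEDIAN to a data set via ODS */
proc means data=sashelp.heart N NMISS MEDIAN stackods;
   var Cholesterol;
   ods output Summary=MedianOut;
run;

/* Copy the median into the macro variable Median */
data _NULL_;
   set MedianOut;
   call symputx("Median", Median);
run;

/* Write the value to the log */
%put &=Median;

/* Replace missing values with the median */
data Impute;
   set sashelp.heart;
   if missing(Cholesterol) then
      Cholesterol = &Median;
run;

/* Verify: NMISS should now be 0 */
proc means data=Impute N NMISS MEDIAN;
   var Cholesterol;
run;
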
ballardw
Super User

@Axsta wrote:
I calculated the median using PROC MEANS, and I would like to know how to use that median in a DATA step to impute the missing values.
Using the output option creates a data set, and reading it with SET in the DATA step is not what I intend. Is there any way that the single value can be used directly?

Since there are at least two different methods for creating an output data set from PROC MEANS, and the structure can differ depending on the options chosen, the first thing you should do is show the PROC MEANS code used to generate the output data set.

 

Impute the missing values where? In the result of proc means? A different data set? Single value of what?
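
To illustrate the point about differing structures, here is a sketch of the two methods on sashelp.heart. Out1, Out2, and Chol_Median are illustrative names, and the column naming described in the comments is what these options typically produce; PROC CONTENTS shows what you actually get.

```sas
/* Method 1: OUTPUT statement. One row, with the automatic */
/* _TYPE_ and _FREQ_ columns plus the statistic you name.  */
proc means data=sashelp.heart noprint;
   var Cholesterol;
   output out=Out1 median=Chol_Median;
run;

/* Method 2: ODS OUTPUT of the Summary table. Without STACKODS, */
/* columns are typically named per variable and statistic.      */
proc means data=sashelp.heart median;
   var Cholesterol;
   ods output Summary=Out2;
run;

/* Compare the column names and order in the two data sets. */
proc contents data=Out1 varnum; run;
proc contents data=Out2 varnum; run;
```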

Axsta
Calcite | Level 5

[Attached image: Axsta_0-1618484457267.png]

I am preparing for the SAS Base Programming Specialist exam, and this was one of the practice questions. I am not sure whether the exam expects you to use advanced macro techniques and CALL SYMPUT.

 
