BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
venkatard
Calcite | Level 5

I have 2 data sets one with mean value for each variable(proc means output) and the other with values. I want to replace the missing values with mean value for each particular variable in the dataset.

Any help please.

I tried call symput but does it work for multiple variables?

1 ACCEPTED SOLUTION

Accepted Solutions
Ron_MacroMaven
Lapis Lazuli | Level 10

On Tue, 16 Apr 2013 17:45:04 +0100, Ian

<ian.wakeling@HANANI.QISTATS.CO.UK> wrote:

>How about using PROC STDIZE?

proc stdize data    = old 

            out     = new

            missing = mean

            reponly;

            var       _numeric_;

run;

>You can also impute with the median or read values from a second data set.

View solution in original post

11 REPLIES 11
kuridisanjeev
Quartz | Level 8

Hi..

Not sure how you going to accomplish this with Call Symput..If possible can you provide your code which you have tried???

Other then that coalesce function will be helpful for your requirement.

Regards.

Sanjeev.K

allurai0412
Fluorite | Level 6

hi ,

try this...but............please atleast post your working / sample datasets.....

data samp;
infile datalines missover;
input id sal ;
if missing(sal) then  sal=symget("avgsal");
datalines;
001 5000
002 7000
003
004 7832
005
;
run;

proc sql noprint ;
select avg(sal)  into : avgsal  from samp ;
run;

Regards

ALLU

data_null__
Jade | Level 19

PROC STDIZE, for 1 or more variables in the VAR statement and BY groups if you like.

data samp;
   infile datalines missover;
  
input id:$3. sal ;
   datalines;
001 5000
002 7000
003
004 7832
005
;;;;
   run;
proc stdize reponly missing=mean out=poked;
   var sal;
   run;
Ron_MacroMaven
Lapis Lazuli | Level 10

1. read in the data structure of your data.

make an array of all the numeric variables

2. read in the single row of the Means data set

whose variable have been autonamed

so that Mean is the prefix of each variable.

The issues are getting the names of the variables aligned with their means.

so that the array references are correct.

alternately you can use sql to extract the list of variable

and feed that list to Summary

which you can then use in the array statements.

DATA Update;*todo: (drop = Mean:);

*get the data structure for the array Valu;

if 0 then set Library.Mydata;

array Valu(*) _numeric_;

*read the Means data set -- one row -- into the set of values for updating;

set Means;

array Mean(*);

do until(EndoFile);

   set Library.MyData end = EndoFile;

   do i = 1 to dim(Valu);

      if Valu(I) eq . then Valu(I) = Mean(I);

      end;

   output;

   end;

stop;

run;

PROC SQL; describe table &syslast;

           quit;

Ron Fehd  matching maven

Ron_MacroMaven
Lapis Lazuli | Level 10

On Tue, 16 Apr 2013 17:45:04 +0100, Ian

<ian.wakeling@HANANI.QISTATS.CO.UK> wrote:

>How about using PROC STDIZE?

proc stdize data    = old 

            out     = new

            missing = mean

            reponly;

            var       _numeric_;

run;

>You can also impute with the median or read values from a second data set.

Sudhakar_A
Calcite | Level 5

you can merge this 2 datasets with by variables which you have used in proc means to create multiple means values. then you can assign the mean values to missing values if needed.

venkatard
Calcite | Level 5

I think i will give clear idea.

First dataset with means.

mean_score   mean_forum   mean_subs

  23.45               43.45               12.22

Second data set with original values.

Score    forum    subs

12            23       .

145           12      87  

.               12      98

15              .       67

Now i want to replace the missing values in second data set with the mean value.

For example in score if any value is missing it should be replaced with mean_score.

Sudhakar_A
Calcite | Level 5

take all mean values to respective Macro variables. Then in the next step use symget or simple if condition to replace the values.

Data _null_;

     set mean_data;

     call symput("Mscore", strip(put(mean_score)));

     call symput("Mforum", strip(put(mean_forum)));

     call symput("Msubs", strip(put(mean_subs)));

run;

data hh;

     set values;

     if score=. then score=&mean_score;

     else score=score;

     ....

     ....

     ....

run;

do the same for other variables, Hope this helps

Ksharp
Super User

It is easy :

data want;

if _n_ eq 1 then set mean_data;

set have ;

score=coalesce( score ,mean_score) ;

forum=coalesce( forum,mean_forum);

...........

run;

Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1337 views
  • 4 likes
  • 7 in conversation