BookmarkSubscribeRSS Feed
Alica
Fluorite | Level 6

Hi, 

I am trying to calculate an average value for different variables based on the provider. 

That is my code so far: 

 

proc sql; 
select round(mean(durchschnitt_pnps)) into: avg_pNPS
from xwrk.nps_durchschnittsebenen1; 
select round(mean(durchschnitt_cnps)) into: avg_cNPS
from xwrk.nps_durchschnittsebenen1; 
select round(mean(durchschnitt_enps)) into: avg_eNPS
from xwrk.nps_durchschnittsebenen1; 
select round(mean(durchschnitt_brand_attributes)) into: avg_brand_attributes
from xwrk.nps_durchschnittsebenen1;
quit; 

data xwrk.nps_ebenen_geglättet; 
set xwrk.nps_durchschnittsebenen1; 
if missing(durchschnitt_pnps) then durchschnitt_pnps = &avg_pNPS;
if missing(durchschnitt_cnps) then durchschnitt_cnps = &avg_cNPS;
if missing(durchschnitt_enps) then durchschnitt_enps = &avg_eNPS;
if missing(durchschnitt_brand_attributes)then durchschnitt_brand_attributes = &avg_brand_attributes;
run;

Thanks for help 🙂 

3 REPLIES 3
ballardw
Super User

I don't see anything in that code that is restricting or using "provider"

 

I think your description is incomplete. It appears that what you are actually attempting is to replace missing values with the mean, and again your code doesn't use provider anywhere.

 

Proc summary/means is one of the easiest ways to get the mean (and other statistics) of multiple variables and use of a CLASS or BY variable gets the summary for each level of that variable.

 

You don't mention the name of your provider variable so this is a guess:

Proc summary data=xwrk.nps_durchschnittsebenen1 nway;
   class provider; /*replace provider with the name of your provider variable if different*/
  var durchschnitt_pnps  durchschnitt_cnps  durchschnitt_enps
        durchschnitt_brand_attributes;
  output out= meandataset (drop=_type_ _freq_) mean= /autoname;
run;

The above code creates a data set with one observation for each provider and the mean values of the variables on the VAR statement. The /autoname option attempts to place the statistic as a suffix to the variable. Since your variable durchschnitt_brand_attributes is close to the length limit of SAS variable names the base name will be truncated enough to allout adding "_mean" to the name.

 

Then update your data using that data set . One way:

proc sql;
   create table xwrk.nps_ebenen_geglättet as
   select a.provider
          ,coalesce(a.durchschnitt_pnps,b.durchschnitt_pnps_mean) as durchschnitt_pnps
          /* repeat for the other variables*/
   from xwrk.nps_durchschnittsebenen1 as a
        left join 
        meandataset as b
        on a.provider=b.provider
   ;
quit;

I don't have your data so can't test or know other variables that you expect in the output.

The COALESCE function returns the first non-missing value in the list of values/variables provided as arguments.

PaigeMiller
Diamond | Level 26

You are working very hard, using the wrong tools (macro variables and SQL), to program something that SAS has already programmed for you.

 

PROC STDIZE will do that for you.

 

Example:

 

proc stdize data=xwrk.nps_durchschnittsebenen1 out=xwrk.nps_ebenen_geglättet reponly missing=mean;
     by provider;
     var durchschnitt_pnps durchschnitt cnps durchschnitt_enps durchschnitt_brand_attributes;
run;

 

This assumes your data has been sorted by PROVIDER. Including a PROC SORT if necessary, that's 6-7 lines of code. How easy is that?

 

In addition, your problem description needs to be more complete in the future. You said: "I am trying to calculate an average value for different variables based on the provider" but that's incomplete. Based on your code, you also want to replace a missing with the mean for that provider, which was never mentioned. You will get better and faster responses in the future by including a complete description of the problem.

 

Additional advice which I have given many people: stop choosing PROC SQL to perform commonly used math and statistical calculations. SAS has programmed many many many many many many calculations that are commonly used so that you don't have to. PROC SQL is usually inefficient and tedious compared to what SAS provides. Just because you CAN program something in SQL does not mean that you SHOULD program it in SQL. If you are not sure if something has already been programmed in SAS, ask first!

--
Paige Miller
PaigeMiller
Diamond | Level 26

Fixing a typo in my answer above, the fix is shown in red

 

var durchschnitt_pnps durchschnitt_cnps durchschnitt_enps durchschnitt_brand_attributes;
--
Paige Miller

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 579 views
  • 1 like
  • 3 in conversation