I want to run PROC REG on some data with many variables. However, some of the variables were collected at the plot level while others were collected in transects within the plot. If I take the mean of the transect variables, that gives me a plot-level variable that I can then use in PROC REG. However, if I use PROC MEANS to get the mean of these transect data points, it also gives me N, MAX, MIN, and STD output. If I run PROC REG on the data generated by PROC MEANS, it includes the 4 variables I mentioned in the regression, which I obviously don't want. So I want to delete those 4 variables and have PROC REG use the MEAN variable exclusively.
proc means data=AllEvents;
class plotID;
var logN2O logVWC Soil_N Soil_C SoilC_N LitterN LitterC LitterC_N pH_H2O pH_CaCl2 logNH4 logTN;
output out=plotmeans;
run;
data plotmeans;
if "_STAT_"="N" then delete;
if "_STAT_"="MAX" then delete;
if "_STAT_"="MIN" then delete;
if "_STAT_"="STD" then delete;
output;
run;
The PROC MEANS step works fine, but my "if...then...delete;" statements don't work. They don't generate an error, but I do get the message: NOTE: Variable _STAT_ is uninitialized.
How can I generate an output of the mean of the variables I'm interested in at the plot level that I can then run PROC REG with?
If you just want PROC MEANS to generate the MEAN then tell it that. If you add MEAN= to the OUTPUT statement then SAS will store the means into the variables with the same names as the input variables.
proc means data=AllEvents;
class plotID;
var logN2O logVWC Soil_N Soil_C SoilC_N
LitterN LitterC LitterC_N pH_H2O pH_CaCl2 logNH4 logTN
;
output out=plotmeans mean=;
run;
Your data step has a number of problems. First, you are not telling it what data to read. Second, you are comparing two string literals in your IF statements that can never be equal. You probably could have gotten it to work with code like this, which uses a subsetting IF instead of IF/THEN/DELETE.
data plotmeans;
set plotmeans;
if _stat_='MEAN';
run;
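One further point that may matter before running PROC REG: with a CLASS statement and no NWAY option, the OUT= data set also contains an overall row (_TYPE_=0) with the means across all plots, and it carries the automatic variables _TYPE_ and _FREQ_. A minimal sketch of one way to handle that is below. The NWAY and NOPRINT options and the DROP= data set option are standard, but the MODEL statement is purely illustrative: the choice of response and predictors is an assumption, not something stated in the question.

```sas
/* Sketch only: NWAY keeps just the per-plot rows and omits the   */
/* overall (_TYPE_=0) row; NOPRINT suppresses the printed table.  */
proc means data=AllEvents nway noprint;
  class plotID;
  var logN2O logVWC Soil_N Soil_C SoilC_N
      LitterN LitterC LitterC_N pH_H2O pH_CaCl2 logNH4 logTN;
  /* MEAN= with no names stores each mean under the input variable name. */
  output out=plotmeans(drop=_type_ _freq_) mean=;
run;

/* Hypothetical model: response and predictors chosen for illustration. */
proc reg data=plotmeans;
  model logN2O = logVWC Soil_N pH_H2O;
run;
quit;
```

Since PROC REG only uses the variables named in the MODEL statement, dropping _TYPE_ and _FREQ_ is a tidiness choice rather than a requirement.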
You can control the output in various ways. To select only the MEAN, start with the OUTPUT statement documentation.
Then look at the STACKODS option to see if you want a different structure.
proc means ...;
....
output out=want mean = /autoname;
run;
or
proc means data=... MEAN STACKODS;
....
ods output summary=want_stacked;
run;
You need a set statement after the data statement:
set plotmeans;
It reads the input data set into the data step.
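For reference, here is a rough sketch of how the original data step might look with the SET statement added, assuming plotmeans was created by the original OUTPUT statement (no statistic keywords), so that it contains a character variable _STAT_ with the values N, MIN, MAX, MEAN, and STD. The variable name is also written without quotes, so it is compared as a variable rather than as a string literal.

```sas
/* Sketch: read the PROC MEANS output and drop the unwanted statistic rows. */
data plotmeans;
  set plotmeans;                                     /* the input data set   */
  if _stat_ in ('N', 'MAX', 'MIN', 'STD') then delete; /* keep only MEAN rows */
run;
```

No explicit OUTPUT statement is needed here, because the data step writes each remaining observation automatically at the end of each iteration.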