PI3
Obsidian | Level 7

Hi all! I ran across an inconsistency, and it's leading me to question the validity of the standard error bars that I made using proc sgplot. I'm a pretty new SAS user, so it's very possible I made a mistake, but I cannot figure out what it is for the life of me. What I've noticed is that the standard error bars I made using proc gplot (by creating a new dataset with three y values and manually adding or subtracting the SE from the mean) are very different from the ones I got from proc sgplot (by using limitstat). The ones that I got from proc sgplot also don't entirely line up with the calculated standard errors in the data we have. Something about it is 'off'.

 

Thank you! I appreciate the help. 

 

*just your basic proc means;

*Summing # pos nests, also dstats;
proc means data=Coop noprint;
by site rep week trt;
var nest;
output out=posnest n=nnest sum(nest)=sumnest mean=xnest stderr=senest std=sdnest var=vnest;
run;

 

/*the proc sgplot that is not correct*/
title "Mean positive nest data=posnest";
proc sgplot data=posnest;
vline week / response=xnest group=trt stat=mean limitstat=stderr numstd=2;
yaxis label='X nest +/- SE nest with proc sgplot';
run;
title;

 

*the proc g plot, which is doing what I want it to;

*this first section is adding the additional yvalues;

data reshapemean(keep=site week trt xnest senest yvar);
set posnest;
yvar=xnest;
output;
yvar=xnest-senest;
output;
yvar=xnest+senest;
output;
run;

 

proc gplot data=reshapemean;
plot yvar*week=trt;
title3 'X nest +/- SE nest with proc Gplot';
run;

 

[Images: using proc sgplot (wrong); using proc Gplot (accurate); peek at the data being graphed]

1 ACCEPTED SOLUTION

Reeza
Super User
I think you need to use the raw data with SGPLOT not the summarized data if you're asking it to create the summary statistics.


9 REPLIES
Reeza
Super User
Do you have an axis statement somewhere limiting the yvar to 1?
PI3
Obsidian | Level 7

No.. where/why do I need that? If this is about the proc gplot.. it is supposed to be plotting three yvalues. The mean, the mean - SE and the mean + SE.

Reeza
Super User

@PI3 wrote:

No.. where/why do I need that? If this is about the proc gplot.. it is supposed to be plotting three yvalues. The mean, the mean - SE and the mean + SE.


Your gplot graph seems to truncate at 1.0, while the SGPLOT goes to 1.5. 

So many of the bars truncating at 1.0 seem problematic to me.

PI3
Obsidian | Level 7

That's true. I need to fix that. I think there is another error, though. If you look at the first three weeks in particular you will notice that the bars are not the same, even taking the truncated maximum values into account.

Reeza
Super User
I think you need to use the raw data with SGPLOT not the summarized data if you're asking it to create the summary statistics.
ballardw
Super User

When you use options like LIMITSTAT

vline week / response=xnest group=trt stat=mean limitstat=stderr numstd=2;

you are telling SGPLOT to calculate the stderr, which is not what you did for Gplot.

And since you summarized the data, you have only one record per Site and Rep for each Week and Trt combination. That very likely means SGPLOT is working from a very different number of records than PROC MEANS did, so the calculation is different, and the mean plotted is taken across the Sites and Reps.

 

If you are going to summarize data and plot then summarize everything.

And maybe provide actual input data. I am not sure that either of the plots actually shows what you say you want. One matches your expectations, but I'm not sure the approach you show is actually correct.
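One way to see the mismatch described above is to compute by hand the statistics that VLINE derives from the summarized data. This is only a sketch: POSNEST, XNEST, NNEST and SENEST come from the original post, and the choice of a CLASS statement and MAXDEC= is illustrative.

```sas
/* Sketch: summarize the already-summarized data the way VLINE would. */
/* POSNEST has one row per site*rep*week*trt from the first PROC MEANS. */
proc means data=posnest n mean stderr maxdec=3;
  class week trt;   /* one cell per week*trt, pooling site and rep */
  var xnest;        /* per-site/rep means, treated as observations */
run;
```

If N and the standard error printed here differ from NNEST and SENEST in POSNEST, the two plots are drawing different quantities. Note also that the original VLINE statement uses numstd=2, while the GPLOT data step adds and subtracts a single SE.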

PI3
Obsidian | Level 7

Thank you Ballardw and Reeza! Using the raw data set solved my issue with proc sgplot and produced the same graph as the proc gplot (without the truncated maximum values..). I really appreciate the help from you both.

PI3
Obsidian | Level 7

I just want to post an update on this for other people who are new to SAS (like me) and would like a further explanation of what was going on. I know that I've looked at old posts on this forum before when trying to figure things out, so I want to leave this as clear as possible for posterity. If you are already familiar with SAS, this won't be new or revelatory.

 

I didn't understand what I was doing before when I was trying to run my sgplot and gplot on the same data set.

 

With the gplot, I am taking the results of the proc means and manually adding or subtracting one standard error (~68% confidence) from the mean. (That's the purpose of the reshape data set.) Then, I'm taking that data and plotting it.

 

With the sgplot, SAS is calculating the mean and standard error for me. I just have to specify what statistic I'm interested in and the number of standard errors. Therefore, the sgplot needs the data set that I ran the proc means on (the input to proc means), not the proc means output that I used for the gplot. So, in other words, I am using the raw, un-analyzed data for sgplot and the analyzed data for gplot.
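As a short worked note (the notation is illustrative, not from the posts above), the error bars for one week and treatment with n records y_1, ..., y_n follow the usual sample formulas, with k standing for the numstd= value:

```latex
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2},
\qquad
\mathrm{SE} = \frac{s}{\sqrt{n}},
\qquad
\text{bars} = \bar{y} \pm k\,\mathrm{SE}
```

Both s and n depend on which records go in, so feeding the same formula the analyzed data instead of the raw data gives different bars.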

 

P.S. There is one hitch for our study, which is that we have to run an analysis before getting our 'raw' data because we used sub-samples. That wouldn't be the case for most people, though. Also, we are only using one SE for this graph because it's just for informal, internal purposes. It is easier to read and we just wanted to take a look at the data graphically.

 

Final, correct, graphs:

 

[Images: sgplot.png, gplot.png]

 

Final code:


*proc means on data=m1 (second proc means on data=coop)(m2);
proc sort data=m1; by week trt; run;
proc means data=m1 noprint;
by week trt;
var xnest;
output out=m2 n=n2nest mean=x2nest stderr=se2nest std=sd2nest var=v2nest;
*min=minwa minqa minlwa max=maxwa maxqa maxlwa;
run;
/*
proc print data=m2; title 'data=m2 dstats (second proc means on data=m1, by trt week and boxflag)'; run;
*/
data reshape2;
set m2;
yvar=x2nest;
output;
yvar=x2nest-se2nest;
output;
yvar=x2nest+se2nest;
output;
run;

 

title "Mean positive xnest data=m1 numstd=1 sgplot A**";
proc sgplot data=m1;
vline week / response=xnest group=trt stat=mean limitstat=stderr numstd=1; /* numstd= number of standard errors */
yaxis label='X nest +/- SE nest with proc sgplot' VALUES = (0 TO 65 BY 5);
run;
title;

 


/* Plot the error bars using the HILOTJ interpolation*/
symbol1 INTERPOL=HILOTJ; *HILOTJ= high-low values, but the dataset with 3 y-values makes it standard error bars;
proc gplot data=reshape2;
plot yvar*week=trt;
title3 'Plot x2nest +/- se2nest (gplot reshape on m2) A**';
run;
title;
quit;

Reeza
Super User
Just a recommendation: you should avoid GPLOT graphics. They're not as well supported anymore, and any new developments are happening in the SG procedures. Plus, the SG procedures are easier to use and generate better quality graphics.
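For anyone who wants to keep the precomputed statistics but drop GPLOT, here is a minimal sketch of one possible SG equivalent. It assumes M2 has one row per week and trt with X2NEST and SE2NEST, as in the final code above. The data set name M2LIMITS, the variables LOWER and UPPER, and the plot name 'means' are made up for illustration.

```sas
/* Sketch: plot precomputed mean +/- 1 SE with SGPLOT instead of GPLOT. */
data m2limits;                    /* hypothetical data set name */
  set m2;
  lower = x2nest - se2nest;       /* lower end of the error bar */
  upper = x2nest + se2nest;       /* upper end of the error bar */
run;

proc sgplot data=m2limits;
  /* line through the means, one line per treatment */
  series x=week y=x2nest / group=trt name='means';
  /* markers at the means with precomputed error bars */
  scatter x=week y=x2nest / group=trt
          yerrorlower=lower yerrorupper=upper;
  keylegend 'means';              /* show a single legend for the lines */
  yaxis label='Mean nest +/- 1 SE (precomputed)';
run;
```

Because the limits are computed in the data step, SGPLOT only draws them and does no summarizing of its own, so this version cannot disagree with the PROC MEANS output.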
