I'd like to include before-and-after model fit visuals (proc genmod using negbin or poisson) in my poster, comparing clean data (unique patients) with duplicate data (patients recounted). The idea is to show the impact of deduplication, if any. The global fit statistics show that negbin fits my data better, but I'd like to present this visually. (I'm using 9.4)
The ideal plot would be the one shown in the SUGI reference attached here. Below is just a screenshot of Figure 4 in the reference, which is a nice cumulative probability graph comparing negbin, poisson and observed. But the paper didn't cover how to make it. Does anybody know how to reproduce this kind of plot?
- I read about the ods graphics options for proc genmod and tried the "assess var" option as shown below. It resulted in no output. Any idea why?
ods graphics on;
proc genmod data=mydata;
class exposure(ref="0")/param=ref;
model outcome=exposure/ dist=negbin link=log offset=ln;
assess var=(outcome)/resample=10000
seed=603708000
crpanel;
ods trace on;
run;
Below is the first 10 obs of my data where outcome is counts (number of cancers) , zone is exposure (taking value of 1,2 and 0), grand_total is the ZIP population/denominator.
I appreciate your precious time!!!! thanks in advance.
You can use PROC UNIVARIATE to obtain the empirical cumulative probabilities, as shown in
http://blogs.sas.com/content/iml/2016/09/06/graph-step-function-sas.html
You can then use the STEP or SERIES statement in PROC SGPLOT to graph it, as in this example:
data MyData;
input x @@;
datalines;
7 7 13 9 8 8 9 9 5 6 6 9 5 10 4 5 3 8 4
;
/* http://blogs.sas.com/content/iml/2016/09/06/graph-step-function-sas.html */
ods select cdfplot;
proc univariate data=MyData;
cdfplot x / vscale=proportion
odstitle="Empirical CDF" odstitle2="PROC UNIVARIATE";
ods output cdfplot=outCDF; /* data set contains ECDF values */
run;
title "Empirical CDF";
proc sgplot data=outCDF noautolegend;
step x=ECDFX y=ECDFY; /* variable names created by PROC UNIVARIATE */
xaxis grid label="x" offsetmin=0.05 offsetmax=0.05;
yaxis grid min=0 label="Cumulative Proportion";
run;
To overlay additional curves, take the parameter estimates from PROC GENMOD and use the CDF function in a DATA step to compute the predicted CDFs for the Poisson (and NB) distribution. You can either evaluate the CDF at the data, or you can evaluate the CDF on a grid of points, as shown in
http://blogs.sas.com/content/iml/2012/04/04/fitting-a-poisson-distribution-to-data-in-sas.html
(If you use a uniform grid, you need to merge the data and the CDF values computed on the grid.)
Here is an example where GENMOD gave Lambda=7.1 for the Poisson fit and the Poisson CDF is evaluated at the data:
%let Lambda = 7.1; /* param estimate from GENMOD fit */
data All;
set outCDF;
PoisCDF = cdf("Poisson", ECDFX, &Lambda);
run;
proc sgplot data=All;
step x=ECDFX y=ECDFY / legendlabel="ECDF";
step x=ECDFX y=PoisCDF / legendlabel="Model Fit";
xaxis grid label="x" offsetmin=0.05 offsetmax=0.05;
yaxis grid min=0 label="Cumulative Proportion";
run;
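The same idea can be extended to the negative binomial fit. What follows is only a sketch under stated assumptions, not a tested recipe. The values of Mu and k are made up for illustration and are not estimates from any data in this thread. The mapping assumes the GENMOD parameterization in which the variance is mu + k*mu**2, so that p = 1/(1 + k*mu) and n = 1/k in the call CDF("NEGBINOMIAL", m, p, n). It also assumes a single fitted mean, as in an intercept-only model; with covariates or an offset, each observation has its own mean. I believe the CDF function accepts a non-integer n, but check the documentation for your release.

```sas
/* Hypothetical values: replace with the estimates from your own GENMOD fit */
%let Mu = 7.1;    /* fitted mean, exp(Intercept) from an intercept-only model */
%let k  = 0.05;   /* "Dispersion" estimate reported for DIST=NEGBIN */

data AllNB;
   set outCDF;                         /* ECDF values from PROC UNIVARIATE */
   PoisCDF = cdf("Poisson", ECDFX, &Mu);
   /* CDF("NEGBINOMIAL", m, p, n): m = count, p = 1/(1+k*mu), n = 1/k */
   NBCDF   = cdf("NEGBINOMIAL", ECDFX, 1/(1 + &k*&Mu), 1/&k);
run;

proc sgplot data=AllNB;
   step x=ECDFX y=ECDFY   / legendlabel="Observed";
   step x=ECDFX y=PoisCDF / legendlabel="Poisson";
   step x=ECDFX y=NBCDF   / legendlabel="Negative Binomial";
   xaxis grid label="x" offsetmin=0.05 offsetmax=0.05;
   yaxis grid min=0 label="Cumulative Proportion";
run;
```

With this mapping the negative binomial has mean mu and variance mu + k*mu**2, so as k approaches 0 the NB curve approaches the Poisson curve.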
Are you asking how to get the data for the graph or how to graph the data or both?
If graphing, you can use SGPLOT with STEP and SERIES statement to get the graph shown, but you do need the data first 🙂
No, I don't know how to get the estimates. That should be your first question 🙂
It's pretty straightforward...
You should have estimates with the count and probability, ie the data you want on the charts.
Then use SGPLOT. If you can post, we can run/reproduce your results and then I can help you there, but otherwise you'll have to wait for someone else.
Check out robslink.com for examples though be careful to find SG procedures. You can review the SG documentation for examples.
Sorry, that should be if you can post the data.
If you don't know what the graph is why are you creating it?
@Rick_SAS thanks.
I have adapted your code to my data. The bummer is that cdfplot shows up in red and I get the error log below.
ods select cdfplot;
proc univariate data=post.zipcrude5;
cdfplot rate / vscale=proportion odstitle="Empirical CDF" odstitle2="PROC UNIVARIATE";
ods output cdfplot=post.outCDF;
run;
ERROR 22-322: Syntax error, expecting one of the following: ;, (, ALL, ALPHA, ANNOTATE, CIBASIC,
CIPCTLDF, CIPCTLNORMAL, CIQUANTDF, CIQUANTNORMAL, DATA, DEBUG, EXCLNPWGT, FREQ,
GOUT, LOCCOUNT, MODE, MODES, MU0, NEXTROBS, NEXTRVAL, NOBYPLOT, NOPRINT, NORMAL,
NOTABCONTENTS, NOVARCONTENTS, OUTTABLE, PCTLDEF, PLOT, PLOTSIZE, ROBUSTSCALE,
ROUND, SUMMARYCONTENTS, TRIMMED, VARDEF, WINSORIZED.
ERROR 76-322: Syntax error, statement will be ignored.
4989 odstitle="Empirical CDF" odstitle2="PROC UNIVARIATE";
4990 ods output cdfplot=post.outCDF;
4991 run;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE UNIVARIATE used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
WARNING: Output 'cdfplot' was not created. Make sure that the output object name, label, or path
is spelled correctly. Also, verify that the appropriate procedure options are used to
produce the requested output object. For example, verify that the NOPRINT option is not
used.
Don't worry about the red color. It just means that the syntax highlighter doesn't know that statement. It has been in PROC UNIVARIATE since forever.
The ERROR is on the PROC UNIVARIATE statement, so I suspect you had a copy-paste error that you corrected in the code you pasted. Try it again. If you still get an error, try it on data we all have access to:
ods select cdfplot;
proc univariate data=sashelp.cars;
cdfplot mpg_city / vscale=proportion odstitle="Empirical CDF" odstitle2="PROC UNIVARIATE";
ods output cdfplot=outCDF;
run;
Thanks for previous comments. All worked out.
1. I tried the following, but the NegBCDF column is all missing in the resulting data set.
Lambda1 is the exp(estimate) from "ods output parameterestimates=data;"
ods select cdfplot;
proc univariate data=sashelp.cars;
cdfplot mpg_city / vscale=proportion odstitle="Empirical CDF" odstitle2="PROC UNIVARIATE";
ods output cdfplot=outCDFcar;
run;
%let Lambda1=1.04;
data alldup; set outCDFcar;
NegBCDF=cdf('NEGB',1, ECDFX, &Lambda1);
run;
2. Below is the desired image. I have two separate CDF data sets from proc univariate, with N=1122 for the clean data and N=1133 for the uncleaned data. Any idea how I can overlay them on the same plot as shown below?
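A possible approach to both points, offered as a sketch rather than a tested answer. On point 1, the argument order is CDF('NEGB', m, p, n), where m is the count, p is a probability between 0 and 1, and n is the number of successes. The call above passes ECDFX (MPG values well above 1) as p, which is an invalid probability and would explain the missing values. On point 2, the two ECDF data sets can be stacked with an indicator variable and plotted with GROUP= on the STEP statement. In the code below, the data set names outCDF_dup and outCDF_clean, the subset of sashelp.cars used as a stand-in for the clean data, and the values of Mu and k are all hypothetical. The p and n expressions assume the GENMOD parameterization with variance mu + k*mu**2.

```sas
/* Stand-ins for the two data sets: all cars vs. a subset (illustration only) */
ods select cdfplot;
proc univariate data=sashelp.cars;
   cdfplot mpg_city / vscale=proportion;
   ods output cdfplot=outCDF_dup;
run;

ods select cdfplot;
proc univariate data=sashelp.cars(where=(Origin ne "Asia"));
   cdfplot mpg_city / vscale=proportion;
   ods output cdfplot=outCDF_clean;
run;

/* Point 1: the count goes in the second argument, a probability in the third */
%let Mu = 20;     /* hypothetical fitted mean */
%let k  = 0.07;   /* hypothetical dispersion */
data alldup;
   set outCDF_dup;
   NegBCDF = cdf('NEGB', ECDFX, 1/(1 + &k*&Mu), 1/&k);
run;

/* Point 2: stack the two ECDFs and tag each row with its source */
data BothCDF;
   length Source $ 10;
   set outCDF_clean(in=inClean) outCDF_dup;
   Source = ifc(inClean, "Clean", "Duplicate");
run;

proc sgplot data=BothCDF;
   step x=ECDFX y=ECDFY / group=Source;   /* one step curve per source */
   xaxis grid label="x" offsetmin=0.05 offsetmax=0.05;
   yaxis grid min=0 label="Cumulative Proportion";
run;
```

Model-based curves can be added to the same plot as extra STEP statements, provided their columns are computed in the stacked data set.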