Cruise
Ammonite | Level 13

I'd like to include before-and-after model fit visuals (PROC GENMOD using negbin or Poisson) in my poster, comparing clean data (unique patients) with duplicate data (patients recounted). The idea is to show the impact of deduplication, if any. Global fit statistics show that negbin fits my data better, but I'd like to present this visually. (I'm using SAS 9.4.)

 

The ideal plot would be the one shown in the SUGI reference attached here. Below is a screenshot of Figure 4 from that reference, a nice cumulative probability graph comparing negbin, Poisson, and observed. But the paper doesn't cover how to make it. Does anybody know how to reproduce this kind of plot?

want image.png

- I read about the ODS Graphics options for PROC GENMOD and tried the ASSESS VAR= option as shown below. It produced no output. Any idea why?

 

 

ods graphics on;
proc genmod data=mydata;
   class exposure(ref="0")/param=ref;
   model outcome=exposure/ dist=negbin link=log offset=ln;
   assess var=(outcome)/resample=10000
                        seed=603708000
                        crpanel;
   ods trace on;
   run;

https://support.sas.com/documentation/cdl/en/statug/63962/HTML/default/viewer.htm#statug_genmod_sect...

 

Below are the first 10 obs of my data, where outcome is the count (number of cancers), zone is the exposure (taking values 0, 1, and 2), and grand_total is the ZIP population (the denominator).

 

 

mydata.png

 

I appreciate your precious time!!!! thanks in advance.

1 ACCEPTED SOLUTION
Rick_SAS
SAS Super FREQ

You can use PROC UNIVARIATE to obtain the empirical cumulative probabilities, as shown in 

http://blogs.sas.com/content/iml/2016/09/06/graph-step-function-sas.html

You can then use the STEP or SERIES statement in PROC SGPLOT to graph it, as in this example:

 

data MyData;
input x @@;
datalines;
   7 7 13 9 8 8 9 9 5 6 6 9 5 10 4 5 3 8 4
;

/* http://blogs.sas.com/content/iml/2016/09/06/graph-step-function-sas.html */
ods select cdfplot;
proc univariate data=MyData;
cdfplot x / vscale=proportion 
         odstitle="Empirical CDF" odstitle2="PROC UNIVARIATE";
ods output cdfplot=outCDF;   /* data set contains ECDF values */
run;
 
title "Empirical CDF";
proc sgplot data=outCDF noautolegend;
   step x=ECDFX y=ECDFY;          /* variable names created by PROC UNIVARIATE */
   xaxis grid label="x" offsetmin=0.05 offsetmax=0.05;
   yaxis grid min=0 label="Cumulative Proportion";
run;

To overlay additional curves, take the parameter estimates from PROC GENMOD and use the CDF function in a DATA step to compute the predicted CDFs for the Poisson (and NB) distribution. You can either evaluate the CDF at the data, or you can evaluate the CDF on a grid of points, as shown in 

http://blogs.sas.com/content/iml/2012/04/04/fitting-a-poisson-distribution-to-data-in-sas.html

(If you use a uniform grid, you need to merge the ECDF data and the model CDF into one data set before plotting.)

Here is an example where GENMOD gave Lambda=7.1 for the Poisson fit and the Poisson CDF is evaluated at the data:

 

%let Lambda = 7.1;  /* param estimate from GENMOD fit */
data All;
set outCDF;
PoisCDF = cdf("Poisson", ECDFX, &Lambda);
run;
proc sgplot data=All;
   step x=ECDFX y=ECDFY / legendlabel="ECDF";
   step x=ECDFX y=PoisCDF / legendlabel="Model Fit";
   xaxis grid label="x" offsetmin=0.05 offsetmax=0.05;
   yaxis grid min=0 label="Cumulative Proportion";
run;
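
The same idea might extend to the negative binomial curve. The sketch below assumes an intercept-only fit, with Mu = exp(Intercept) and K = the dispersion estimate reported by GENMOD (variance = Mu + K*Mu**2). Both macro values are hypothetical placeholders, not results from a real fit, and the data set name AllNB is made up for illustration. SAS's CDF("NEGBINOMIAL", m, p, n) takes a success probability p between 0 and 1 and a number of successes n > 0, so the GENMOD estimates are converted with p = 1/(1 + K*Mu) and n = 1/K.

```sas
/* Hypothetical values; replace them with the estimates from your own GENMOD fit */
%let Mu = 7.1;    /* assumed: exp(Intercept) from the fit            */
%let K  = 0.05;   /* assumed: "Dispersion" estimate from DIST=NEGBIN */

data AllNB;       /* AllNB is an illustrative data set name */
   set outCDF;
   if not missing(ECDFX) then do;
      PoisCDF = cdf("Poisson", ECDFX, &Mu);
      /* CDF("NEGBINOMIAL", m, p, n): m = count, 0 < p <= 1, n > 0 */
      NegBCDF = cdf("NegBinomial", ECDFX, 1/(1 + &K*&Mu), 1/&K);
   end;
run;

proc sgplot data=AllNB;
   step x=ECDFX y=ECDFY   / legendlabel="ECDF";
   step x=ECDFX y=PoisCDF / legendlabel="Poisson";
   step x=ECDFX y=NegBCDF / legendlabel="Negative binomial";
   xaxis grid label="x" offsetmin=0.05 offsetmax=0.05;
   yaxis grid min=0 label="Cumulative Proportion";
run;
```

With this conversion the negative binomial mean is n(1-p)/p = Mu, which matches the GENMOD parameterization. For a model with covariates or an offset, the mean differs by observation, so a single Mu would only be a rough summary.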

 

 

 


24 REPLIES
Reeza
Super User

Are you asking how to get the data for the graph or how to graph the data or both?

 

If graphing, you can use SGPLOT with STEP and SERIES statement to get the graph shown, but you do need the data first 🙂

Cruise
Ammonite | Level 13
I have my data ready. Both of them. One clean and the other has duplicates. Could you please show me a syntax example?
Cruise
Ammonite | Level 13
I think I just got what you're asking. Are you saying that I have to prep the data first for sgplot? If so, No. No data prepped for sgplot yet. Do you know how?
Reeza
Super User

No, I don't know how to get the estimates. That should be your first question 🙂

Cruise
Ammonite | Level 13
I have my estimates calculated for both negbin and Poisson. I'm looking for hints on how to turn my estimates into a data set to feed PROC SGPLOT. Please don't hesitate to share any links to resources with me.
Reeza
Super User

It's pretty straightforward...

 

You should have estimates with the count and probability, ie the data you want on the charts. 

Then use SGPLOT. If you can post so that we can run/reproduce your results, then I can help you there, but otherwise you'll have to wait for someone else.

 

Check out robslink.com for examples though be careful to find SG procedures. You can review the SG documentation for examples. 

 

Cruise
Ammonite | Level 13
"If you can post?" What are you referring to? My estimates? I can post them for help. I am baffled about what the "cum probability" on the graph is based on.
Reeza
Super User

Sorry, that should be if you can post the data. 

 

If you don't know what the graph is why are you creating it?

Cruise
Ammonite | Level 13
I can post the data. Please see the attachment. Why do I want to create it? I was looking for the most appropriate SAS visual outputs to illustrate model fit using PROC GENMOD with the negbin/Poisson distribution, and I stumbled across the reference attached here. I understand the concept of the graph, but I'm asking for help from you guys as to how the cumulative probability on the plot was derived.
Cruise
Ammonite | Level 13
@Reeza, see attached data. I will check in after a few hours. I am in the Eastern time zone.
Cruise
Ammonite | Level 13

@Rick_SAS thanks.

I have adapted your code to my data. The bummer is that cdfplot shows up in red and I get the error log below.

ods select cdfplot;
proc univariate data=post.zipcrude5;
cdfplot rate / vscale=proportion odstitle="Empirical CDF" odstitle2="PROC UNIVARIATE";
ods output cdfplot=post.outCDF; 
run;

ERROR 22-322: Syntax error, expecting one of the following: ;, (, ALL, ALPHA, ANNOTATE, CIBASIC,
              CIPCTLDF, CIPCTLNORMAL, CIQUANTDF, CIQUANTNORMAL, DATA, DEBUG, EXCLNPWGT, FREQ,
              GOUT, LOCCOUNT, MODE, MODES, MU0, NEXTROBS, NEXTRVAL, NOBYPLOT, NOPRINT, NORMAL,
              NOTABCONTENTS, NOVARCONTENTS, OUTTABLE, PCTLDEF, PLOT, PLOTSIZE, ROBUSTSCALE,
              ROUND, SUMMARYCONTENTS, TRIMMED, VARDEF, WINSORIZED.
ERROR 76-322: Syntax error, statement will be ignored.
4989           odstitle="Empirical CDF" odstitle2="PROC UNIVARIATE";
4990  ods output cdfplot=post.outCDF;
4991  run;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE UNIVARIATE used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

WARNING: Output 'cdfplot' was not created.  Make sure that the output object name, label, or path
         is spelled correctly.  Also, verify that the appropriate procedure options are used to
         produce the requested output object.  For example, verify that the NOPRINT option is not
         used.

Rick_SAS
SAS Super FREQ

Don't worry about the red color. It just means that the syntax highlighter doesn't know that statement. It has been in PROC UNIVARIATE since forever. 

 

The ERROR is on the PROC UNIVARIATE statement, so I suspect you had a copy-paste error that you corrected in the code you pasted. Try it again.  If you still get an error, try it on data we all have access to:

 

ods select cdfplot;
proc univariate data=sashelp.cars;
cdfplot mpg_city / vscale=proportion odstitle="Empirical CDF" odstitle2="PROC UNIVARIATE";
ods output cdfplot=outCDF; 
run;
Cruise
Ammonite | Level 13

@Rick_SAS,

Thanks for previous comments. All worked out.

 

1. I tried the following, but the NegBCDF column is all missing in outCDFcar.

Lambda1 is the exp(estimate) from "ods output parameterestimates=data;"

 

 

ods select cdfplot;
proc univariate data=sashelp.cars;
cdfplot mpg_city / vscale=proportion odstitle="Empirical CDF" odstitle2="PROC UNIVARIATE";
ods output cdfplot=outCDFcar; 
run;
%let Lambda1=1.04;
data alldup; set outCDFcar;
NegBCDF=cdf('NEGB',1, ECDFX, &Lambda1);
run;

2. Below is the desired image. I have two separate cdf data with N=1122 for clean and N=1133 for uncleaned data from proc univariate. Any idea how I can overlay them on the same plot as shown below? 

 

 

decired plot.png
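
One possible way to get the overlay described in item 2, sketched under assumptions: stack the two CDFPLOT output data sets with a grouping variable and let the STEP statement draw one curve per group. The data set names outCDFclean and outCDFdup, the stacked data set bothCDF, and the variable Source are hypothetical. They stand for the ODS OUTPUT CDFPLOT= data sets from two separate PROC UNIVARIATE runs on the clean and the duplicated data.

```sas
/* Hypothetical inputs: outCDFclean and outCDFdup come from two
   PROC UNIVARIATE runs that each used ODS OUTPUT CDFPLOT=...    */
data bothCDF;
   length Source $ 12;
   set outCDFclean(in=inClean) outCDFdup;
   /* tag each row with the data set it came from */
   if inClean then Source = "Clean";
   else Source = "Duplicates";
run;

title "Empirical CDF: clean vs. duplicated data";
proc sgplot data=bothCDF;
   step x=ECDFX y=ECDFY / group=Source;   /* one step curve per Source value */
   xaxis grid label="x";
   yaxis grid min=0 label="Cumulative Proportion";
run;
```

On item 1, a likely cause of the missing values is the argument order. For the negative binomial, the CDF function expects the count first, then a probability between 0 and 1, then the number of successes, so passing ECDFX in the probability slot yields missing results.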
