- proc genmod graphics for count data model fit asse...

04-24-2017 09:28 PM - edited 04-25-2017 08:59 AM

I'd like to include before-and-after model fit visuals (proc genmod using negbin or poisson) in my poster, comparing clean data (unique patients) with duplicate data (patients recounted). The idea is to show the impact of deduplication, if any. Global fit statistics show that negbin fits my data better, but I'd like to show it visually. (I'm using SAS 9.4.)

The ideal plot would be the one shown in the SUGI reference attached here. Below is a screenshot of Figure 4 in the reference, a nice cumulative probability graph comparing negbin, poisson and observed. But the paper didn't cover how to create it. Does anybody know how to reproduce this kind of plot?

- I read about the ODS Graphics options for proc genmod and tried the "assess var" option as shown below. It produced no output. Any idea why?

```
ods graphics on;
proc genmod data=mydata;
class exposure(ref="0")/param=ref;
model outcome=exposure/ dist=negbin link=log offset=ln;
assess var=(outcome)/resample=10000
seed=603708000
crpanel;
ods trace on;
run;
```

Below are the first 10 obs of my data, where outcome is the count (number of cancers), zone is the exposure (taking values 0, 1 and 2), and grand_total is the ZIP population (the denominator).

I appreciate your time. Thanks in advance.

Accepted Solutions

Solution

04-25-2017 03:10 PM

Posted in reply to Cruise

04-25-2017 08:06 AM

You can use PROC UNIVARIATE to obtain the empirical cumulative probabilities, as shown in

http://blogs.sas.com/content/iml/2016/09/06/graph-step-function-sas.html

You can then use the STEP or SERIES statement in PROC SGPLOT to graph it, as in this example:

```
data MyData;
   input x @@;
   datalines;
7 7 13 9 8 8 9 9 5 6 6 9 5 10 4 5 3 8 4
;
/* http://blogs.sas.com/content/iml/2016/09/06/graph-step-function-sas.html */
ods select cdfplot;
proc univariate data=MyData;
   cdfplot x / vscale=proportion
               odstitle="Empirical CDF" odstitle2="PROC UNIVARIATE";
   ods output cdfplot=outCDF;   /* data set contains ECDF values */
run;
title "Empirical CDF";
proc sgplot data=outCDF noautolegend;
   step x=ECDFX y=ECDFY;   /* variable names created by PROC UNIVARIATE */
   xaxis grid label="x" offsetmin=0.05 offsetmax=0.05;
   yaxis grid min=0 label="Cumulative Proportion";
run;
```

To overlay additional curves, take the parameter estimates from PROC GENMOD and use the CDF function in a DATA step to compute the predicted CDFs for the Poisson (and NB) distribution. You can either evaluate the CDF at the data, or you can evaluate the CDF on a grid of points, as shown in

http://blogs.sas.com/content/iml/2012/04/04/fitting-a-poisson-distribution-to-data-in-sas.html

(If you evaluate the CDF on a uniform grid, you need to merge the data and the CDF values before plotting.)
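As a rough sketch of the uniform-grid variant: the code below assumes the `outCDF` data set from the PROC UNIVARIATE step above already exists, uses a placeholder value for Lambda, and picks a grid of 0 to 15 by hand to cover the toy data. The names `Grid`, `t`, and `AllGrid` are made up for this example. Here the data and the grid are combined by simple concatenation, so each STEP statement only draws the rows where its own variables are nonmissing.

```sas
/* Sketch only: assumes outCDF (with ECDFX, ECDFY) was created by the
   PROC UNIVARIATE step above. Grid, t, and AllGrid are made-up names. */
%let Lambda = 7.1;             /* placeholder; substitute your GENMOD estimate */

data Grid;
   do t = 0 to 15;             /* hand-picked grid covering the toy data range */
      PoisCDF = cdf("Poisson", t, &Lambda);
      output;
   end;
run;

/* Concatenate: ECDF rows have missing t and PoisCDF,
   grid rows have missing ECDFX and ECDFY */
data AllGrid;
   set outCDF Grid;
run;

title "ECDF with Poisson CDF Evaluated on a Grid";
proc sgplot data=AllGrid;
   step x=ECDFX y=ECDFY / legendlabel="ECDF";
   step x=t y=PoisCDF / legendlabel="Poisson CDF";
   xaxis grid label="x";
   yaxis grid min=0 label="Cumulative Proportion";
run;
title;
```

A grid is mostly useful when the data have gaps, since the model curve is then drawn at every integer rather than only at the observed values.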

Here is an example where GENMOD gave Lambda=7.1 for the Poisson fit and the Poisson CDF is evaluated at the data:

```
%let Lambda = 7.1;   /* param estimate from GENMOD fit */
data All;
   set outCDF;
   PoisCDF = cdf("Poisson", ECDFX, &Lambda);
run;
proc sgplot data=All;
   step x=ECDFX y=ECDFY / legendlabel="ECDF";
   step x=ECDFX y=PoisCDF / legendlabel="Model Fit";
   xaxis grid label="x" offsetmin=0.05 offsetmax=0.05;
   yaxis grid min=0 label="Cumulative Proportion";
run;
```
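The same pattern can be sketched for the negative binomial curve. This assumes `outCDF` from the PROC UNIVARIATE step above, and the values of `Mu` and `k` are placeholders, not estimates from any real fit. GENMOD's negative binomial has variance mu + k*mu^2, where k is the Dispersion parameter, while the CDF function expects `cdf("NEGB", m, p, n)` with m the count, p a probability, and n the number of successes. The conversion p = 1/(1 + k*mu), n = 1/k matches the mean and variance of the two parameterizations. The names `AllNB`, `p`, and `n` are made up for this example.

```sas
/* Sketch only: assumes outCDF (with ECDFX, ECDFY) already exists. */
%let Mu = 7.1;   /* placeholder mean, e.g. exp(Intercept) from a log-link fit */
%let k  = 0.2;   /* placeholder for the GENMOD Dispersion estimate */

data AllNB;
   set outCDF;
   p = 1 / (1 + &k * &Mu);   /* probability of success, always in (0,1) */
   n = 1 / &k;               /* number of successes */
   if not missing(ECDFX) then do;
      PoisCDF = cdf("Poisson", ECDFX, &Mu);
      /* argument order: count first, then p, then n */
      NegBCDF = cdf("NEGB", floor(ECDFX), p, n);
   end;
run;

title "ECDF with Poisson and Negative Binomial CDFs";
proc sgplot data=AllNB;
   step x=ECDFX y=ECDFY  / legendlabel="ECDF";
   step x=ECDFX y=PoisCDF / legendlabel="Poisson";
   step x=ECDFX y=NegBCDF / legendlabel="Negative Binomial";
   xaxis grid label="x" offsetmin=0.05 offsetmax=0.05;
   yaxis grid min=0 label="Cumulative Proportion";
run;
title;
```

The argument order matters: if the data values land in the p slot, p falls outside [0,1], the function cannot be evaluated, and the result is missing. If 1/k is not an integer, it is worth checking that your SAS release accepts a non-integer n.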

All Replies

Posted in reply to Cruise

04-24-2017 09:40 PM

Are you asking how to get the data for the graph or how to graph the data or both?

If graphing, you can use SGPLOT with STEP and SERIES statements to get the graph shown, but you do need the data first.

Posted in reply to Reeza

04-24-2017 09:58 PM

I have my data ready. Both of them. One clean and the other has duplicates. Could you please show me a syntax example?

Posted in reply to Reeza

04-24-2017 10:00 PM

I think I just got what you're asking. Are you saying that I have to prep the data for sgplot first? If so, no, I haven't prepped any data for sgplot yet. Do you know how?

Posted in reply to Cruise

04-24-2017 10:39 PM

No, I don't know how to get the estimates. That should be your first question.

Posted in reply to Reeza

04-24-2017 10:57 PM

I have my estimates calculated for both negbin and poisson. I'm looking for hints on how to turn my estimates into a data set to feed proc sgplot. Please don't hesitate to share any links to resources with me.

Posted in reply to Cruise

04-24-2017 11:01 PM

It's pretty straightforward...

You should have estimates with the count and probability, ie the data you want on the charts.

Then use SGPLOT. If you can post, so we can run/reproduce your results, then I can help you there; otherwise you'll have to wait for someone else.

Check out robslink.com for examples, though be careful to look for the ones that use the SG procedures. You can also review the SG documentation for examples.

Posted in reply to Reeza

04-24-2017 11:08 PM

"if you can post?" What are you referring to? My estimates? I can post them for help. I am baffled about what the "cum probability" on the graph is based on.

Posted in reply to Cruise

04-24-2017 11:39 PM

Sorry, that should be if you can post the data.

If you don't know what the graph is, why are you creating it?

Posted in reply to Reeza

04-25-2017 12:23 AM

I can post the data. Please see the attachment. Why do I want to create it? I was looking for the most appropriate SAS visual output to illustrate model fit using proc genmod with a negbin/poisson distribution, and I stumbled across the reference attached here. I understand the concept of the graph, but I'm asking for help with how the cumulative probability on the plot was computed.

Posted in reply to Reeza

04-25-2017 12:29 AM

@Reeza, see the attached data. I will check in after a few hours. I am in the Eastern time zone.

Posted in reply to Rick_SAS

04-25-2017 09:20 AM

@Rick_SAS thanks.

I have adapted your code to my data. The bummer is that cdfplot shows up in red and I get the error log below.

```
ods select cdfplot;
proc univariate data=post.zipcrude5;
cdfplot rate / vscale=proportion odstitle="Empirical CDF" odstitle2="PROC UNIVARIATE";
ods output cdfplot=post.outCDF;
run;
```

ERROR 22-322: Syntax error, expecting one of the following: ;, (, ALL, ALPHA, ANNOTATE, CIBASIC, CIPCTLDF, CIPCTLNORMAL, CIQUANTDF, CIQUANTNORMAL, DATA, DEBUG, EXCLNPWGT, FREQ, GOUT, LOCCOUNT, MODE, MODES, MU0, NEXTROBS, NEXTRVAL, NOBYPLOT, NOPRINT, NORMAL, NOTABCONTENTS, NOVARCONTENTS, OUTTABLE, PCTLDEF, PLOT, PLOTSIZE, ROBUSTSCALE, ROUND, SUMMARYCONTENTS, TRIMMED, VARDEF, WINSORIZED.

ERROR 76-322: Syntax error, statement will be ignored.

4989 odstitle="Empirical CDF" odstitle2="PROC UNIVARIATE";

4990 ods output cdfplot=post.outCDF;

4991 run;

NOTE: The SAS System stopped processing this step because of errors.

NOTE: PROCEDURE UNIVARIATE used (Total process time):

real time 0.00 seconds

cpu time 0.00 seconds

WARNING: Output 'cdfplot' was not created. Make sure that the output object name, label, or path is spelled correctly. Also, verify that the appropriate procedure options are used to produce the requested output object. For example, verify that the NOPRINT option is not used.

Posted in reply to Cruise

04-25-2017 09:32 AM

Don't worry about the red color. It just means that the syntax highlighter doesn't know that statement. It has been in PROC UNIVARIATE since forever.

The ERROR is on the PROC UNIVARIATE statement, so I suspect you had a copy-paste error that you corrected in the code you pasted. Try it again. If you still get an error, try it on data we all have access to:

```
ods select cdfplot;
proc univariate data=sashelp.cars;
cdfplot mpg_city / vscale=proportion odstitle="Empirical CDF" odstitle2="PROC UNIVARIATE";
ods output cdfplot=outCDF;
run;
```

Posted in reply to Rick_SAS

04-25-2017 12:41 PM

Thanks for previous comments. All worked out.

1. I tried the following, but the NegBCDF column is all missing in the output data set.

Lambda1 is the exp(estimate) from *"ods output parameterestimates=data;"*

```
ods select cdfplot;
proc univariate data=sashelp.cars;
cdfplot mpg_city / vscale=proportion odstitle="Empirical CDF" odstitle2="PROC UNIVARIATE";
ods output cdfplot=outCDFcar;
run;
%let Lambda1=1.04;
data alldup; set outCDFcar;
NegBCDF=cdf('NEGB',1, ECDFX, &Lambda1);
run;
```

2. Below is the desired image. I have two separate CDF data sets from proc univariate, with N=1122 for the clean data and N=1133 for the uncleaned data. Any idea how I can overlay them on the same plot as shown below?
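For item 2, one possible approach is to stack the two ECDF data sets, tag each row with its source, and let the GROUP= option of the STEP statement draw one curve per source. The sketch below is only an illustration: `carsClean` and `carsDup` are hypothetical stand-ins built from sashelp.cars, the duplication pattern is made up, and `cdfClean`, `cdfDup`, `cdfBoth`, and `Source` are invented names. It relies on the ECDFX and ECDFY variables that PROC UNIVARIATE writes, as in the earlier replies.

```sas
/* Hypothetical stand-ins for the clean and uncleaned data sets */
data carsClean;
   set sashelp.cars;
run;
data carsDup;
   set sashelp.cars;
   output;
   if mod(_n_, 5) = 0 then output;   /* made-up duplication pattern */
run;

ods graphics on;
ods select cdfplot;
proc univariate data=carsClean;
   cdfplot mpg_city / vscale=proportion;
   ods output cdfplot=cdfClean;      /* ECDF of the clean data */
run;
ods select cdfplot;
proc univariate data=carsDup;
   cdfplot mpg_city / vscale=proportion;
   ods output cdfplot=cdfDup;        /* ECDF of the uncleaned data */
run;

/* Stack the two ECDF data sets and label each row with its source */
data cdfBoth;
   length Source $12;
   set cdfClean(in=inClean) cdfDup;
   Source = ifc(inClean, "Clean", "Duplicates");
run;

title "Empirical CDFs: Clean vs. Uncleaned Data";
proc sgplot data=cdfBoth;
   step x=ECDFX y=ECDFY / group=Source;
   xaxis grid label="x";
   yaxis grid min=0 label="Cumulative Proportion";
run;
title;
```

Because the curves are grouped rather than drawn by separate statements, the legend is generated from the values of `Source`, and a third data set can be added by stacking it the same way.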