Hello again to the SAS Community,
I've been trying to run some model checking for my logistic regression question and have managed to get the Deviance and Pearson goodness-of-fit statistics table to show up, but it has 0's and .'s for the values. I've tried everything I know how to do, but can't seem to get this to run correctly.
Thank you in advance for your help!
************
alcgp=alcohol consumption (higher numbers = greater consumption)
casestatus (case=1; control =0)
count=number of subjects
proc import out=work.problem
Datafile= "...\alcohol1.xls"
dbms = excel replace ;
sheet="alcohol";
run;
proc logistic data=work.problem;
freq count;
class alcgp (ref='0') / param=ref ;
model casestatus (event='1') = alcgp / scale=none aggregate;
run;
Many users here don't want to download Excel files because of the virus risk, and others have such files blocked by security software. Also, if you give us Excel, we have to create a SAS data set ourselves, and because Excel places no constraints on what a cell may contain, the data set we end up with may not have the same variable types (numeric or character), or even the same values, as yours.
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
You might have to check the raw data after import. You may have missing values because an Excel column mixes character and numeric values, or because of some other translation issue. You may get better results by saving the XLS as CSV and importing that with a large GUESSINGROWS value; use the import wizard or look up the syntax for that option. By default SAS examines only the first 20 rows of the file when guessing variable type and content, so it may assign variable types incorrectly if the top of the file differs notably from the rows further down.
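The CSV route could look something like the sketch below. It assumes the sheet has been saved as alcohol1.csv in the same folder as the XLS; that file name is hypothetical, and the "..." in the path is a placeholder for your own folder. If your release does not accept GUESSINGROWS=MAX, use a large number instead.

```sas
/* Hypothetical file: the Excel sheet saved as alcohol1.csv. */
/* Replace "..." with the folder that holds the file.        */
proc import out=work.problem
    datafile="...\alcohol1.csv"
    dbms=csv replace;
  getnames=yes;        /* first row holds the column names */
  guessingrows=max;    /* scan every row before choosing variable types */
run;

/* Check that alcgp, casestatus, and count were read as numeric */
proc contents data=work.problem;
run;

/* Look at the first few rows for unexpected missing values */
proc print data=work.problem(obs=10);
run;
```

If any of the three variables shows up as character in the PROC CONTENTS output, the source column probably contains a non-numeric entry somewhere.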
My apologies.
data problem;
input alcgp casestatus count;
datalines;
0 0 40
0 0 10
0 0 6
0 0 5
1 0 27
1 0 7
1 0 4
1 0 7
2 0 2
2 0 1
2 0 2
3 0 1
3 1 1
3 0 1
3 0 2
0 0 60
0 1 3
0 0 13
0 0 7
0 0 8
1 0 35
1 1 3
1 0 20
1 1 1
1 0 13
1 0 8
2 0 11
2 0 6
2 0 2
2 0 1
3 1 2
3 0 1
3 0 3
3 1 2
3 0 4
0 1 1
0 0 45
0 0 18
0 0 10
0 0 4
1 1 6
1 0 32
1 1 4
1 0 17
1 1 5
1 0 10
1 1 5
1 0 2
2 1 3
2 0 13
2 1 6
2 0 8
2 1 1
2 0 4
2 1 2
2 0 2
3 1 4
3 1 3
3 0 3
3 1 2
3 0 3
3 1 4
0 1 2
0 0 47
0 1 6
0 0 19
0 1 3
0 0 9
0 1 4
0 0 2
1 1 9
1 0 31
1 1 6
1 0 15
1 1 4
1 0 13
1 1 3
1 0 3
2 1 9
2 0 9
2 1 8
2 0 7
2 1 3
2 0 3
2 1 4
3 1 5
3 0 5
3 1 6
3 0 1
3 1 2
3 0 1
3 1 5
3 0 1
0 1 5
0 0 43
0 1 4
0 0 10
0 1 2
0 0 5
0 0 2
1 1 17
1 0 17
1 1 3
1 0 7
1 1 5
1 0 4
2 1 6
2 0 7
2 1 4
2 0 8
2 1 2
2 0 1
2 1 1
3 1 3
3 0 1
3 1 1
3 0 1
3 1 1
3 1 1
0 1 1
0 0 17
0 1 2
0 0 4
0 1 1
0 0 2
1 1 2
1 0 3
1 1 1
1 0 2
1 0 3
1 1 1
2 1 1
2 1 1
3 1 2
3 1 1
;
proc logistic data=work.problem;
freq count;
class alcgp (ref='0') / param=ref ;
model casestatus (event='1') = alcgp / scale=none aggregate;
run;
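You have 4 unique profiles in your data (the four levels of alcgp), and with alcgp as a CLASS variable you are estimating 4 parameters (the intercept plus three dummy variables). The goodness-of-fit degrees of freedom are calculated as #Unique Profiles - #Estimated Parameters, so you get 4 - 4 = 0, which is why the table shows 0's and missing values. To resolve the problem, you could drop alcgp from the CLASS statement so that it is treated as continuous; that model estimates only 2 parameters and leaves 2 degrees of freedom.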
data problem;
input alcgp casestatus count @@;
datalines;
0 0 40 0 0 10 0 0 6 0 0 5 1 0 27
1 0 7 1 0 4 1 0 7 2 0 2 2 0 1
2 0 2 3 0 1 3 1 1 3 0 1 3 0 2
0 0 60 0 1 3 0 0 13 0 0 7 0 0 8
1 0 35 1 1 3 1 0 20 1 1 1 1 0 13
1 0 8 2 0 11 2 0 6 2 0 2 2 0 1
3 1 2 3 0 1 3 0 3 3 1 2 3 0 4
0 1 1 0 0 45 0 0 18 0 0 10 0 0 4
1 1 6 1 0 32 1 1 4 1 0 17 1 1 5
1 0 10 1 1 5 1 0 2 2 1 3 2 0 13
2 1 6 2 0 8 2 1 1 2 0 4 2 1 2
2 0 2 3 1 4 3 1 3 3 0 3 3 1 2 3 0 3
3 1 4 0 1 2 0 0 47 0 1 6 0 0 19
0 1 3 0 0 9 0 1 4 0 0 2 1 1 9
1 0 31 1 1 6 1 0 15 1 1 4 1 0 13
1 1 3 1 0 3 2 1 9 2 0 9 2 1 8
2 0 7 2 1 3 2 0 3 2 1 4 3 1 5
3 0 5 3 1 6 3 0 1 3 1 2 3 0 1
3 1 5 3 0 1 0 1 5 0 0 43 0 1 4
0 0 10 0 1 2 0 0 5 0 0 2 1 1 17
1 0 17 1 1 3 1 0 7 1 1 5 1 0 4
2 1 6 2 0 7 2 1 4 2 0 8 2 1 2
2 0 1 2 1 1 3 1 3 3 0 1 3 1 1
3 0 1 3 1 1 3 1 1 0 1 1 0 0 17
0 1 2 0 0 4 0 1 1 0 0 2 1 1 2
1 0 3 1 1 1 1 0 2 1 0 3 1 1 1
2 1 1 2 1 1 3 1 2 3 1 1
;
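/* alcgp as a CLASS variable: 4 profiles - 4 parameters = 0 DF */
proc logistic data=problem;
freq count;
class alcgp(ref='0') / param=ref ;
model casestatus(event='1') = alcgp / scale=none aggregate;
run;
/* alcgp as a continuous predictor: 4 profiles - 2 parameters = 2 DF */
proc logistic data=problem;
freq count;
model casestatus(event='1') = alcgp / scale=none aggregate;
run;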
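To see where the zero degrees of freedom come from, you can count the profiles yourself. This is only a minimal sketch, assuming the data set problem created above; the column alias n_profiles is made up for illustration.

```sas
/* alcgp is the only predictor, so each distinct alcgp value */
/* is one profile (subpopulation) under AGGREGATE.           */
proc sql;
  select count(distinct alcgp) as n_profiles
  from problem;
quit;

/* Cases and controls within each profile, weighted by count */
proc freq data=problem;
  weight count;
  tables alcgp*casestatus / norow nocol nopercent;
run;
```

With the data above, n_profiles is 4. The CLASS model uses all 4 of those degrees of freedom and reproduces the four observed proportions exactly, so there is nothing left for the Deviance and Pearson tests to measure. The continuous model uses only 2, leaving 2 degrees of freedom for the tests.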
Thank you everyone for your help! Can't believe it was that simple of a fix!