catisgar
Calcite | Level 5

I've got a dataset with variables TYPE and FREQ. I'm trying to test in SAS whether FREQ follows a uniform distribution within each level of TYPE.

TYPE FREQ
A 0
A 2
A 0
A 4
A 5
A 5
B 10
B 0
B 4
B 5
B 3
B 1

How should the SAS code work? I know I need to use PROC UNIVARIATE with a Beta(1,1) distribution, but I can't get it to work when I test for a uniform distribution by the TYPE variable. Also, I want to use range(freq)+2 as my sigma parameter and min(freq)-1 as my theta parameter. How do I define these in my SAS code?

 

PROC SQL NOPRINT; SELECT RANGE(freq)+2 INTO :SIGMA FROM dataset GROUP BY type; QUIT;
PROC SQL NOPRINT; SELECT MIN(freq)-1 INTO :THETA FROM dataset GROUP BY type; QUIT;

PROC UNIVARIATE DATA=dataset;
BY type;
VAR freq;
HISTOGRAM freq / NOPLOT BETA(W=1 L=1 COLOR=CX4B0082 SIGMA=&SIGMA THETA=&THETA ALPHA=1 BETA=1);
RUN;

Appreciate any help please thanks!

3 REPLIES
PGStats
Opal | Level 21

Didn't you get the message

ERROR: The largest value of FREQ is greater than or equal to the upper threshold (THETA + SIGMA) for the beta fit.

when you tried to fit the Beta distribution?


What is the hypothesis that you want to test here?

PG
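
A note on the error quoted above: when a grouped query feeds SELECT ... INTO :SIGMA, only the first row of the result is stored, so both BY groups in the question are fitted with the thresholds computed for the first TYPE. The following is a minimal sketch, not a fix, that simply lists the thresholds per group so they can be compared with the largest FREQ in each group. It assumes the dataset and variable names from the question; the column alias upper_threshold is made up for illustration.

```sas
/* Sketch only: list the thresholds the question describes, one row per TYPE.
   DATASET, TYPE and FREQ are the names used in the question;
   UPPER_THRESHOLD is an illustrative alias for THETA + SIGMA. */
proc sql;
  select type,
         min(freq) - 1                       as theta,
         range(freq) + 2                     as sigma,
         (min(freq) - 1) + (range(freq) + 2) as upper_threshold
  from dataset
  group by type;
quit;
```

For the sample data this gives an upper threshold of 6 for type A and 11 for type B. Because the macro variables hold type A's values, type B's largest value (10) exceeds the threshold of 6, which is consistent with the error message.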
Ksharp
Super User

I would use a chi-square test for equal proportions:


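/* Sample data from the question */
data have;
  input type $ freq;
  cards;
A 0
A 2
A 0
A 4
A 5
A 5
B 10
B 0
B 4
B 5
B 3
B 1
;
run;

/* Show only the one-way chi-square table */
ods select OneWayChiSq;

/* Chi-square test for equal proportions of the observed FREQ values, */
/* run separately for each TYPE (data must be sorted by TYPE)         */
proc freq data=have;
  by type;
  table freq / chisq;
  exact chisq;   /* exact p-value, since the sample is small */
run;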
Rick_SAS
SAS Super FREQ

To extend what PGStats said, the test you perform depends on what you know (and want to know) about the data.

1. Where did the data come from? Is there reason to think that the data-generating mechanism is uniform?

2. Do you know the upper and lower bounds of the data? For example, is the data always in the interval [0,10]?

3. Your example uses integer values. The Beta distribution is a continuous distribution. You can use the Beta distribution to test whether continuous data fits a uniform distribution. Unless your real data has values like 2.72 and 6.135, you won't get a good fit with the Beta distribution.

4. Based on your sample data, I think Ksharp has the right idea. If you have lots of integer-valued data, you can test for a discrete uniform distribution on the set {0,1,...,10} by using a chi-square test. If you have only a small amount of data (like your example), you probably want to bin groups together. For example, run a chi-square test for frequencies in the set {0-1, 2-3, 4-5,...}.
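
The binning idea in point 4 could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the HAVE dataset from Ksharp's reply, a made-up format name (pairfmt), and an assumed support of {0,...,11} so that all six bins have the same width and "equal proportions" is the right null hypothesis. With a support of {0,...,10} the last bin would be narrower, and the expected proportions would have to be supplied explicitly instead. PROC SUMMARY with COMPLETETYPES and PRELOADFMT is used so that bins with no observations still appear with a count of zero; a one-way PROC FREQ table on the raw data only contains levels that actually occur.

```sas
/* Hypothetical format: six equal-width bins, ASSUMING support {0,...,11} */
proc format;
  value pairfmt
    0-1   = '00-01'
    2-3   = '02-03'
    4-5   = '04-05'
    6-7   = '06-07'
    8-9   = '08-09'
    10-11 = '10-11';
run;

/* Count observations per TYPE and bin; COMPLETETYPES + PRELOADFMT keep  */
/* bins that have no observations, with _FREQ_ = 0                       */
proc summary data=have nway completetypes;
  class type;
  class freq / preloadfmt;
  format freq pairfmt.;
  output out=counts;
run;

/* Chi-square test for equal proportions across the six bins, by TYPE.   */
/* ZEROS keeps the zero-count bins in the one-way table and in the test. */
proc freq data=counts;
  by type;
  format freq pairfmt.;
  weight _freq_ / zeros;
  tables freq / chisq;
  exact chisq;   /* exact p-value, since expected counts are small */
run;
```

With only six observations per group, the expected count per bin is 1, so the asymptotic chi-square p-value is unreliable and the exact p-value is the one to read.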
