Which tests and procedure should I use for this rate comparison? Does ...

SAS-questioner · Posted 08-14-2023 02:29 PM

I want to compare two rates, but it's a little bit complicated. For example, there are 50 stores in county X; 30 of the 50 stores have product A, and 20 of the 50 have product B. I then find out how many people (with a certain feature) in county X need product A (N) and how many need product B (n). I want to compare 30/N vs. 20/n. In other words, the rate for product A is defined as the number of stores that have product A divided by the number of people who need product A, and likewise for product B. There are multiple counties, not just X: I select some counties first, and then select some stores within each county. In previous threads, some people said that to compare rates I should use PROC GENMOD with a Poisson or negative binomial distribution. Is PROC GENMOD (Poisson or negative binomial) appropriate in my case? If so, does it model random effects? And do I need to model a random effect in my case? Thank you!

ballardw · Posted 08-14-2023 04:11 PM

Example data to go along with the word salad might help.

How do you "I find out how many people (with certain feature)"? Include that in the data.

How do you go about " I select some counties first, and select some stores under each county. "? Subsetting data? Random sample? Grouping?

SAS-questioner · Posted 08-15-2023 12:42 PM

Hi, Ballardw, thank you for replying. I will put the expected data below:

County   #_store   product_A   product_B   #_need_A   #_need_B
     1        50          20          30        100        120
     2        40          10          20         80         90
     3        55          25          35        150        160

"#_store" is the total number of stores that I selected in that county. "product_A" is the number of stores that have product A, "product_B" is the number of stores that have product B, "#_need_A" is the number of people who need product A in that county, and "#_need_B" is the number of people who need product B in that county. I want to compare product_A/#_need_A vs. product_B/#_need_B.
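As a rough sketch of how the table above could be read into SAS and the two rates computed per county: the dataset name `stores` and the variable names `n_store`, `need_A`, `need_B`, `rate_A`, and `rate_B` are assumptions made for illustration, since names starting with `#` are not valid SAS variable names by default.

```sas
/* Sketch only: dataset and variable names are assumed, not from the post. */
data stores;
  input county n_store product_A product_B need_A need_B;
  rate_A = product_A / need_A;   /* stores with A per person needing A */
  rate_B = product_B / need_B;   /* stores with B per person needing B */
  datalines;
1 50 20 30 100 120
2 40 10 20 80 90
3 55 25 35 150 160
;
run;

/* List the per-county rates side by side */
proc print data=stores noobs;
  var county product_A need_A rate_A product_B need_B rate_B;
run;
```

This only lays the rates out for inspection; it does not test whether they differ.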

The "how many people (with certain feature)" counts are "#_need_A" and "#_need_B". However, I am currently not sure whether "#_need_A" and "#_need_B" are absolute numbers or incidence rates. If they are incidence rates, for example 100 out of 10,000 people, do I still use 100, or 100/10000?

As for "I select some counties first, and select some stores under each county", that is the "county" variable in the data. Since each county has a different population and number of stores, I am wondering whether I need to treat county as a random effect, or whether this is just a regular rate comparison.
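For reference, here is a hedged sketch of what the Poisson approach mentioned in the earlier threads might look like for data shaped like the table above. It is one possible setup under stated assumptions, not a verified answer: the dataset `long` and the variables `product`, `count`, `need`, and `log_need` are hypothetical names, the data are rearranged to one row per county and product, and the three rows of example data are far too few to fit a real model. PROC GENMOD does not fit random effects; its REPEATED statement handles within-county correlation through GEE instead, while a random county intercept would need a mixed-model procedure such as PROC GLIMMIX.

```sas
/* Sketch only: names are assumed. One row per county and product. */
data long;
  input county product $ count need;
  log_need = log(need);          /* offset: log of the number of people in need */
  datalines;
1 A 20 100
1 B 30 120
2 A 10 80
2 B 20 90
3 A 25 150
3 B 35 160
;
run;

/* Poisson rate model; REPEATED gives GEE standard errors clustered by county */
proc genmod data=long;
  class county product;
  model count = product / dist=poisson link=log offset=log_need;
  repeated subject=county / type=exch;
  estimate 'A vs B' product 1 -1 / exp;   /* exp gives the A:B rate ratio */
run;

/* Alternative: random county intercept, which GENMOD cannot fit */
proc glimmix data=long;
  class county product;
  model count = product / dist=poisson link=log offset=log_need solution;
  random intercept / subject=county;
run;
```

Replacing `dist=poisson` with `dist=negbin` in the MODEL statement would give the negative binomial variant. Whether either distribution suits counts of stores, which cannot exceed the number of stores sampled, is a separate question this sketch does not settle.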

Which tests and procedure should I use for this rate comparison? Is a random effect involved?
