curiosgeorge
Calcite | Level 5

Hey all! New to SAS so please excuse this newbie question.

 

I have three different datasets containing service dates for visits to the family doctor, visits to the diabetes clinic, and hospitalizations; each contains both cases and controls. I want to calculate the crude rate of visits for each setting, and for all settings combined, as total visits / combined person-years. I have counted visits per id within each of the three datasets, but my total person-years variable comes from a separate, fourth dataset: it is one number for all cases and another for all controls.

 

How can I structure the data to add up all the counts from each of the datasets and then divide by total person-years, while also getting confidence intervals? I then need to do a t-test to compare the rates between cases and controls.

4 REPLIES
PaigeMiller
Diamond | Level 26

It would help if you showed us a portion of these data sets, so we can see what the situation is.

 

In general, you need to merge the data sets by some patient ID number.
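A minimal sketch of such a merge, under stated assumptions: the dataset names (famdoc, clinic, hosp), the variable names (id, case, count), and one row per id in each dataset are all hypothetical, since the actual data have not been shown.

```sas
/* Hypothetical inputs: famdoc, clinic, hosp, each with one row per id
   and variables id, case (1 = case, 0 = control), and count. */
proc sort data=famdoc; by id; run;
proc sort data=clinic; by id; run;
proc sort data=hosp;   by id; run;

data all_visits;
  /* KEEP= is applied before RENAME=, so KEEP= uses the original names */
  merge famdoc (keep=id case count rename=(count=n_famdoc))
        clinic (keep=id case count rename=(count=n_clinic))
        hosp   (keep=id case count rename=(count=n_hosp));
  by id;
  /* SUM() ignores missing values, so an id that is absent from one
     dataset still gets a total; the trailing 0 avoids a missing result
     when all three counts are missing */
  n_total = sum(n_famdoc, n_clinic, n_hosp, 0);
run;
```

This assumes case has the same value for a given id in all three datasets; if it does not, the value from the last dataset listed in the MERGE statement is the one that is kept.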

 

It's not clear to me that you have met the conditions for a t-test, as counts divided by total person-years are probably not normally distributed; however, if you have enough data, then maybe that doesn't matter.

--
Paige Miller
curiosgeorge
Calcite | Level 5

Each of the datasets is in this format:

 

id   Case   Number of Hospitalizations (count)   Time contributed (yrs)
1    1      5                                    7.9
2    1      2                                    2.1
3    0      1                                    4.7
4    0      1                                    9.4

 

 

Ex:

Total count for cases: 7 (5 + 2 hospitalizations); total time for cases = 10 years (7.9 + 2.1); rate = 7/10

Total count for controls: 2 (1 + 1); total time for controls = 14.1 years (4.7 + 9.4); rate = 2/14.1

I then want to compare the two rates.

PaigeMiller
Diamond | Level 26

PROC SUMMARY will sum up the number of hospitalizations and the times, for cases and for controls, and by patient if needed.

 

From there, in a SAS data set, you could divide the total number of hospitalizations by the total time to get the rates, again by patient.
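As a rough sketch of those two steps, using the four example rows posted above: the dataset and variable names (hosp, count, time, totals, rates) are made up for illustration. The CLASS statement here groups by case/control status only, to match the 7/10 and 2/14.1 example; adding id to the CLASS statement would give the per-patient totals instead.

```sas
/* Example rows from the post; variable names are hypothetical */
data hosp;
  input id case count time;
  datalines;
1 1 5 7.9
2 1 2 2.1
3 0 1 4.7
4 0 1 9.4
;
run;

/* Sum counts and person-years within each case/control group.
   NWAY keeps only the rows for the full CLASS combination. */
proc summary data=hosp nway;
  class case;
  var count time;
  output out=totals (drop=_type_ _freq_) sum=total_count total_time;
run;

/* Crude rate = total events / total person-years */
data rates;
  set totals;
  rate = total_count / total_time;
run;

proc print data=rates;
run;
```

With these rows, the totals should match the hand calculation: 7 events over 10 years for cases and 2 events over 14.1 years for controls.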

 

And then you ought to be able to run PROC TTEST on the per-patient rates to compare the mean rate of the controls to the mean rate of the cases. This again assumes that the rates are approximately normally distributed, and they are not, so there may be a better approach, perhaps one that treats the counts as Poisson rather than normal. But I haven't really thought that through yet. I will let you know if I think of anything, or perhaps someone else has an idea.

 

ADDING: This is what I was thinking of: "Modeling rates and estimating rates and rate ratios (with confidence intervals)"

http://support.sas.com/kb/24/188.html
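For what it's worth, here is a minimal sketch of the Poisson approach with PROC GENMOD, using the four example rows from earlier in the thread. It is not copied from the linked note; the dataset and variable names (hosp_model, count, time, log_time) are assumptions, and it does not check for overdispersion.

```sas
/* Example rows from the post; variable names are hypothetical */
data hosp_model;
  input id case count time;
  log_time = log(time);   /* offset: log of person-years, needs time > 0 */
  datalines;
1 1 5 7.9
2 1 2 2.1
3 0 1 4.7
4 0 1 9.4
;
run;

/* Poisson regression of the count on case status (numeric 0/1),
   with log person-years as the offset so the model is for the rate */
proc genmod data=hosp_model;
  model count = case / dist=poisson link=log offset=log_time;
  /* EXP reports the exponentiated estimate, i.e. the rate ratio of
     cases to controls, along with its confidence limits */
  estimate 'Rate ratio, case vs control' case 1 / exp;
run;
```

The test on the case coefficient then takes the place of the t-test when comparing the two rates. With only four patients the interval will be very wide, so treat this as a template for the full data rather than a result.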

--
Paige Miller
curiosgeorge
Calcite | Level 5

Yes point taken about the distribution of the data, thanks for the insight. I think I will need some time to think about this one!
