BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SasUser324
Calcite | Level 5

Hello all,    

I can’t seem to wrap my mind around this and was wondering if anyone could help clarify things.  How do I calculate the standard deviation of calculations of SQL count data that was collapsed across observations and then how do I carry that through the calculations I’m executing in SQL?  This is hard to explain so I made an illustrative example and tried to describe clearly the calculation I’m doing and the standard deviation I’m looking for.  I have a hunch I’m overlooking some fundamental concept here. 

Let’s assume I have a cohort of 20 people (8 exposed, 12 unexposed) and I am interested in determining the proportion of people that had each of three procedures, stratified by exposure status.  I can only see claims for 13 people here, 7 had no procedure and so don’t appear in this data table. My eventual goal is to generate the usual Table 1 comparing exposed to unexposed and including a standardized difference (of procedure rates).  Unfortunately, I can't accomplish the first step of generating standard deviation estimates. 


My data is in the following format:

TABLE 1. DATA_SET_A

Procedure

Exposed

Subject

1

1

Person 1

1

1

Person 2

1

1

Person 3

1

1

Person 4

1

0

Person 5

1

0

Person 6

1

0

Person 7

1

0

Person 8

2

1

Person 1

2

1

Person 3

2

0

Person 6

2

0

Person 8

2

1

Person 9

2

1

Person 10

2

0

Person 11

2

1

Person 12

2

0

Person 13

3

0

Person 5

3

0

Person 8

3

1

Person 10

3

0

Person 11

3

0

Person 13

  1. First, I use PROC SQL to pull population counts (stratified by exposure) from my source population, which contains records for all people [STEP 1].
  2. Next, I use PROC SQL to collapse DATA_SET_A so each procedure is listed, for each exposure status, with the count [STEP 2].
  3. Finally, I use LEFT JOIN to link procedure/exposure numerators with their appropriate denominators and perform the proportion calculation [STEP 3].

Please refer to the SAS code I've pasted at the bottom of this post.

This code generates Table 2 (see below).  How do obtain the standard deviations for these proportion estimates (see ? in the table below)? I don’t know how to obtain the standard deviation of a count that I’ve collapsed in SQL.  I tried using aggregate functions such as std(count(*)) AS StdDev, but SAS didn’t like that.  Do I have to obtain the standard deviation during the count step and then carry it forward or can I somehow obtain it during my proportion step?  I know how to calculate the standard deviation if I transform my dataset such that each procedure is a variable instead of having a procedure variable that aligns with the counts.  The problem is that my real problem involves hundreds of procedures so I can't switch them all over manually.

TABLE 2. DATA_SET_B

Procedure

Exposed

Count

Denominator

Proportion

  1. Std. Dev.

1

1

4

8

4/8

  1. 0.50

?

1

0

4

12

4/12

  1. 0.33

?

2

1

5

8

5/8

  1. 0.63

?

2

0

4

12

4/12

  1. 0.33

?

3

1

1

8

1/8

  1. 0.13

?

3

0

4

12

4/12

  1. 0.33

?

Thanks for your help. Please be blunt if I’m missing something obvious. 



CODE BELOW:


*CALCULATING THE PROPORTION OF PATIENTS THAT WERE CODED FOR EACH PROCEDURE AT LEAST ONCE IN THE FOLLOW-UP PERIOD*;

STEP 1

      *COUNTS FOR DENOMENATORS*;

                        *COUNT UNIQUE PATIENTS*;

PROC SQL;

CREATE TABLE work.Denominators AS

      SELECT DISTINCT exposed, count(*) AS Denominator

      FROM (SELECT DISTINCT exposed, subject FROM work.SourcePopn)           

GROUP BY exposed;

QUIT;

STEP 2

*COUNT UP PROCEDURS FOR NUMERATOR*;

PROC SQL;

      CREATE TABLE work.Numerators AS

      SELECT DISTINCT exposed, procedure, count(*) AS count

      FROM (SELECT DISTINCT subject, procedure, exposed FROM work.DATA_SET_A)

      GROUP BY exposed, procedure

      ORDER BY exposed DESC, count DESC;

QUIT;

STEP 3

*THIS CODE ATTACHES THE DENOMENATOR AND DOES PROPORTION CALCULATION*;

PROC SQL;

      CREATE TABLE work.DATA_SET_B AS

SELECT a.procedure, a.exposed, a.count, b.denominator,

(a.count/b.denominator) AS Proportion

      FROM work.Numerators AS a LEFT JOIN work.Denominators as b

      ON a.exposed=b.exposed

      ORDER BY exposed DESC, count DESC;

RUN;

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

The standard deviation of a proportion is sqrt(p*(1-p)/n), where p is the estimated proportion; should be easy enough to compute from PROC SQL or otherwise.

--
Paige Miller

View solution in original post

1 REPLY 1
PaigeMiller
Diamond | Level 26

The standard deviation of a proportion is sqrt(p*(1-p)/n), where p is the estimated proportion; should be easy enough to compute from PROC SQL or otherwise.

--
Paige Miller

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 1 reply
  • 1282 views
  • 0 likes
  • 2 in conversation