12-09-2016 11:36 PM
I have a table that has patient information and visits. I have categorized the visits into certain cateogires, and have given the categories numbers. For example, in the table below the code of 3=clinic visit, 4=doctor visit, 5=nurse visit etc. etc.
What I would like to do is count how many visits have a value of 3, 5, 7, and 8 and divide that number by visits that have the value of 4,6,9. So in this example there are 4 rows that have the value of either 3,5,7,or 8 and 3 rows that have a value of 4,6 or 9, so I would want to create an equation that would be probablity of visits= 3/4=0.75
What is the best way to approach this?
Thank you in advance
12-10-2016 12:59 AM - edited 12-10-2016 01:00 AM
value of 3, 5, 7, and 8 and divide that number by visits that have the value of 4,6,9 ...probablity of visits= 3/4=0.75
Not really sure what you're really after as according to your narrative it would be 4/3 but then you tell us it's 3/4 and you call this probability.
Below a code sample which hopefully gives you what you're after or then at least will point you into the right direction.
data have; patient_id=1; input code; datalines; 3 4 5 6 7 8 9 10 ; run; proc sql; /* create table want as*/ select patient_id, divide ( sum(case when code in (3,5,7,8) then 1 else 0 end) , sum(case when code in (4,6,9) then 1 else 0 end) ) as want, divide ( sum(case when code in (4,6,9) then 1 else 0 end) , count(*) ) as prob from have group by patient_id ; quit;
12-10-2016 01:21 PM - edited 12-10-2016 01:25 PM
Actually you're generating an odds ration, not a probability, as Patrick has observed:
If the data is sorted by ID, then this is short and sweet:
data want (keep=id n_numerator n_denominator odds);
if first.id then do; n_numerator=0; n_denominator=0; odds=.; end; /* Added in an editted response */
n_numerator + (code in (3,5,7,8));
n_denominator + (code in (4,6,9));