10-31-2017 11:59 AM
hi I am trying to think through this problem: I need to calculate 'in each unique pregnancy, at which week in pregnancy a woman got a vaccine.'
- a unique eposide is defined as: distinct person_id, condition_id (so if two records for person 1, person_id = 1, condition_id =1 and person_id =1, condition_id = 2, means this women has two distinct pregnancies; if one record of person_id = 1, condition_id =2, this person has one distinct pregnancy)
I want to fill in a table that looks like:
|All pregnancy episodes||week0-12||week13-19||week 20|
number under all pregnancy episodes should = sum of all the week column.
BUT my data shows that during some pregnancy episodes, some women got more than one vaccine, and I only want each unique pregnancy being count once, so that the sum of all weeks can equal to all unique pregnancies. I want to calculate according to this rule:
For all the episodes that has more than one vaccines weeks (week when they got their vaccine)
- If 1 vaccine is in 27-36 week, and others are not in week 27-36, then use the week in 27-36
- if multiple vaccine in 27-36, then use the earliest week
- if neither vaccine is in 27-36, then use the earliest week
My input data looks like:
My desired output is the red rows in input table above, in other words, I want to leave only one row per unique pregnancy after applying the rule for multiple week entries under one pregnancy.
Any suggestion is appreciated!
10-31-2017 12:50 PM
something like this
data have; input person_id condition_id week ; datalines; 1 1 26 1 2 4 1 2 30 2 1 15 3 1 14 3 1 28 3 2 28 3 2 32 4 1 18 5 2 32 6 1 33 6 2 34 6 2 17 ; run; proc sql; select person_id, condition_id, week from have group by person_id , condition_id having max(week) = week; quit;
10-31-2017 01:52 PM
Thanks @kiranv_, but this code won't work for this customized value assign, I don't need max(week), I need to select first which week range it's in, then depends on that, I need to assign its own value or the earliest value