Hello! Thank you for your help in advance. I have a dataset containing an outcome variable (1=case, 0=control), a group_id that ties each case to its matched controls, and a propensity score for every case and control. Within each group, I would like to take the case's propensity score and use it to generate a new variable: the difference between each control's propensity score and the case's score.
Outcome  Group_id  Propensity score  New Variable (control propensity score - case propensity score)
1        1         .2378             0
0        1         .2637             (.2637-.2378)
0        1         .2987             (.2987-.2378)
0        1         .2309             (.2309-.2378)
0        1         .2134             (.2134-.2378)
0        2         .0023             (.0023-.0324)
0        2         .0123             (.0123-.0324)
0        2         .0224             (.0224-.0324)
1        2         .0324             0
0        2         .0128             (.0128-.0324)
I have about 45,000 groups I need to calculate this difference for. Thank you very much for your time!
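For anyone who wants to try the replies on the example above, the table can be read in with a DATA step along these lines. This is only a sketch: the data set name `have` and the variable names `outcome`, `group_id`, and `propensity_score` are assumptions chosen to match the code in the answers, not names confirmed by the original data.

```sas
/* Sketch: build the example table as data set HAVE.            */
/* Data set and variable names are assumed, not from real data. */
data have;
    input outcome group_id propensity_score;
    datalines;
1 1 .2378
0 1 .2637
0 1 .2987
0 1 .2309
0 1 .2134
0 2 .0023
0 2 .0123
0 2 .0224
1 2 .0324
0 2 .0128
;
run;
```

The rows are already grouped by `group_id`, so no sort is needed before a BY-group DATA step on this small example.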
While there are a few ways, this is probably the most likely to work without hiding potential error situations:
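data want;
   /* First pass: find the case's propensity score for this group */
   do until (last.group_id);
      set have;
      by group_id;
      if outcome=1 then case_propensity = propensity_score;
   end;
   /* Second pass: reread the same group and compute the difference */
   do until (last.group_id);
      set have;
      by group_id;
      new_variable = propensity_score - case_propensity;
      output;
   end;
run;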
Assuming your data set is sorted by GROUP_ID, the top loop finds the CASE observation for a GROUP_ID. Then the bottom loop reads the same observations, calculates, and outputs.
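The approach above relies on each GROUP_ID containing exactly one case. With about 45,000 groups, it may be worth checking that before computing the differences. A minimal sketch, assuming the same `have` data set and variable names; `bad_groups` and `n_cases` are hypothetical names introduced here for illustration:

```sas
/* Sketch: list groups that do not contain exactly one case.    */
/* BAD_GROUPS and N_CASES are illustrative names (assumptions). */
proc sql;
    create table bad_groups as
    select group_id,
           sum(outcome = 1) as n_cases   /* comparison yields 1 or 0 */
    from have
    group by group_id
    having calculated n_cases ne 1;
quit;
```

If `bad_groups` comes back empty, every group has a single case and the differences are well defined.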
proc sort data=have out=want;
   by Group_id descending Outcome;
run;

data want (drop=hold);
   set want;
   by Group_id;
   retain hold;
   if first.Group_id then hold=Propensity_score;
   new_variable=Propensity_score-hold;
run;
Art, CEO, AnalystFinder.com
Thank you for taking the time to respond!
Thank you for the response! This worked!