SMcelroy1287
Obsidian | Level 7

Hello! Thank you for your help in advance. I have a dataset containing an outcome variable (1 = case, 0 = control), a group_id that identifies each case together with its matched controls, and a propensity score for every case and control. Within each group, I would like to take the case's propensity score and create a new variable equal to each control's propensity score minus the case's score.

 

Outcome    Group_id    Propensity score    New variable (control propensity score - case propensity score)
1          1           .2378               0
0          1           .2637               (.2637-.2378)
0          1           .2987               (.2987-.2378)
0          1           .2309               (.2309-.2378)
0          1           .2134               (.2134-.2378)
0          2           .0023               (.0023-.0324)
0          2           .0123               (.0123-.0324)
0          2           .0224               (.0224-.0324)
1          2           .0324               0
0          2           .0128               (.0128-.0324)

 

I have about 45,000 groups I need to calculate this difference for. Thank you very much for your time!

ACCEPTED SOLUTION
Astounding
PROC Star

There are a few ways to do this, but the following is probably the most likely to work without hiding potential errors:

 

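data want;
   /* First pass: find the case's propensity score for this group */
   do until (last.group_id);
      set have;
      by group_id;
      if outcome=1 then case_propensity = propensity_score;
   end;
   /* Second pass: re-read the same group and compute the difference */
   do until (last.group_id);
      set have;
      by group_id;
      new_variable = propensity_score - case_propensity;
      output;
   end;
run;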

 

Assuming your data set is sorted by GROUP_ID, the top loop finds the CASE observation for a GROUP_ID.  Then the bottom loop reads the same observations, calculates, and outputs.
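
Before running the step, a quick check can confirm that each group really contains exactly one case. The following is a minimal sketch rather than part of the original answer. It assumes the names used in this thread (have, group_id, outcome); bad_groups and n_cases are hypothetical names chosen for illustration.

```sas
/* Sketch: list any group that does not contain exactly one case.      */
/* bad_groups and n_cases are illustrative names, not from the thread. */
proc sql;
   create table bad_groups as
   select group_id,
          sum(outcome = 1) as n_cases   /* a true comparison counts as 1 */
   from have
   group by group_id
   having calculated n_cases ne 1;
quit;
```

If bad_groups comes back empty, every group has exactly one case. Otherwise, note that in the two-loop step a group with no case leaves case_propensity missing, because that variable is reset to missing at the start of each DATA step iteration, while a group with several cases uses the last case read.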


4 REPLIES
art297
Opal | Level 21
proc sort data=have out=want;
  by Group_id descending Outcome;
run;

data want (drop=hold);
  set want;
  by Group_id;
  retain hold;
  if first.Group_id then hold=Propensity_score;
  new_variable=Propensity_score-hold;
run;
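
Note that the sort-and-retain step above treats whichever row sorts first in each group as the case, so a group with no case would silently use its first control's score. One possible guard is sketched below. It is an illustration under the same variable names, not art297's code, and it assumes the PROC SORT above has already been run.

```sas
data want (drop=hold);
  set want;
  by Group_id;
  retain hold;
  /* Accept the first row as the case only when Outcome=1; */
  /* otherwise leave hold missing so the gap is visible.    */
  if first.Group_id then hold = ifn(Outcome=1, Propensity_score, .);
  new_variable = Propensity_score - hold;
run;
```

With this variant, groups that lack a case end up with a missing new_variable, and the log notes that missing values were generated.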

Art, CEO, AnalystFinder.com

 
