greveam
Quartz | Level 8

 

Hi,

 

I have two different cohorts (0 and 1), in which participants could receive treatment A or B. I want to compare baseline characteristics between those receiving treatment A vs. B in a merged dataset (data=have) containing individuals from cohort 0 and 1.

 

To account for the differences between those receiving treatment A vs. B in cohort 0 vs. 1, I was thinking about weighting by the probability of being in cohort 1 when comparing treatment A vs. B in the merged dataset (see below). But is there a smarter approach? Maybe random sampling from the merged dataset? Input and comments are highly valued. Thanks!

 

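proc logistic data=have;
   /* cohort is the response, so only the categorical predictor goes in CLASS */
   class sex;
   model cohort(event='1') = age sex / stb;
   /* prob = predicted probability of being in cohort 1 */
   output out=want prob=prob;
run;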

 

proc means data=want median q1 q3;
   class treatment;
   var age;
   weight prob;
run;

ACCEPTED SOLUTION
MichaelL_SAS
SAS Employee

For comparing characteristics of different treatment conditions, you might consider using the ASSESS statement in PROC PSMATCH. You can use PROC PSMATCH to produce graphical summaries for assessing balance after stratifying on the predicted probability of receiving treatment, inverse probability weighting (IPW), or matching. For example, the code below would produce balance assessments that incorporate the IPW-ATT weights:

 

proc psmatch data=have region=allobs;
   class cohort sex;
   psmodel cohort(treated='1')= age sex;
   assess ps var=(age sex) / plots=all;
   output out=want weight=attwgt;
run;

For more information about inverse probability weighting in PROC PSMATCH, you can look at the Propensity Score Weighting section or Example 1 in the PROC PSMATCH documentation. Note that the weight= option used in the example code I provided and the PSWEIGHT statement used in the documentation example are new syntax introduced in SAS/STAT 15.1. The documentation for previous releases is also available online here:

 

http://support.sas.com/documentation/onlinedoc/stat/index.html
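For readers who want to see what IPW-ATT weighting amounts to, the following is a minimal hand-rolled sketch using only PROC LOGISTIC, a DATA step, and PROC MEANS. It is not the PROC PSMATCH implementation. It assumes cohort is numeric (0/1) and that treatment, age, and sex exist in have as in the original question; the names ps_out, weighted, ps, and att_w are made up for illustration.

```sas
/* Sketch only: ps_out, weighted, ps, and att_w are illustrative names. */
proc logistic data=have;
   class sex;
   model cohort(event='1') = age sex;
   output out=ps_out pred=ps;   /* ps = estimated P(cohort = 1 | age, sex) */
run;

data weighted;
   set ps_out;
   /* ATT weights: cohort 1 keeps weight 1;                  */
   /* cohort 0 is reweighted by the odds ps / (1 - ps).      */
   if cohort = 1 then att_w = 1;
   else if cohort = 0 then att_w = ps / (1 - ps);
run;

/* Weighted baseline summary of age by treatment group */
proc means data=weighted mean std;
   class treatment;
   var age;
   weight att_w;
run;
```

Unlike weighting by the raw probability, these weights make cohort 0 resemble cohort 1 on the modeled covariates, which is the sense of "ATT" here.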

 

 


5 REPLIES
PaigeMiller
Diamond | Level 26

I don't think LOGISTIC is necessary or appropriate here. LOGISTIC is for cases when your response variable is binary; you don't have that here.

 

I think PROC SURVEYMEANS will work better; it can compare means if the samples are somehow weighted by the probability that an individual is in a cohort. Of course, this assumes I know what you mean by "I was thinking about weighting by the probability of being in cohort 1", and I don't really know, because you haven't explained how the design of the study produces individuals in cohort 0 or cohort 1. So please explain the design of the study further.

--
Paige Miller
greveam
Quartz | Level 8

I can see my wording was equivocal - sorry about that. The event in the logistic model is cohort (0 vs. 1 - binary) with explanatory variables a-z, which calculates the probability of being in cohort 1 (event) given explanatory variables a-z.

 

In other words, when I weight by this probability in the baseline comparisons of treatment A vs. B in the merged dataset, the p-values are adjusted (weighted) by the probability of being in cohort 1 based on the differences in explanatory variables relative to cohort 0.

 

My problem is:

1. The frequency (n) of patients receiving treatment A is greater in cohort 1 vs. 0.

2. Cohort 1 is "healthier" than 0.

 

This introduces pseudo-differences between treatment A and B as a function of the relative "oversampling" in cohort 1. My question is therefore: is the above modelling the best way to account for differences in treatment A vs. B in a merged dataset of cohort 0 and 1, which differ on baseline variables?

 

Maybe random sampling (with a BY statement on cohort?) from the merged dataset would be better? Then the number of patients receiving A vs. B would not be biased by the oversampling of treatment A in cohort 1.
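
The sampling idea above could be sketched with PROC SURVEYSELECT, which uses a STRATA statement rather than BY for this purpose. This is only an illustration: the output names have_sorted and sampled, the seed, and the per-cohort sample size of 200 are all assumptions, and each cohort must contain at least that many patients.

```sas
/* Sketch only: have_sorted, sampled, the seed, and sampsize=200 are illustrative. */
proc sort data=have out=have_sorted;
   by cohort;   /* STRATA requires input sorted by the stratum variable */
run;

proc surveyselect data=have_sorted out=sampled
                  method=srs sampsize=200 seed=20240101;
   strata cohort;   /* draw the same number of patients from each cohort */
run;
```

Note that this equalizes only the cohort sizes; it does not by itself remove baseline differences between the cohorts.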

PaigeMiller
Diamond | Level 26

It seems that you want some sort of statistical analyses where some variables are considered independent variables (or predictor variables, or x-variables) and other variables are considered dependent variables (or response variables, or y-variables).

 

Now I am still confused about which are the x-variables and which are the y-variables. It seems to me that cohort and A vs. B are x-variables, and some measure of health is a y-variable here. Could you comment on this?

--
Paige Miller
StatDave
SAS Super FREQ

What you describe sounds like something that requires causal analysis - probably the method that is implemented in PROC CAUSALTRT. See the discussion and examples in the documentation for that procedure.
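
As a rough idea of what such an analysis might look like, here is a minimal sketch. It assumes treatment is a character variable with values 'A' and 'B', and it uses a hypothetical response variable named outcome, which is not mentioned anywhere in this thread. Including cohort in the propensity score model is also an assumption.

```sas
/* Sketch only: outcome is a hypothetical response variable not named in this thread. */
proc causaltrt data=have method=ipwr;
   class sex cohort;
   /* Propensity score model for receiving treatment A (B is the reference) */
   psmodel treatment(ref='B') = age sex cohort;
   model outcome;
run;
```

Check the PROC CAUSALTRT documentation for the estimation methods and options that fit your data before relying on this.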
