MichaelL_SAS
SAS Employee
Member since
01-21-2019
- 99 Posts
- 19 Likes Given
- 31 Solutions
- 107 Likes Received
Latest posts by MichaelL_SAS
Subject | Views | Posted
Re: proc causaltrt and stabilized ipw weights | 982 | 05-14-2024 04:05 PM
Re: PS 1 to 5 Optimal Matching | 840 | 03-11-2024 10:52 AM
Re: proc psmatch | 663 | 08-15-2023 12:42 PM
Re: Log-binomial with GEE regression | 1431 | 08-10-2023 03:43 PM
Re: proc psmatch matching order? | 740 | 03-23-2023 05:20 PM
Re: PHREG natural cubic spline - what is the fitted function? | 3036 | 01-26-2023 11:50 AM
Re: Calculation of type I sum of squares for a ZIP model in SAS | 1315 | 01-10-2023 03:03 PM
Re: ERROR in proc PSMATCH: The support region does not exist. | 1520 | 07-29-2022 09:51 AM
Re: Simulation of a variable (continuous or dichotomous) correlated to three existant variables? | 2864 | 06-22-2022 05:07 PM
Re: PROC PSMATCH/ASSESS statement not recognizing a binary variable as binary | 2764 | 06-22-2022 11:46 AM
Activity Feed for MichaelL_SAS
- Posted Re: proc causaltrt and stabilized ipw weights on Statistical Procedures. 05-14-2024 04:05 PM
- Got a Like for Re: PS 1 to 5 Optimal Matching. 03-19-2024 03:13 PM
- Posted Re: PS 1 to 5 Optimal Matching on Statistical Procedures. 03-11-2024 10:52 AM
- Posted Re: proc psmatch on Statistical Procedures. 08-15-2023 12:42 PM
- Got a Like for Re: Log-binomial with GEE regression. 08-11-2023 01:59 PM
- Got a Like for Re: Log-binomial with GEE regression. 08-11-2023 08:46 AM
- Posted Re: Log-binomial with GEE regression on Statistical Procedures. 08-10-2023 03:43 PM
- Got a Like for Re: proc psmatch matching order?. 03-27-2023 07:51 AM
- Posted Re: proc psmatch matching order? on Statistical Procedures. 03-23-2023 05:20 PM
- Got a Like for Re: PHREG natural cubic spline - what is the fitted function?. 01-28-2023 09:39 AM
- Got a Like for Re: PHREG natural cubic spline - what is the fitted function?. 01-26-2023 12:23 PM
- Posted Re: PHREG natural cubic spline - what is the fitted function? on Statistical Procedures. 01-26-2023 11:50 AM
- Posted Re: Calculation of type I sum of squares for a ZIP model in SAS on Statistical Procedures. 01-10-2023 03:03 PM
- Got a Like for Re: Hazards ratio in PROC CAUSALMED. 11-29-2022 03:27 PM
- Posted Re: ERROR in proc PSMATCH: The support region does not exist. on Statistical Procedures. 07-29-2022 09:51 AM
- Liked Re: PROC PSMATCH/ASSESS statement not recognizing a binary variable as binary for Kelly_K. 06-23-2022 01:11 PM
- Got a Like for Re: PROC PSMATCH/ASSESS statement not recognizing a binary variable as binary. 06-23-2022 12:40 PM
- Posted Re: Simulation of a variable (continuous or dichotomous) correlated to three existant variables? on Statistical Procedures. 06-22-2022 05:07 PM
- Posted Re: PROC PSMATCH/ASSESS statement not recognizing a binary variable as binary on Statistical Procedures. 06-22-2022 11:46 AM
- Got a Like for Re: Simulation of a variable (continuous or dichotomous) correlated to three existant variables?. 06-16-2022 05:21 PM
05-14-2024
04:05 PM
PROC CAUSALTRT does not output stabilized weights. You can request stabilized IPW-ATE weights from the PSMATCH procedure, or you can always use a DATA step to modify the weights produced by PROC CAUSALTRT, multiplying them by the proportion of treated/untreated subjects.
Note that for a single treatment variable, if you are estimating the ATE with the IPWR estimation method in PROC CAUSALTRT, using inverse probability of treatment weights versus stabilized inverse probability of treatment weights gives equivalent estimates. Differences in the estimates only start to appear when examining treatment regimes with repeated treatment decisions, a type of analysis not covered by the CAUSALTRT procedure.
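For the DATA step approach mentioned above, here is a minimal sketch. It assumes the PROC CAUSALTRT weights were already saved to a data set named CAUSWT, with the weight in a variable named IPW and a 0/1 treatment indicator named TRT; those names are placeholders, not defaults produced by the procedure.
/* Compute the marginal proportions of treated and untreated subjects */
proc sql noprint;
   select mean(trt), 1 - mean(trt)
      into :pTrt, :pCtrl
      from causwt;
quit;

/* Stabilize each weight by multiplying by P(T = t) for the subject's own group */
data causwt_stab;
   set causwt;
   if trt = 1 then sw = ipw * &pTrt;
   else            sw = ipw * &pCtrl;
run;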
03-11-2024
10:52 AM
1 Like
Assuming the data you are using are the same as the Drugs data set used in the PROC PSMATCH documentation, an optimal fixed-ratio matching with 5 control units for every treated unit will not be feasible. There are only 373 control units and 113 treated units, so there aren't enough controls to satisfy that fixed ratio.
It might be that the prompt is requesting an optimal variable-ratio matching, with 1 to 5 control units matched to each treated unit. You would request such a matching by using the METHOD=VARRATIO(KMAX=5) specification in the MATCH statement. Using the Drugs data set from the PROC PSMATCH documentation, that matching is feasible.
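For reference, a rough sketch of such a request is below; the Treated= level and the covariates in the PSMODEL statement are placeholders and should be adjusted to match the Drugs example in the documentation.
proc psmatch data=Drugs region=allobs;
   class Drug Gender;
   psmodel Drug(Treated='Drug_X') = Gender Age BMI;            /* placeholder propensity score model */
   match method=varratio(kmin=1 kmax=5) distance=lps caliper=.; /* 1 to 5 controls per treated unit */
   output out(obs=match)=DrugsMatched matchid=_MatchID;
run;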
08-15-2023
12:42 PM
As noted in the thread @JosvanderVelden linked to, it is likely that there is no overlap between the predicted propensity score values. I believe PROC PSMATCH doesn't produce any output when this error is encountered, but you can check by fitting the same propensity score model in PROC LOGISTIC and investigating the predictions it produces. I suspect PROC LOGISTIC will report some issues regarding complete/quasi-complete separation for these data.
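A rough sketch of that check follows; the treatment and covariate names are placeholders for whatever appears in your PSMODEL statement.
/* Fit the same propensity score model and save the predicted probabilities */
proc logistic data=mydata;
   class gender;                            /* placeholder classification covariate */
   model treat(event='1') = age gender bmi;
   output out=ps p=phat;
run;

/* Compare the range of predicted propensity scores in each treatment condition */
proc means data=ps n min max;
   class treat;
   var phat;
run;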
08-10-2023
03:43 PM
3 Likes
To add to what @sbxkoenk noted, based on the SUBJECT=ParticipantId specification in your code, I suspect you might be running an analysis where you have one measurement from each subject and are using the GEE model to get the empirical/robust/sandwich covariance matrix estimate. If that is the case, then specifying TYPE=EXCH would be a likely cause of that WARNING message, since estimating the exchangeable working correlation structure requires clusters with two or more observations. As mentioned in the note @sbxkoenk referenced, you can switch to an independent working correlation structure, which is used by default if you omit the TYPE= option.
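A minimal sketch of that switch for a log-binomial GEE model is below; the response and covariate names are placeholders.
proc genmod data=mydata;
   class ParticipantId;
   model outcome(event='1') = exposure age / dist=binomial link=log;
   repeated subject=ParticipantId / type=ind;   /* independent working correlation */
run;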
03-23-2023
05:20 PM
1 Like
The EXACT= option imposes a constraint on the matching that limits which matched sets can be produced, so it is not really accurate to say that more than one matching is occurring. In particular, the EXACT= option requires all observations in a matched set to have the same value(s) for the categorical variable(s) listed in the option.
If you are performing a greedy matching with the EXACT= option specified, this constraint means that when the procedure looks for the nearest neighbors of a given treated unit, it only considers control observations that have the same value(s) of the variable(s) listed in the EXACT= option.
If you are performing an optimal matching method, the constraint imposed by the EXACT= option becomes a constraint in the underlying optimization problem that is solved to create the matched sets, and again limits which observations can be matched together.
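As an illustration, here is a minimal sketch of a greedy matching with an exact-matching constraint; the data set and variable names are placeholders.
proc psmatch data=mydata region=treated;
   class treat gender site;
   psmodel treat(Treated='1') = age bmi gender site;   /* placeholder propensity score model */
   match method=greedy(k=1) exact=(gender site) distance=lps;
   output out(obs=match)=matched matchid=_MatchID;
run;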
01-26-2023
11:50 AM
3 Likes
I think the issue is in your NC function definition. The INPUT function calls used to extract the knot values from the list use a 3.0 informat, i.e.
knot = input(scan(knots, j, ' '), 3.0);
kmax = input(scan(knots, -1, ' '), 3.0);
kmax1 = input(scan(knots, -2, ' '), 3.0);
For your GLMSELECT example, where the range of the X values is larger, that informat appears to work okay, but for your PHREG example, where the covariates are all between 0 and 1, the 3.0 informat is probably giving you knot values that are not precise enough. That throws off the evaluation of the spline basis functions, and everything breaks down from there.
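For example, one possible fix (a sketch, assuming the knots are passed as a space-delimited string as above) is to read the knot values with a wider informat so the decimal places are retained:
knot  = input(scan(knots,  j, ' '), best32.);
kmax  = input(scan(knots, -1, ' '), best32.);
kmax1 = input(scan(knots, -2, ' '), best32.);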
To check your NC function evaluations, you can output those values and compare them to the basis function evaluations in an OUTDESIGN= data set, created as shown in this blog post of Rick's.
01-10-2023
03:03 PM
The procedure is performing two series of tests that, as @PGStats said, are based on likelihood ratio statistics. The Type1 table shows the results of adding effects to the mean model (using the full zero-inflation model), and the Type1Zero table shows the results of adding effects to the zero-inflation model (using the full mean model). The 2*LogLikelihood values are the same in the final row of each table because, in both cases, they are based on the likelihood when all of the effects in the mean model and all of the effects in the zero-inflation model are included.
To your original question, the 2*LogLikelihood values for the effect "a" are the same in both tables because it was the last effect you specified in both the MODEL and ZEROMODEL statements, so it corresponds to the last row of each table.
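For reference, a minimal sketch of a specification that would produce both tables is below; the data set, response, and effects are placeholders, with "a" listed last in both statements as in your model.
proc genmod data=mydata;
   class a b;
   model y = b a / dist=zip type1;   /* sequential LR tests for the mean model */
   zeromodel b a;                    /* zero-inflation model effects */
run;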
07-29-2022
09:51 AM
I believe the most likely reason for that error is that there is no overlap in the predicted propensity scores between the two treatment conditions. I think that error causes PROC PSMATCH to exit before creating the output data set you requested, so to check for overlap you could try fitting the same model in PROC LOGISTIC and use an output data set it produces to compare the predicted propensity scores between the treatment conditions.
06-22-2022
05:07 PM
Sorry for the delay in responding.
I think the issue I see with the simulation approach is maybe best described with some of the notation from causal diagrams. Given that E and Y values are set, if you are simulating a value of U with the desired correlations given the observed data, the causal structure would likely be one where E->U<-Y, which would make U a collider on a pathway between E and Y. For U to be a common cause of E and Y (and therefore a confounder) you would need the direction of those arrows to reverse and have E<-U->Y, something that I don't think is really possible given the fixed values of E and Y. Note that the documentation for the CAUSALGRAPH procedure provides some more details on graphical causal models, and there is this 2019 SGF paper that also discusses the collider issue in example 2.
In the case where U is a collider, comparing effect estimates that do/do not incorporate it in the adjustment set is studying the effect of inappropriately adjusting for a collider (as doing so opens up a non-causal pathway between E and Y) instead of studying the effect of not adjusting for an unmeasured confounder (as that leaves a non-causal pathway unblocked). In a sense, the different assumptions about U result in analyses that are mirror images of one another: one assumes your current adjustment set is correct and would be made incorrect by incorporating U, whereas the other assumes your current adjustment set is incorrect and would be made correct by incorporating U.
06-22-2022
11:46 AM
1 Like
I suspect the issue is due to missing values in the response or other covariates. PROC PSMATCH only uses observations with non-missing values, and based on the PROC LOGISTIC output you provided, I suspect that the 116 observations with resp_never_married=1 and the 41 observations with resp_sep_div_wid=1 all have a missing value for some other covariate and are ultimately excluded. In that case PROC PSMATCH detects only one level for those variables, rather than exactly two, and therefore issues that warning and excludes them from the ASSESS statement output. You might try just keeping the 0/1 coding, treating them as continuous inputs, and not listing them in the CLASS statement.
EDIT: I realized my comment about treating those variables as continuous is a bit silly. If my assumption is correct, that means they are constant, so the values are the same in each treatment condition and they are trivially balanced.
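To check the missing-values assumption, here is a rough sketch; the covariate names are placeholders for the other variables in your PSMODEL statement.
/* Flag observations that are complete on all numeric model variables */
data check;
   set mydata;
   complete = (nmiss(of age income educ) = 0);   /* placeholder covariates */
run;

/* See whether any complete cases remain within each level of the indicators */
proc freq data=check;
   tables resp_never_married*complete resp_sep_div_wid*complete / missing;
run;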
06-15-2022
02:13 PM
2 Likes
I think this approach, where you simulate only the value of an unmeasured confounder based on the observed data, is likely to run into issues. Namely, if you are simulating the U values based on the observed exposure E and outcome Y, you might be able to create the desired correlations, but the causal relationships that produce them are unlikely to correspond to U being an unmeasured confounder. For U to be an unmeasured confounder it would have to be a common cause of E and Y. However, in your simulation E and Y are already known, so U cannot have a truly causal effect on them; the correlation would come from E or Y affecting U, in which case the causal relationships are the reverse of what you want, and U would not be an unmeasured confounder.
I think it is fair to say that how best to perform sensitivity analyses for the effect of unmeasured confounding in observational studies is not a settled question, and a wide variety of methods is discussed in the literature. For a binary outcome with the effect measured on the relative risk scale, the E-value as described by VanderWeele and Ding might be the most commonly suggested approach. I believe the appendix to their original paper had example SAS code for the computation of E-values, and they have since made a web app for computing them. There are also approaches that are specific to methods like propensity score matching; there is an example in the PROC PSMATCH documentation. Finally, there are approaches where the measured confounders are used to judge what the effect of an unmeasured confounder might be, by examining the effect of omitting each of the measured confounders from the adjustment set.
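As an illustration of the E-value computation described by VanderWeele and Ding, here is a short sketch; the risk ratio of 1.8 is just a made-up example value.
data evalue;
   rr = 1.8;                          /* observed risk ratio (example value) */
   if rr < 1 then rr = 1 / rr;        /* for protective effects, invert first */
   evalue = rr + sqrt(rr * (rr - 1));
   put "E-value = " evalue;
run;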
01-25-2022
10:38 AM
5 Likes
The propensity score model fit by PROC PSMATCH uses the logit link whereas the model you are fitting with PROC LOGISTIC is using the complementary log-log link function as requested by the link=cloglog option in the MODEL statement. The two models are therefore not the same and you would not expect to get the same propensity score values.
01-18-2022
03:42 PM
Note that the variables RACE and INSURANCE are only excluded from the balance diagnostics requested in the ASSESS statement; they are still used in the propensity score model. My understanding of why categorical variables are not allowed in the ASSESS statement is that some of the diagnostics may not be well defined. For example, in any assessment that compares the mean of a variable between the treatment conditions (i.e., the standardized mean difference plot and table), how to handle a categorical variable with levels such as "Very Poor", "Poor", "Neutral", "Good", and "Very Good" is not straightforward. One option would be to use the GLM 0/1 coding and compare those indicator variables between the treatment conditions. Those comparisons, though, only tell you about balance in each level of the variable separately, not the overall balance in the distribution of the variable. Another option would be to use PROC FREQ to compare the distribution of categorical variables between treatment conditions. An example of this approach is illustrated in Example 1 of this recent SAS Global Forum paper (in particular, PROC FREQ is used around the end of page 14). https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2019/3056-2019.pdf
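A rough sketch of that PROC FREQ check on the matched sample follows; it assumes the PSMATCH OUTPUT statement created the matched data set along with its matching-weight variable, and the treatment and categorical variable names are placeholders.
proc freq data=matched;
   weight _MATCHWGT_;                        /* matching weights created by the OUTPUT statement */
   tables treat*race treat*insurance / norow nopercent;
run;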
12-22-2021
03:41 PM
1 Like
When you specify a REPEATED statement, PROC GENMOD fits a GEE model, and it might be helpful to refer to the PROC GENMOD documentation section on Generalized Estimating Equations (GEEs). I don't think this type of model lends itself nicely to a notion of clusters being "weighted" according to their size.
If you look at the estimating equations, the score function for a cluster is not multiplied by any term that depends on the cluster size. That said, the score function for cluster i can be written as a sum, over the observations in the cluster, of each observation's residual times a column of the D_i'V_i^{-1} matrix (using the notation from the documentation). In that sense larger clusters contribute "more" because there are more terms in that summation, but again I don't think it is appropriate to say they are weighted more heavily.
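For reference, the estimating equations have roughly this form (my paraphrase of the documentation's notation):
\[
\sum_{i=1}^{K} \mathbf{D}_i' \, \mathbf{V}_i^{-1} \left( \mathbf{Y}_i - \boldsymbol{\mu}_i \right) = \mathbf{0},
\qquad
\mathbf{D}_i = \frac{\partial \boldsymbol{\mu}_i}{\partial \boldsymbol{\beta}},
\]
where Y_i and mu_i each have one element per observation in cluster i, so a larger cluster simply contributes more terms to the sum rather than receiving an explicit weight.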
12-16-2021
03:52 PM
2 Likes
The table name for the missingness model parameter estimates is MissModelPEst; they are not included in the response model parameter estimates table (the GEEEmpPEst table).
Note that you can always use the ODS TRACE statement to request a record of each output object that's produced. That information is written to the SAS log and includes the table name. Also, at the end of the Details section for each SAS/STAT procedure there is an "ODS Table Names" section that summarizes the table names and the statements/options that request them. Here is that section for PROC GEE.
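A minimal sketch of using ODS TRACE and capturing those two tables is below; the model shown is only a placeholder, not the model from the original question.
ods trace on;                                 /* table names are written to the SAS log */
proc gee data=mydata;
   class id visit x;
   model y = x visit / dist=bin;
   repeated subject=id / within=visit;
   missmodel x visit;                         /* missingness model for the weighted GEE analysis */
   ods output GEEEmpPEst=RespEst MissModelPEst=MissEst;
run;
ods trace off;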