About kc

kc · ‎07-21-2022

Will do.

kc · ‎07-21-2022

I am simulating survival data to mimic an existing dataset. The details of the original dataset are as follows:

1. 2 treatment groups.
2. Patient follow-up until at least 5 years.
3. Event rate is around 50% in both groups at the end of 5 years.
4. An additional 10% (trt 1) and 25% (trt 2) of patients dropped out of the study before completing the full 5-year follow-up.

I am using the Weibull Shape (1.0147) and Weibull Scale (6.4465) parameters from the SAS output of proc lifereg, run separately by group, to simulate the data. The survival time (in years) is capped at 5 years before running the procedure. The proc lifereg code is as follows:

    proc lifereg data=surv;
      where group=1;
      model surv_time_years*event(0) = / dist=Weibull;
    run;

I am using the following line of code to simulate data (with the same number of patients) for trt 1:

    Time=rand('Weibull', 1.0147, 6.4465);

Although the overall mean time of the simulated data (3.46) vs. the original data (3.45) for trt 1 is almost the same, the issue is that the simulation overestimates the number of patients completing 5 years. Around 10% more patients in the simulated dataset have time >= 5y, with some really extreme values not seen in the original dataset. As a consequence, the number of patients completing 1, 2, 3, 4 and 5 years differs completely between the simulated and original data.

I followed other posts on the forum and tried removing the censoring variable 'event(0)' from the model statement when estimating the Weibull parameters, as suggested in one post, but with no luck. I also have Rick Wicklin's 'Simulating Data with SAS' text and have gone through the relevant sections on simulating survival data. I tried using other distributions (comparing model fit using AIC/BIC from proc lifereg) to simulate the data, with similar or even worse results.

I understand this is a simulation and some amount of variation is expected, but I feel I am missing something here. Any help is appreciated.
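Under the fitted Weibull, P(T > 5) = exp(-(5/6.4465)^1.0147), which is about 0.46, so uncensored draws from rand('Weibull', ...) put roughly 46% of patients beyond 5 years and apply neither the 5-year cap nor any dropout. A minimal sketch of one way to add both is below. The sample size, the seed, the exponential form of the dropout time and the dropout hazard are all assumptions made for illustration; they are not taken from the original data and the hazard would need tuning to reproduce the observed dropout percentage.

```sas
/* Sketch only: Weibull event times plus dropout and a 5-year administrative cap. */
%let n            = 1000;  /* hypothetical number of patients in trt 1          */
%let dropout_rate = 0.03;  /* hypothetical dropout hazard per year (to be tuned) */

data sim_trt1;
  call streaminit(12345);                              /* arbitrary seed */
  do pt = 1 to &n;
    t_event   = rand('Weibull', 1.0147, 6.4465);       /* latent event time: shape, scale   */
    t_dropout = rand('Exponential') / &dropout_rate;   /* latent dropout time (assumed form) */
    surv_time_years = min(t_event, t_dropout, 5);      /* observed time, capped at 5 years   */
    event = (t_event <= min(t_dropout, 5));            /* 1 = event observed, 0 = censored   */
    output;
  end;
run;

/* Compare the simulated event/censoring split with the original data. */
proc freq data=sim_trt1;
  tables event;
run;
```

Comparing the distribution of surv_time_years and the event/censoring split against the original dataset is a more direct check than comparing mean times, because the cap and the dropout both change who is counted as completing 5 years.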

kc · ‎05-07-2022

I am trying to simulate monotone and non-monotone missing data patterns on a repeated-measures longitudinal dataset (with no missing values to begin with). For simplicity, let's assume 1000 observations (one record per subject), a group variable with 2 levels, and 4 repeated measures of an outcome variable.

https://blogs.sas.com/content/iml/2016/10/26/patterns-of-missing-data.html

I believe it is easy to simulate a non-monotone missing data pattern using the example in the link above. To create monotone patterns, you can specify the patterns you need using a zero-one matrix and use either proc iml or data steps to get the end result.

I have the following questions:

1. It would be helpful if someone could share a worked example/code that creates both monotone and non-monotone missing data patterns, and perhaps allows you to control the amount/rate of missing data for these different patterns.
2. Can we use a propensity-score-based model to create a monotone missing data pattern for a dataset? Any links/articles/code would be helpful.

Thanks!
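A minimal data step sketch of the two patterns described above. Everything here is an assumption for illustration: the complete dataset is generated on the spot, the variable names (id, group, y1-y4) are hypothetical, and the missingness is completely at random with placeholder probabilities. In the monotone version, once a subject drops out every later measure is set to missing; in the non-monotone version each value is deleted independently.

```sas
%let p_drop = 0.10;  /* hypothetical per-visit dropout probability (monotone)      */
%let p_miss = 0.10;  /* hypothetical per-value missing probability (non-monotone)  */

/* Hypothetical complete data: 1000 subjects, 2 groups, 4 repeated measures. */
data full;
  call streaminit(2022);
  array y[4] y1-y4;
  do id = 1 to 1000;
    group = 1 + (id > 500);
    do j = 1 to 4;
      y[j] = rand('Normal', 50, 10);
    end;
    output;
  end;
  drop j;
run;

/* Monotone: after the first missing visit, all later visits are missing. */
data monotone;
  set full;
  call streaminit(101);
  array y[4] y1-y4;
  dropped = 0;
  do j = 2 to 4;                       /* y1 is always observed in this sketch */
    if not dropped then dropped = rand('Bernoulli', &p_drop);
    if dropped then y[j] = .;
  end;
  drop j dropped;
run;

/* Non-monotone: each value is set to missing independently. */
data nonmonotone;
  set full;
  call streaminit(202);
  array y[4] y1-y4;
  do j = 1 to 4;
    if rand('Bernoulli', &p_miss) then y[j] = .;
  end;
  drop j;
run;

/* Display the resulting missing data patterns without imputing anything. */
proc mi data=monotone nimpute=0;
  var y1-y4;
run;
```

Replacing the constant &p_drop with a probability computed from covariates (for example, a logistic function of group and the previous measure) would be one way to make the dropout depend on a model rather than on a fixed rate.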

kc · ‎04-18-2022

@SteveDenham: Thanks for your quick responses and suggestions!

kc · ‎04-18-2022

@StatsMan: Thanks for sharing the paper. I was able to simulate data for TYPE=VC. I will report back after implementing data simulation using TYPE=UN. One last issue is that I still see values only for the 1st patient in the G matrix when using TYPE=UN, output via ODS statement. Hope I am not missing something here conceptually.

kc · ‎04-18-2022

@StatsMan: Yes, your correction about 'TIME' is the right interpretation of the model specified. Is there a way to output G (and/or V) for all patients (depending on the covariance structure, of course) into a dataset for post-processing? Thanks!

kc · ‎04-18-2022

Estimated G and GCORR matrices for the "VC" and "UN" covariance structures from the SAS output:

VC:

Estimated G Matrix
Row  Effect     Subject  Col1     Col2
1    Intercept  1        265.65
2    time       1                 0.05513

Estimated G Correlation Matrix
Row  Effect     Subject  Col1     Col2
1    Intercept  1        1.0000
2    time       1                 1.0000

UN:

Estimated G Matrix
Row  Effect     Subject  Col1     Col2
1    Intercept  1        308.72   -1.8769
2    time       1        -1.8769  0.08507

Estimated G Correlation Matrix
Row  Effect     Subject  Col1     Col2
1    Intercept  1        1.0000   -0.3662
2    time       1        -0.3662  1.0000

Just to reiterate, the goal is to simulate some "dummy" data based on the estimated parameters from the mixed model specified. Thanks.

kc · ‎04-17-2022

I need help constructing a G matrix for the mixed model below. I am using the example in section 12.3.3 of the "Simulating Data with SAS" textbook as a reference so I can simulate a dataset. The mixed model is as follows:

    proc mixed data=sample method=reml;
      class trt pt;
      model followup = baseline trt time time2 time*trt time2*trt / solution outpm=outpm;
      random intercept time / subject=pt;
      ods select CovParms SolutionF;
      ods output CovParms=CovParms SolutionF=SolutionF;
    run;

Trt has 2 levels, A and B. Time is numeric, has a range of 1-7, and is included as both a fixed and a random effect. Time2 is quadratic time. Followup and baseline are scores at the follow-up timepoints and at study startup. Pt is just the ID number. I am using the default covariance structure, which is 'VC'. The covariance and fixed effects estimates are in the tables below.

Covariance Parameter Estimates
Cov Parm   Subject  Estimate
Intercept  pt       265.65
time       pt       0.05513
Residual            226.46

Solution for Fixed Effects
Effect      trt  Estimate   Standard Error  DF    t Value  Pr > |t|
Intercept        44.5923    2.1326          580   20.91    <.0001
Baseline         0.3187     0.03407         1349  9.35     <.0001
trt         A    9.4363     1.8590          1349  5.08     <.0001
trt         B    0          .               .     .        .
time             0.5883     0.09601         508   6.13     <.0001
time2            -0.01128   0.001628        1349  -6.93    <.0001
time*trt    A    -0.5680    0.1281          1349  -4.43    <.0001
time*trt    B    0          .               .     .        .
time2*trt   A    0.007973   0.002190        1349  3.64     0.0003
time2*trt   B    0          .               .     .        .

Based on the model above, the design matrix X has 10 columns and Z has 2 columns. I need some clarity on how to construct the G matrix (number of rows/columns and the values) when the covariance structure is VC. Please also comment on how G changes when the structure is "UN" (unstructured) or CS (compound symmetry).

UN:

Covariance Parameter Estimates
Cov Parm  Subject  Estimate
UN(1,1)   pt       308.72
UN(2,1)   pt       -1.8769
UN(2,2)   pt       0.08507
Residual           220.02

CS:

Covariance Parameter Estimates
Cov Parm  Subject  Estimate
Variance  pt       225.59
CS        pt       43.1601
Residual           247.97
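A minimal SAS/IML sketch of how the per-subject G matrix could be written down from the VC and UN estimates above, and how it enters a simulation. With "random intercept time / subject=pt", each subject has two random effects, so the per-subject G is 2 x 2: diagonal under VC, and filled from UN(1,1), UN(2,1), UN(2,2) under UN. The seed, the number of subjects and the choice of time points 1 to 7 for every subject are assumptions for illustration, and only the random part (Z*u + e) is generated here; the fixed part X*beta would be added separately.

```sas
proc iml;
  call randseed(4321);                        /* arbitrary seed */

  /* Per-subject G matrices built from the CovParms estimates. */
  G_vc = {265.65 0,
          0      0.05513};
  G_un = {308.72  -1.8769,
          -1.8769 0.08507};
  sigma2 = 220.02;                            /* residual variance from the UN fit */

  nPt  = 5;                                   /* hypothetical number of subjects   */
  time = T(1:7);                              /* assumed visit times 1..7          */
  Z    = j(nrow(time), 1, 1) || time;         /* 7 x 2: intercept and time columns */

  /* One row of random effects (intercept, slope) per subject. */
  u = randnormal(nPt, {0 0}, G_un);

  /* Marginal covariance of one subject's 7 measurements. */
  V = Z * G_un * T(Z) + sigma2 * I(nrow(time));

  /* Random part of the response for the first subject: Z*u + e. */
  e = j(nrow(time), 1, .);
  call randgen(e, "Normal", 0, sqrt(sigma2));
  y_random = Z * T(u[1, ]) + e;

  print G_vc, G_un, V, y_random;
quit;
```

Because the subjects are independent, the G for the whole dataset is block diagonal with one copy of the 2 x 2 matrix per subject, which is why drawing one row of u per subject is sufficient.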

kc · ‎09-07-2021

I am reporting the absolute risk difference and CIs from proc freq below. I am wondering which CIs and p-value should be reported in the results. There are 2 treatment groups and both Ns > 100, but the events of interest (=1) are < 5.

    proc freq data=data;
      table trt*event / chisq riskdiff(cl=(newcombe exact wald)) pdiff;
    run;

kc · ‎01-29-2021

Hello, I am trying to see the impact of treatment group on an event using a repeat event analysis by creating a counting dataset. But I need the HRs in different time intervals (3 intervals) for the treatment group. After creating the start and end time variables in the dataset, I used the ‘end’ time variable to create ‘Time_Interval’ (code below). The issue is that although the HRs for time intervals 2 and 3 make sense clinically, the HR for interval 1 is way off base. My question is whether the ‘Time_Interval’ calculation should be based on the ‘end’ time variable in the first place. If not, can someone point me in the right direction on how to accomplish this task? Thanks!

Code:

    length Time_Interval $6;  /* without this, the first assignment fixes the length at 5 and '30D-1Y' is truncated */
    if 0 <= end <= 30 then Time_Interval='0-30D';
    if 30 < end <= 365 then Time_Interval='30D-1Y';
    if 365 < end <= 1825 then Time_Interval='1-5Y';

    proc phreg data=counting_dataset covs(aggregate) covm;
      class group (ref='Trt2') Time_Interval;
      model (start, end)*event_flag(0 2) = group Time_Interval group*Time_Interval / ties=breslow;
      id Subject;
      hazardratio 'HR Trt1 vs. Trt2' group;
    run;
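Classifying a whole (start, end] record by its end time assigns all of that record's at-risk time to a single interval, even when the record spans a cut point. One common alternative is to split each record at the cut points first, so that each piece lies entirely inside one interval. A minimal sketch is below, assuming the variable names from the code above, time measured in days, and no record extending past day 1825; the output dataset name is hypothetical.

```sas
/* Sketch: split each counting-process record at days 30 and 365. */
data counting_split;
  set counting_dataset;
  length Time_Interval $6;
  array cut[4]    _temporary_ (0 30 365 1825);
  array lab[3] $6 _temporary_ ('0-30D' '30D-1Y' '1-5Y');

  orig_start = start;
  orig_end   = end;
  orig_flag  = event_flag;

  do k = 1 to 3;
    lo = max(orig_start, cut[k]);
    hi = min(orig_end,   cut[k+1]);
    if lo < hi then do;                      /* record overlaps interval k */
      start = lo;
      end   = hi;
      Time_Interval = lab[k];
      /* The original status belongs to the piece that contains the original
         end time; earlier pieces are censored (0) at the cut point. */
      if hi = orig_end then event_flag = orig_flag;
      else event_flag = 0;
      output;
    end;
  end;
  drop k lo hi orig_:;
run;
```

The split dataset can then be passed to the same proc phreg call in place of counting_dataset, with Time_Interval now describing where the at-risk time actually falls rather than where the record happens to end.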

kc · ‎11-23-2020

Hi Steve,

Thank you for your response, but using the OUTSURV dataset gives CL for survival estimates, not failure. I believe using 'plots=survival(failure cl)' in the lifetest statement and 'failureplot=want' in the ods output statement does give CL for failure, but the issue is interpretation.

Here is a sample row from the 'want' dataset. I have split one row from the dataset into two rows here for better reading.

STRATUM  Time     SDF_LCL  SDF_UCL  Survival  AtRisk  Event  Censored
1        0.72279  0.97752  0.99270  0.98717   918     1      .

StratumNum  _1_SDF_LCL_  _1_SDF_UCL_  _1_SURVIVAL_  _1_CENSORED_
1           0.022476     .007304630   0.012827      .

I believe the values in _1_SDF_LCL_ and _1_SDF_UCL_ represent the 95% CL for failure rates, but the labels are switched (because _1_SDF_LCL_ > _1_SDF_UCL_)?

Question: I am not sure why this is happening. Is it safe to assume _1_SDF_LCL_ is indeed the UCL for the failure rate and _1_SDF_UCL_ is in fact the LCL?
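In the sample row above, each plotted value is one minus the corresponding survival quantity (1 - 0.97752 = 0.02248, 1 - 0.99270 = 0.00730, 1 - 0.98717 = 0.01283), so the limits swap roles when moving from survival to failure. A minimal sketch that makes this explicit from an OUTSURV dataset is below. It uses the sashelp.BMT sample data (variables T, Status, Group) purely as a stand-in, and the output dataset and new variable names are hypothetical.

```sas
/* Sketch: derive failure estimates and their 95% limits from OUTSURV. */
proc lifetest data=sashelp.BMT outsurv=surv_est noprint;
  time T*Status(0);
  strata Group;
run;

data fail_est;
  set surv_est;
  failure  = 1 - survival;   /* failure probability                          */
  fail_lcl = 1 - sdf_ucl;    /* lower failure limit comes from upper survival */
  fail_ucl = 1 - sdf_lcl;    /* upper failure limit comes from lower survival */
run;

proc print data=fail_est(obs=10);
  var Group T survival sdf_lcl sdf_ucl failure fail_lcl fail_ucl;
run;
```

Because 1 - x is a decreasing function, complementing the survival limits in this way keeps the same confidence level whatever CONFTYPE was used for the survival interval.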

kc · ‎11-22-2020

I would like to get 95% confidence interval for failure rate - an example would be perfect! Thanks.

kc · ‎07-30-2020

Hello, I get a warning in SAS saying "WARNING: Scaling the covariance is not available for zero-inflated models." when I use the scale=d option in the model statement of proc genmod. Any idea how to handle overdispersion if the scale option isn't available for these models? Perhaps by using some other procedure to run ZIP or ZINB models? Ideas welcome. Thanks in advance.

kc · ‎06-12-2020

I am running a growth curve model for longitudinal data analysis using proc mixed on a dataset with 2 treatment groups. I have estimate statements to calculate the individual means and the difference between the means of the 2 treatment groups at the 6-month follow-up. I am running into an issue where the difference calculated by the estimate statement is not the same as the actual difference between the individual means.

The sample code is below. trt has values A and B; fup_scale and baseline_scale are continuous variables; miss and dth are binary variables with values 1 and 0; month specifies the follow-up month; sq_month is just a quadratic term for month. I have assigned treatment-specific weights to the baseline scale, miss and dth variables in the estimate statements when calculating the individual means, and overall weights for miss and dth in the estimate statement for the difference. I am not able to figure out what's missing or incorrect in these statements. Any help is appreciated!

    proc mixed data=gcm method=ml covtest noitprint noclprint;
      class trt pt;
      model fup_scale = baseline_scale trt month sq_month miss dth
                        trt*month trt*sq_month trt*miss trt*dth
                        trt*miss*month trt*dth*month
                        miss*month miss*sq_month dth*month dth*sq_month / solution;
      random intercept month / sub=pt type=un G Gcorr;

      *averaged estimates*;
      estimate 'avg mean at 6m - trt A'
        intercept 1 baseline_scale 57.66 trt 1 0 month 6 sq_month 36
        miss 0.20 dth 0.30
        trt*month 6 0 trt*sq_month 36 0
        trt*miss 0.20 0 trt*dth 0.30 0
        trt*miss*month 1.2 0 trt*dth*month 1.8 0
        miss*month 1.2 miss*sq_month 7.2
        dth*month 1.8 dth*sq_month 10.8 / cl;

      estimate 'avg mean at 6m - trt B'
        intercept 1 baseline_scale 57.19 trt 0 1 month 6 sq_month 36
        miss 0.25 dth 0.35
        trt*month 0 6 trt*sq_month 0 36
        trt*miss 0 0.25 trt*dth 0 0.35
        trt*miss*month 0 1.5 trt*dth*month 0 2.1
        miss*month 1.5 miss*sq_month 9
        dth*month 2.1 dth*sq_month 12.6 / cl;

      estimate 'avg Difference at 6m: A - B'
        trt 1 -1
        trt*month 6 -6 trt*sq_month 36 -36
        trt*miss 0.225 -0.225 trt*dth 0.325 -0.325
        trt*miss*month 1.35 -1.35 trt*dth*month 1.95 -1.95 / cl;
    run;
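One way to see where the two calculations part ways is to subtract the coefficient vectors of the two mean statements term by term and set the result beside the coefficients used in the difference statement. The table below is plain arithmetic on the numbers in the three estimate statements; the symbols \ell_A and \ell_B for the coefficient vectors of the trt A and trt B mean statements are notation introduced here. Effects not listed (trt, trt*month, trt*sq_month) agree, and intercept, month and sq_month cancel.

```latex
\begin{array}{lll}
\text{Effect} & \ell_A - \ell_B & \text{'avg Difference' statement} \\
\hline
\texttt{baseline\_scale}  & 57.66 - 57.19 = 0.47 & 0 \\
\texttt{miss}             & 0.20 - 0.25 = -0.05  & 0 \\
\texttt{dth}              & 0.30 - 0.35 = -0.05  & 0 \\
\texttt{miss*month}       & 1.2 - 1.5 = -0.3     & 0 \\
\texttt{miss*sq\_month}   & 7.2 - 9 = -1.8       & 0 \\
\texttt{dth*month}        & 1.8 - 2.1 = -0.3     & 0 \\
\texttt{dth*sq\_month}    & 10.8 - 12.6 = -1.8   & 0 \\
\texttt{trt*miss}         & (0.20,\ -0.25)       & (0.225,\ -0.225) \\
\texttt{trt*dth}          & (0.30,\ -0.35)       & (0.325,\ -0.325) \\
\texttt{trt*miss*month}   & (1.2,\ -1.5)         & (1.35,\ -1.35) \\
\texttt{trt*dth*month}    & (1.8,\ -2.1)         & (1.95,\ -1.95)
\end{array}
```

Since an estimate is a linear combination of the fixed effects, the difference statement reproduces the difference of the two means only when its coefficients equal \ell_A - \ell_B. That happens either by using the middle column above, or by giving both mean statements the same baseline_scale, miss and dth weights so that the extra terms cancel.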

kc · ‎08-03-2018

Thank you! Will update the post after working through the code.

Online Status	Offline
Date Last Visited	‎09-26-2024 03:14 PM

Re: Trend test for multinomial variable

Re: Trend test for multinomial variable

Re: Trend test for multinomial variable

Re: Trend test for multinomial variable

Re: Trend test for multinomial variable

Trend test for multinomial variable

Re: Test Proportional Hazards Assumption in Recurrent Event Model

Re: Test Proportional Hazards Assumption in Recurrent Event Model

Test Proportional Hazards Assumption in Recurrent Event Model

at-risk table in proc phreg - recurrent event analysis

Re: proc nlmixed - test if the interaction term is significant at a sp...

Re: Simulating Survival Data - issue with overestimation

Re: Proc IML - evaluating multiple conditions

Beta - Binomial Bayesian Model

Re: proc sgplot - axis scale and tick marks

Re: Simulating Survival Data - issue with overestimation

Simulating Survival Data - issue with overestimation

Produce monotone and non-monotone missing data patterns

Re: Construct G matrix for a mixed model with random effects

Re: Construct G matrix for a mixed model with random effects

Re: Construct G matrix for a mixed model with random effects

Re: Construct G matrix for a mixed model with random effects

Construct G matrix for a mixed model with random effects

absolute risk difference CIs and pvalue

Hazard Ratio for different time intervals

Re: proc lifetest - 95% CI for failure rate

proc lifetest - 95% CI for failure rate

scale=deviance option in proc genmod for zero-inflated poisson and zer...

proc mixed estimate statement

Re: Piecewise Linear Regression