SAS Support Communities

Lindy · ‎08-11-2019

Could anyone please provide the SAS code I should use for this task? The rate of a disease in the general population is 50%, and I am giving a randomly selected group of people a vitamin to see whether it lowers the disease rate in this treatment group. I want to be able to detect a significant effect if the disease rate in my treatment group is 40%, at alpha = .05 and power = .80. What is the minimum sample size I need? Thank you so much!
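A minimal sketch of how this question could be set up in PROC POWER. It assumes a one-sample test of the treatment-group proportion against the fixed population rate of 50% (the post does not name the test) and a normal-approximation z test. Both one-sided and two-sided versions are requested because the post does not say which is intended.

```sas
proc power;
   onesamplefreq test=z method=normal
      nullproportion = 0.5    /* disease rate in the general population */
      proportion     = 0.4    /* rate to detect in the treatment group  */
      sides          = 1 2    /* assumption: compare one- and two-sided */
      alpha          = 0.05
      power          = 0.80
      ntotal         = .;     /* solve for the total sample size        */
run;
```

If the comparison group were sampled alongside the treatment group rather than taken as a known 50%, the TWOSAMPLEFREQ statement would be the analogous starting point.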

Lindy · ‎08-04-2019

Thank you so much for your help! I am quite an amateur at using SAS for power analysis. If it is convenient, could you please provide the few lines of SAS code I should use for this task? The rate of a disease in the general population is 50%, and I am giving a randomly selected group of people a vitamin to see whether it lowers the disease rate in this treatment group. I want to be able to detect a significant effect if the disease rate in my treatment group is 40%, at alpha = .05 and power = .80. What is the minimum sample size I need? Thank you so much!

Lindy · ‎08-04-2019

Hi there, I used R to get the sample size I need for a logistic regression study.

    > wp.logistic(n = NULL, p0 = 0.5, p1 = 0.6, alpha = 0.05,
    +     power = 0.8, family = "normal", parameter = c(0,1))

R said I need a sample of N=214 to detect a significant difference of .1 (p2=.6 and p1=.5, p2-p1=.1) at alpha=.05 and power=.8. However, when I use SAS to find the sample size with the covariates I would like to include, it gives me smaller sample sizes (N=168, N=175) than N=214 for the same alpha=.05 and power=.8. Can anyone help me figure out what is going on? R says N should be at least 214 even without covariates, so I would expect the required sample size with covariates to be at least N=214.

    proc power;
       logistic
          vardist("recidivism") = BINOMIAL (0.5, 1)
          vardist("gender") = BINOMIAL (0.5, 1)
          vardist("self-control") = ordinal((2 4 6) : (0.4 0.4 0.2))
          vardist("age") = normal(24, 6)
          vardist("income") = normal(2400, 600)
          testpredictor = "recidivism"
          covariates = "gender"|"self-control" "age" "income"
          responseprob = 0.5 0.6 .07
          testoddsratio = 2.4
          alpha = 0.05
          power = 0.8
          ntotal = .;
    run;
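One thing that may be worth checking when comparing the two programs is whether they are given the same effect size. The R call's p0 = 0.5 and p1 = 0.6 correspond to an odds ratio of (0.6/0.4)/(0.5/0.5) = 1.5, while the PROC POWER call above specifies testoddsratio = 2.4, a larger effect that would be expected to need fewer cases. The sketch below is an assumption about what a closer match might look like: a single standard-normal predictor (the hypothetical name "x"), no covariates, and an odds ratio of 1.5 per unit. The two programs may use different computational methods, so the results need not agree exactly.

```sas
/* Sketch only: one standard-normal predictor and no covariates,   */
/* set up to resemble the R call (p0 = 0.5 at x = 0, OR = 1.5).    */
proc power;
   logistic
      vardist("x")  = normal(0, 1)   /* hypothetical predictor name  */
      testpredictor = "x"
      responseprob  = 0.5            /* response probability at the mean of x */
      testoddsratio = 1.5            /* (0.6/0.4) / (0.5/0.5)        */
      alpha         = 0.05
      power         = 0.8
      ntotal        = .;
run;
```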

Lindy · ‎06-28-2018

Really appreciate your help, Daniel! It is important to know that researchers should not parcel factor analysis and SEM into two steps. --Lindy

Lindy · ‎06-07-2018

Thank you, Daniel! I am using a two-step approach. I am fitting a cross-lagged model using SEM with three waves of data. I first created factors based on CFA and then entered these factors into the SEM. The model is complicated, and using the items directly as indicators in this three-wave cross-lagged model is very clunky. I know we can use proc MI and proc mianalyze with SEM to deal with missing values, so it is not a big issue if some factors in the model have missing values. But proc mianalyze will NEVER give us model fit indices... For example, I have data on 500 cases, but because of missing values only 145 cases are used in the SEM if I do not use imputation. The output based on the 145 cases gives me model fit indices such as RMSEA, AIC, chi-square, etc. But when I use proc MI and proc mianalyze, the sample size is indeed 500, yet there are no fit indices in the output. I don't think it is proper to use the fit indices from the 145-case output. What could be a solution?
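One option that may be worth considering here is full information maximum likelihood (METHOD=FIML in PROC CALIS), which fits the model to raw data with incomplete observations and reports a fit summary, so no separate imputation and pooling step is involved. This is a swapped-in technique, not the two-step approach described above. A minimal sketch of a two-factor measurement model, with a hypothetical data set (work.waves) and hypothetical item and factor names:

```sas
/* Sketch only: FIML needs raw data (not a covariance matrix) as input. */
proc calis data=work.waves method=fiml;
   factor
      f1 ===> x1-x3,    /* hypothetical items loading on factor f1 */
      f2 ===> y1-y3;    /* hypothetical items loading on factor f2 */
   pvar
      f1 = 1.,          /* fix factor variances to set the scale   */
      f2 = 1.;
run;
```

A cross-lagged structure would need the structural paths added on top of this measurement part, for example with the PATH modeling language.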

Lindy · ‎05-14-2018

I used the adjust= option. This is my code:

    proc lifetest data=roads.tepfulldata plots=survival(atrisk);
       time survivaltime*censor_1(1);
       strata prior_group / test=logrank adjust=sidak;
    run;

But this code gives me output comparing every pair of the 16 categories of the prior_group variable (a categorical variable with 16 values). So in general this code can only tell me whether each pair of groups differs, not whether several groups do NOT differ in survival risk and can therefore be lumped together as one group.

Lindy · ‎05-13-2018

Hi folks, I have a question. I know people can use multiple comparison methods such as Hsu's MCB to find out, among several groups, which pair of groups has the biggest contrast in treatment means and which groups do not differ in their mean treatment effects. Now I have data for a survival analysis with several groups, and I want to compare their survival curves over the observation time using something like MCB to see which groups differ and which do not. Is that possible? I know that with Kaplan-Meier curves we can only compare groups two at a time to see whether they differ in survival risk. But I hope to have something like Hsu's MCB so that I can lump the groups that do not differ in survival risk together into one group.

Lindy · ‎05-12-2018

Hi folks, I am running proc calis with several latent factors, and I have a problem. I noticed that when I create the latent factors using proc factor, cases with missing values on the items behind a factor cannot be used. Originally there were 260 cases; after creating some latent variables from items with missing values, only 150 cases can be used in the SEM because the latent variables have missing values. MY QUESTION: Should I impute missing values on a latent factor (after I create it) before running the SEM in proc calis? If so, how do I impute missing values of a latent factor? Or should I first impute the values of the items and then use proc factor on the imputed items, so that the resulting factor has no missing values? Thank you!

Lindy · ‎05-06-2018

Hi folks, I am lost trying to set a baseline reference group for a categorical variable in my parametric survival model. The priorarrests variable has 6 categories, coded 1 to 6. I want to use 2, the second group, as my baseline. This is my code:

    proc lifereg data= roads.onlyrecordslessthan12 outest=weiboutest;
       class race_group gender_group marital_group famprb0new priorarrests (ref="2");
       model survivaltime*censor_1(1) =priorarrests impulsivetemperment school0 age
             gender_group race_group famprb0new marital_group/distribution=lnormal;
       title 'Weibull regression for the data';
       output out=weibsurv xbeta=weib_xb;
    run;

The log shows:

    10   proc lifereg data= roads.onlyrecordslessthan12 outest=weiboutest;
    11   class race_group gender_group marital_group famprb0new priorarrests (ref="2");
                                                                             -
                                                                             22
                                                                             200
    ERROR 22-322: Syntax error, expecting one of the following: a name, ;, -, /, :, _ALL_, _CHARACTER_, _CHAR_, _NUMERIC_.
    ERROR 200-322: The symbol is not recognized and will be ignored.
    12   model survivaltime*censor_1(1) =priorarrests impulsivetemperment school0 age
    13   gender_group race_group famprb0new marital_group/distribution=lnormal;
    14   title 'Weibull regression for the data';
    15   output out=weibsurv xbeta=weib_xb;
    16   run;

I also tried it without quotation marks, as priorarrests (ref=2); but the log still gives an error message:

    17   proc lifereg data= roads.onlyrecordslessthan12 outest=weiboutest;
    18   class race_group gender_group marital_group famprb0new priorarrests (ref=2);
                                                                             -
                                                                             22
                                                                             200
    ERROR 22-322: Syntax error, expecting one of the following: a name, ;, -, /, :, _ALL_, _CHARACTER_, _CHAR_, _NUMERIC_.
    ERROR 200-322: The symbol is not recognized and will be ignored.
    19   model survivaltime*censor_1(1) =priorarrests impulsivetemperment school0 age
    20   gender_group race_group famprb0new marital_group/distribution=lnormal;
    21   title 'Weibull regression for the data';
    22   output out=weibsurv xbeta=weib_xb;
    23   run;

Could anyone help me? Thank you!
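The syntax error suggests that the CLASS statement in PROC LIFEREG, at least in the release that produced this log, does not accept a (ref=) option. PROC LIFEREG treats the last level of a CLASS variable as the reference, so one possible workaround is to recode the variable so that the desired baseline sorts last. A sketch, where the new variable name priorarrests_r, the data set work.recoded, and the code 9 are all hypothetical choices:

```sas
data work.recoded;
   set roads.onlyrecordslessthan12;
   /* Move level 2 to the highest code so that it sorts last */
   /* and is therefore used as the reference level.          */
   if priorarrests = 2 then priorarrests_r = 9;
   else priorarrests_r = priorarrests;
run;

proc lifereg data=work.recoded outest=weiboutest;
   class race_group gender_group marital_group famprb0new priorarrests_r;
   model survivaltime*censor_1(1) = priorarrests_r impulsivetemperment school0 age
         gender_group race_group famprb0new marital_group / distribution=lnormal;
run;
```

The estimates for the remaining levels of priorarrests_r would then be contrasts against the original level 2.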

Lindy · ‎05-01-2018

Hi folks, I created several latent factors using proc factor. The items under each factor load well, and I computed each composite from the factor scores. However, now that I am about to use the factors as independent variables in a model and need to report descriptive statistics, I find that the standard deviations of the factors are generally bigger than their means. Usually people see a red flag when the S.D. is bigger than the mean... But I think this rule may not apply to factors built from factor scores? How can I explain it?

    Mean          Std Dev      Minimum       Maximum
     0.158052     0.9290338    -3.4236783    1.781888
    -0.0573974    0.9146756    -0.6069258    5.511616
    -0.1035039    0.8218769    -0.3771253    9.7391213
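Factor scores are centered close to zero by construction, so a standard deviation larger than the mean is expected, and a rule of thumb meant for positive-valued measures does not carry over. The sketch below uses simulated data, not the poster's factors, to show the same pattern with a standardized variable; the data set and variable names are hypothetical.

```sas
/* Simulate a standardized score: mean near 0, SD near 1. */
data work.demo;
   call streaminit(2018);
   do id = 1 to 1000;
      score = rand("normal", 0, 1);
      output;
   end;
run;

/* The SD is far larger than the mean simply because the mean is near 0. */
proc means data=work.demo n mean std min max;
   var score;
run;
```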

Lindy · ‎05-01-2018

Thank you for your reply! I tried the univariate approach (ANOVA) and it worked. But I can't help wondering whether people like me who use complex survey data face a dilemma: if we have two correlated DVs, we would want to use multivariate regression, but it seems we cannot do that in SAS... right? And what about Poisson or logistic regression? Maybe the survey statements do not work with them either. I only know that the survey statements work with proc reg and proc phreg.

Lindy · ‎04-30-2018

Hi all, can anyone help me with a question about using complex survey data? I know that we need to use the strata and cluster statements in order to analyze weighted data. But I could not find any example where these statements are used with manova... If I write my model like this, it will run, but I can only have one dependent variable.

    proc surveyreg data = wave1.cro_section_neededIV;
       cluster psuscid;
       strata region;
       weight gswgt1;
       class BIO_SEX;
       model nonagressivedelin =BIO_SEX parentattchemnt schoolattachment delinqpeerw1
             parentalcontrol BIO_SEX*cparentattchemnt BIO_SEX*cparentalcontrol
             age eduw1/ solution ;
    run;

If I add " manova h= _ALL_ ;" after the model statement, it will not run... So does that mean there is no way to use a multivariate method with weighted data? I also tried proc glm, but it does not work with the strata statement either. Only proc reg is allowed to go with "survey" and the strata statement. WHY?

Lindy · ‎04-29-2018

Thank you! Will check out PLS! Good luck!

Lindy · ‎04-29-2018

Thank you so much for your insights, Paige! I am not using PCA to predict Y. My plan is this: I want to have a latent variable called "delinquency propensity" as Y in my model, and I have several independent variables such as parenting style, children's school scores, etc. The majority of the independent variables are scale variables. Because there is no item in my data called "delinquency propensity", I used several items from the data that ask about the frequency of using a weapon, fighting, truancy, etc. Using proc princomp, I found that these items fall under 1 latent factor, so I want to use prin1 from the output as "delinquency propensity", the Y in my model. As I posted before, this prin1 ranges from -.77 to more than 12. Based on this information, do you think an OLS model is a good option? Thank you very much!
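The scoring step described above might look roughly like the following sketch. The data set names (work.delinq, work.scores) and item names (weapon, fight, truancy) are hypothetical stand-ins for the survey items. OUT= writes the component scores (Prin1) to a data set, and the PROC UNIVARIATE step is one way to inspect how skewed Prin1 is before deciding on a model.

```sas
/* Keep only the first principal component and save its scores. */
proc princomp data=work.delinq out=work.scores n=1;
   var weapon fight truancy;   /* hypothetical delinquency items */
run;

/* Look at the distribution of the score before using it as an outcome. */
proc univariate data=work.scores;
   var Prin1;
   histogram Prin1;
run;
```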

Lindy · ‎04-29-2018

Thank you, Paige! I checked the frequency and distribution of prin1 and found no particular outlier. The range of prin1 is -.77 to 12.95. The sample has about 4000 cases, and the majority of respondents fall between -.77 and about 1 on prin1 (the delinquency score), but some respondents are spread fairly evenly over values from 1 to 12.95. In this case, should I go ahead and use prin1 as my outcome variable in OLS? Thank you! --lindy

Date Last Visited	‎08-17-2019 04:20 PM

power analysis: calculate the sample size needed

Re: weird results about sample size of logistic regression in power an...

weird results about sample size of logistic regression in power analys...

Re: imputing missing value in SEM proc calis

Re: imputing missing value in SEM proc calis

Re: multiple comparison for survival analysis data

multiple comparison for survival analysis data

imputing missing value in SEM proc calis

set reference group for categorical variable in proc lifereg

creating a factor using cfa

Re: multivariate method on complex data

multivariate method on complex data

Re: proc princomp

Re: proc princomp

Re: proc princomp
