About James18

James18 · ‎12-01-2022

I have one final question that may be difficult to answer. After fitting a model with a negative binomial distribution, I would like the DISTPLOT requested in the LSMEANS statement to display results in exponentiated form rather than on the log scale. Is there a way to do this within SAS with ODS Graphics?

Code:

Proc genmod data=work.example;
   class x_variable Interacting_variable;
   model sumlikert = x_variable Interacting_variable x_variable*Interacting_variable / cl dist=negbin link=log type3;
   LSMEANS x_variable*Interacting_variable / ilink cl plot=distplot;
run;
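One possible approach, offered as a sketch rather than a definitive answer: the MEANPLOT request in the LSMEANS statement accepts an ILINK suboption, which draws the LS-means and their confidence limits on the inverse-linked (count) scale. As far as I know, DISTPLOT is intended for Bayesian analyses that have posterior samples, so MEANPLOT is swapped in here. The dataset and variable names come from the post above; work.lsm is a hypothetical output name.

```sas
ods graphics on;

proc genmod data=work.example;
   class x_variable Interacting_variable;
   model sumlikert = x_variable Interacting_variable
                     x_variable*Interacting_variable
                     / cl dist=negbin link=log type3;
   /* ILINK on the statement adds inverse-linked estimates to the table; */
   /* ILINK inside MEANPLOT() draws the plot on that same (count) scale. */
   lsmeans x_variable*Interacting_variable
           / ilink cl plots=meanplot(ilink cl);
   /* work.lsm is a hypothetical name; it stores the LS-means table */
   ods output LSMeans=work.lsm;
run;

/* Inspect the saved table, which includes the inverse-linked estimates */
proc print data=work.lsm;
run;
```

The saved table could also be passed to PROC SGPLOT if a fully customized graph is needed.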

James18 · ‎11-30-2022

You deserve a raise, sir. Thank you! This will be very useful in the future. Additionally, I found an article arguing for the use of parametric tests even for count data that are not normally distributed, which bears on how these results could be interpreted (Proc GENMOD w/ normal distribution vs Proc GENMOD w/ negative binomial distribution). It is always difficult to determine which model fits best. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3886444/

James18 · ‎11-30-2022

You are correct! I made the mistake of referring to the response (dependent variable) as the predictor. Thank you for catching that. This helps tremendously. This is a difficult concept to grasp, but it makes more sense now. So, technically, would it be inappropriate to say a negative binomial distribution was used if link=identity was specified? A more accurate prediction would use link=log, and I assume ESTIMATE statements could be used to exponentiate comparisons between groups of the predictor (x_variable) to allow for easier interpretation. I added the estimate statement as:

   estimate 'X_variable' x_variable 1 0 / exp;

But I must have run into another problem, as the results are nonestimable.
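A possible explanation, offered as a sketch: under the default GLM parameterization, the coefficient for a single level of a CLASS variable is not estimable on its own, whereas a difference between two levels is. The example below assumes x_variable has exactly two levels and reuses the work.source, y_variable, and x_variable names from the PROC GENMOD code elsewhere in this thread. The labels are illustrative.

```sas
proc genmod data=work.source;
   /* With REF=, the reference level is ordered last */
   class x_variable (ref='reference_group');
   model y_variable = x_variable / cl dist=negbin link=log type3;

   /* Difference between the two levels on the log scale; */
   /* EXP reports it as a ratio of expected counts.       */
   estimate 'Level 1 vs reference' x_variable 1 -1 / exp;

   /* A single level's mean needs the intercept in the coefficients */
   estimate 'Level 1 mean' intercept 1 x_variable 1 0 / exp;
run;
```

If x_variable has more than two levels, each ESTIMATE statement needs one coefficient per level, in the order shown in the Class Level Information table.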

James18 · ‎11-30-2022

Quick update: I modeled the data using Poisson regression, which seemed appropriate, but there was one major concern. I had severe overdispersion because my mean for the predictor (approx. 10) was not even close to the variance (approx. 110). Because of this, I decided to use a negative binomial distribution as suggested. Additionally, I used the Pearson chi-square value to test whether this model was a good fit. The model was an excellent fit (p=0.988), but I am now puzzled by the interpretation of the data.

Question: In my Proc Genmod procedure, I used link=identity instead of the default link=log, as my predictor is continuous (but based upon count data). Would it be appropriate to interpret the results as I normally would because I used link=identity, or would the interpretation still need to be on the log scale because I used a negative binomial distribution? This may be a difficult question to answer with the limited information.

This was similar to the code I used, for anyone else who is interested:

proc genmod data=work.source;
   class x_variable (ref='reference_group');
   model y_variable = x_variable / cl dist=negbin link=identity type3;
run;

data test;
   pval = 1 - probchi(361, 425);
run;

James18 · ‎11-28-2022

Hello, I am familiar with coding in SAS OnDemand for Academics but am not familiar with RStudio. I eventually want to learn how to use both programs, but for now I want to show people how to look at data with the program that they are most familiar with. Is there a way to easily convert SAS code I have written (or simply run SAS code) in the RStudio environment?

James18 · ‎11-28-2022

Thank you! My predictor is right-skewed, and the sum doesn't get much larger than 70 across the 20 questions (each scaled 1 to 5 on a Likert scale), so a Poisson model seems reasonable.

James18 · ‎11-28-2022

I appreciate the feedback! I have a follow-up question. My outcome is based on a Likert scale from a questionnaire, and the values from each question were added together (what we call the sum likert score). Would it make sense in this situation to use Poisson regression within the Proc Genmod procedure since the outcome is technically discrete (cannot take on any negative integers and the highest value possible is established)? Any feedback would be appreciated!

James18 · ‎11-23-2022

Hello, I am working on variable selection using a purposeful modeling strategy (rather than stepwise) and could use some guidance on which procedure would best fit my dataset to produce accurate estimates. In addition, I am using an effect modifier in my dataset and am adding covariates to the model one by one to see whether each new covariate improves model fit (using the AIC or -2 log likelihood value).

Here is a little information about my data:

Predictor (y): count data (takes only non-negative integer values)
Main exposure: binary
Effect modifier: categorical
8 potential covariates (all categorical)

1) Is PROC HPGENSELECT the appropriate procedure if I only have around 350 observations? If not, should I use PROC GENMOD for variable selection instead?

2) If I can use the HPGENSELECT procedure, do I need to specify dist=poisson and link=identity to produce more accurate estimates? I ran 2 different models to see how the AIC score would change, and they were drastically different when I specified that the distribution is Poisson.

Model 1: AIC = 2778.45

Proc hpgenselect data=work.example;
   class X_variable Effect_modifier;
   model Y_variable = X_variable Effect_modifier X_Variable*Effect_modifier / cl;
run;

Model 2: AIC = 4650.43

Proc hpgenselect data=work.example;
   class X_variable Effect_modifier;
   model Y_variable = X_variable Effect_modifier X_Variable*Effect_modifier / cl dist=poisson link=identity;
run;

I would like to note that the predictor is not normally distributed (skewed right), but homoscedasticity and linearity are not violated.

3) Lastly, I originally specified the X_variable with a reference option (ref = XXX), but the estimates did not seem correct. Would it be more appropriate to leave the default option for class parameterization as GLM?

Thank you

James18 · ‎12-08-2020

I am having some trouble fully understanding how macro variables function. Suppose the following macro variable was submitted:

%LET NewVbl = 10 + 20;

Would the value of NewVbl be 30? I understand how macro variables function if they are a set value (such as %Let NewVbl = 30), so what would be the purpose of adding a calculation with a macro variable? Any information would be greatly appreciated!
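For reference, a minimal sketch of how this behaves: the macro facility stores text, so %LET does not evaluate the expression. The value stays as the characters 10 + 20 unless a function such as %EVAL is applied, or until the text is substituted into code that performs the arithmetic. The name total below is illustrative.

```sas
%let NewVbl = 10 + 20;

/* The macro variable holds text, so this writes: NewVbl = 10 + 20 */
%put NewVbl = &NewVbl;

/* %EVAL does integer arithmetic on that text, so this writes: Sum = 30 */
%put Sum = %eval(&NewVbl);

/* In a DATA step the text is substituted before compilation, */
/* so the assignment below becomes: total = 10 + 20;          */
data _null_;
   total = &NewVbl;
   put total=;   /* writes total=30 to the log */
run;
```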

James18 · ‎12-08-2020

Hi, I believe you are having this issue because you are keeping the variable 'Statistic' in your Data step, but in your Proc Report step you are referring to 'Test' when you actually mean 'Statistic'. If you would like the Statistic variable to be labeled Test, you could use a Define statement. I hope my code below helps!

PROC REPORT DATA = HypRslt.chisqresults;
   COLUMN Statistic ("Statistical Results" DF Value Prob);
   DEFINE Statistic / 'Test';
   DEFINE Value / Format= 4.2;
   DEFINE Prob / Format= 4.2 'P-Value';
RUN;

James18 · ‎12-08-2020

Hi, the code below could give you what you are trying to display. There is some unnecessary code that you can take out. My best guess as to why you did not get the results you were looking for is that the response was not set to ColPercent. I hope this helps!

PROC SGPLOT DATA = Work.whatever;
   VBAR StateCd / response= ColPercent GROUPDISPLAY=cluster DATALABEL = ColPercent;
   XAXIS LABEL = 'State Code' LABELATTRS = ( SIZE = 9 PT WEIGHT = BOLD COLOR = Black );
   YAXIS LABEL = 'Percent' GRID LABELATTRS = ( SIZE = 9 PT WEIGHT = BOLD COLOR = Black );
RUN;
QUIT;

James18 · ‎12-01-2020

Hi everyone, I am trying to use a PROC PRINT step to display only the subjects who died from a stroke from my data set. I am already satisfied with almost every part of the output, but I would like to remove a line above my table where it says "Cause of Death Code = Stroke" just below my Title. This is the code I have been working with:

PROC SORT DATA = Hypdt.Stroke OUT = WORK.hypdta_ST;
   BY DESCENDING CODCd SSN;
RUN;

TITLE1 "Subjects Who Died from #BYVAL(CODCd)";
PROC PRINT data= WORK.hypdta_ST SPLIT = '*' NOOBS;
   By DESCENDING CODCd;
   PAGEBY CODCd;
   ID SSN;
   VAR WtLb BMI SBP;
   LABEL SSN = '* Soc Sec * Number'
         WtLb = '* Subject * Weight'
         BMI = 'Body * Mass * Index'
         SBP = 'Systolic * Blood * Pressure';
RUN;

The result I am getting is displayed below. I am not concerned with the values displayed inside the table. I am simply trying to remove the line below the title. Any guidance would be greatly appreciated.
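One way this is commonly handled, sketched here with the names from the post: the NOBYLINE system option suppresses the BY line while still letting #BYVAL resolve in the title. The sketch assumes WORK.hypdta_ST has already been sorted by DESCENDING CODCd, as in the PROC SORT step above.

```sas
options nobyline;   /* suppress the "Cause of Death Code=..." BY line */

title1 "Subjects Who Died from #BYVAL(CODCd)";
proc print data=WORK.hypdta_ST split='*' noobs;
   by descending CODCd;
   pageby CODCd;
   id SSN;
   var WtLb BMI SBP;
   label SSN  = '* Soc Sec * Number'
         WtLb = '* Subject * Weight'
         BMI  = 'Body * Mass * Index'
         SBP  = 'Systolic * Blood * Pressure';
run;

title;              /* clear the title so it does not carry over */
options byline;     /* restore the default BY-line behavior */
```

Restoring BYLINE afterward keeps later steps in the same session from silently losing their BY lines.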

Online Status	Offline
Date Last Visited	‎02-15-2023 02:00 PM

Re: Variable selection using Proc HPGenSelect

Re: Variable selection using Proc HPGenSelect

Re: Variable selection using Proc HPGenSelect

Re: Variable selection using Proc HPGenSelect

Running SAS code inside Rstudio Environment

Re: Variable selection using Proc HPGenSelect

Re: Variable selection using Proc HPGenSelect

Variable selection using Proc HPGenSelect

Creating a Macro variable

Re: Adding to PROC REPORT

Re: Specifying a variable for the y axis in SGPLOT

Removing header in a PROC PRINT report