About mlensing

mlensing · ‎12-19-2022

I am using this code to calculate RR (outcome: dpneum=1, primary predictor: pathogen) for a retrospective cohort study using log-binomial regression:

    proc genmod data = clean.stacked;
        class pathogen (REF = 'RSV') cld (REF = '99') race_eth (REF = 'White');
        model dpneum2 (event="1") = pathogen age_years cld race_eth / dist=bin link=log;
    run;

I recently introduced "age_years" (a continuous variable) into the model, and since doing so the model has failed to converge. The log reads as follows:

    NOTE: PROC GENMOD is modeling the probability that dpneum2='1'. One way to change this to model the probability that dpneum2='99' is to specify the DESCENDING option in the PROC statement.
    WARNING: The specified model did not converge.
    NOTE: The Pearson chi-square and deviance are not computed since the AGGREGATE option is not specified.
    ERROR: The mean parameter is either invalid or at a limit of its range for some observations.
    NOTE: The scale parameter was held fixed.
    NOTE: The SAS System stopped processing this step because of errors.
    NOTE: PROCEDURE GENMOD used (Total process time):
          real time 6.78 seconds
          cpu time 0.18 seconds

I am not sure of next steps. I have been able to run the model successfully as both a Poisson regression and a logistic regression (both converge), but the two give vastly different parameter estimates (i.e. the direction of the effect changes completely between models).

I opted for log-binomial regression to assess the association between pathogen and dpneum2 because the overall prevalence of dpneum2 in the sample is >10% (16.5%). However, the prevalence per pathogen does dip below 10% for some (16.1%, 10.1%, 7.8%), so I am wondering whether logistic regression is the better choice if this is the criterion for calling the outcome "rare." Ideally I would like to run a log-binomial regression here, so any suggestions for data restructuring or otherwise are appreciated.
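
For context on the ERROR line in the log, here is a short sketch of the constraint that a log link places on a binomial model. The notation (p_i for the modeled probability of observation i, x_i for its covariate vector, beta for the coefficients) is introduced here for illustration and does not come from the post.

```latex
\begin{align*}
\log p_i &= \mathbf{x}_i^\top \boldsymbol{\beta}
  \quad\Longrightarrow\quad
  p_i = \exp\!\left(\mathbf{x}_i^\top \boldsymbol{\beta}\right), \\
0 < p_i \le 1 &\iff \mathbf{x}_i^\top \boldsymbol{\beta} \le 0
  \quad \text{for every observation } i .
\end{align*}
```

Because age_years is continuous, the linear predictor is unbounded in age. During fitting, the fitted probability for observations at extreme ages can reach or exceed 1, which is consistent with the message that the mean parameter is "at a limit of its range". A logit link does not have this restriction because it maps any linear predictor into (0, 1), which may be why the logistic model converges on the same data.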

mlensing · ‎07-12-2021

Hi everyone, I am working on a case-control study in which I need to create a survey ID variable for records. For every case in dataset1, I pull 3 controls at random from a source dataset to create dataset2 (this is done via proc surveyselect), resulting in a dataset of cases (dataset1) and a dataset of controls (dataset2). The survey ID variable naming scheme should be as follows:

    Case 1:    12345-1
    Control 1: 12345-1-1
    Control 2: 12345-1-2
    Control 3: 12345-1-3
    Case 2:    12345-2
    Control 1: 12345-2-1
    Control 2: 12345-2-2
    Control 3: 12345-2-3

This naming scheme can be applied to each dataset separately or to the combined dataset of all records. A key detail is that the controls are simply frequency matches and NOT paired matches. How would you all code this? Thanks!
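
One possible way to build IDs of this shape is sketched below. It assumes the two datasets are named dataset1 and dataset2 as in the post, that dataset2 holds exactly 3 controls per case, and that (because the controls are frequency matched rather than paired) any block of 3 consecutive controls may share a case number. The output dataset names and the variables survey_id, case_num and ctrl_num are hypothetical.

```sas
/* Cases: number them in their current order, giving 12345-1, 12345-2, ... */
data cases_id;
    set dataset1;
    length survey_id $20;
    case_num  = _n_;
    survey_id = catx('-', '12345', case_num);
run;

/* Controls: every block of 3 rows shares one case number (assumption) */
data controls_id;
    set dataset2;
    length survey_id $20;
    case_num  = ceil(_n_ / 3);         /* rows 1-3 -> 1, rows 4-6 -> 2, ... */
    ctrl_num  = mod(_n_ - 1, 3) + 1;   /* cycles 1, 2, 3 within each block  */
    survey_id = catx('-', '12345', case_num, ctrl_num);
run;
```

If the number of controls per case varies, the block arithmetic above no longer applies and the case number would need to come from the data instead.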

mlensing · ‎12-08-2020

One workaround you could try is to assign a numeric value to each state (perhaps in a temporary dataset first), use proc sort to put the states in ascending alphabetical order, and then apply a format that displays each numeric value as its state name, so that the states appear in the correct order on your graph. This SAS Communities post also describes a situation similar to yours, with other solutions you could try.

mlensing · ‎11-30-2020

Thank you so much!

mlensing · ‎11-30-2020

@ballardw @mkeintz Hi all, thank you for your clarifying comments. I need to "compare significance" among my bivariate interactions to determine the order in which terms should be introduced into my logistic regression model, so that I can compare model fit through -2LL tests. The instructors of my course, who suggested this step, tend to be against automated variable selection tools, so this is part of our intentional/purposeful modeling.

mlensing · ‎11-30-2020

Hi SAS communities, I am looking for a solution to print more decimals for my p-values generated via a proc freq chisq procedure. I would ideally like to have 10-12 decimals printed as I am performing bivariate analyses and need to compare highly significant values (currently all my p-values are printing <0.0001). I'm definitely open to other ways to get a p-value other than a proc freq chisq procedure, I just need to be able to test the association between two variables. Thank you!
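
One commonly used route, shown here only as a sketch, is to capture the chi-square table with ODS OUTPUT and print the stored p-value with a wider format. The dataset name mydata and the variables exposure and outcome are placeholders. ChiSq is the ODS table that PROC FREQ produces for the CHISQ option, and its p-value column is named Prob.

```sas
/* Save the chi-square results to a dataset instead of relying on the listing */
ods output ChiSq = chisq_out;

proc freq data = mydata;                /* mydata is a placeholder name */
    tables exposure * outcome / chisq;  /* placeholder variable names   */
run;

/* The stored p-value keeps full precision; show it in scientific notation */
proc print data = chisq_out;
    var Statistic DF Value Prob;
    format Prob e12.;
run;
```

Scientific notation is used because very small p-values would still print as zeros under a fixed-decimal format.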

mlensing · ‎11-23-2020

Hi everyone, I am working on a group research project in which we are exploring the research question "Is greater length of residence in the US among immigrant mothers associated with increased risk of preterm delivery and low birth weight?" Our primary predictor is continuous (length of residence in months), while our outcomes are each dichotomous. We are using multivariable logistic regression to perform our analysis and are a little unsure whether we should use proc logistic, proc genmod, or a different option to calculate relative risk.

We are also having trouble defining our reference group. Ideally, we want to compare immigrant mothers to non-immigrant mothers; however, in our dataset the values for length of residence are blank/missing for non-immigrant mothers. In this case, is it better to create a new dichotomous variable for immigrant status and use "no" as the reference group in the class statement (as done in the code below), or to recode length of residence so that missing values (non-immigrant mothers) are set to an unrealistic month value (e.g. '999999') and use that as our reference group?

So far, we have tried using both proc genmod and proc logistic (code below):

    proc genmod data = temp descending;
        class imgrt (ref = 'N') / param = ref;
        model preterm = LORMonths IMGRT / dist = poisson link = log;
    run;

    proc logistic data = temp descending;
        class IMGRT (ref = 'N') / param = ref;
        model preterm = LORMonths IMGRT;
    run;

If anyone could explain which procedure is best in this situation to obtain relative risk and/or point us to understandable documentation, we would so appreciate it!

mlensing · ‎11-20-2020

Thank you!

mlensing · ‎11-18-2020

Hi everyone, I am attempting to transpose data using a data step (cannot use proc transpose for this question). I can't figure out why I keep getting a value of '15' for Maple Yr2009. I'm assuming it has something to do with the missing data, but if anyone has a suggestion for how I correct this that would be wonderful.

Dataset:

    DATA Work.Trees;
        INFILE DATALINES;
        INPUT Type $6. Year 8-11 HtFt 13-14;
    CARDS;
    Aspen  2009 15
    Aspen  2010 16
    Maple  2010  6
    Maple  2011  8
    Maple  2012 10
    Spruce 2009 22
    Spruce 2010 23
    Spruce 2011 24
    Spruce 2012 25
    ;

Current code to transpose using data step:

    DATA Work.TreesWide2;
        SET Work.Trees;
        BY Type;
        RETAIN Yr2009 Yr2010 Yr2011 Yr2012;
        IF FIRST.LabDate = 1 THEN CALL MISSING(Yr2009, Yr2010, Yr2011, Yr2012);
        IF Year = 2009 THEN Yr2009 = HtFt;
        ELSE IF Year = 2010 THEN Yr2010 = HtFt;
        ELSE IF Year = 2011 THEN Yr2011 = HtFt;
        ELSE IF Year = 2012 THEN Yr2012 = HtFt;
        IF LAST.Type = 1;
        DROP Year HtFt;
    RUN;

Output I am currently getting:

Output I want:

Thank you!
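
A likely explanation, offered as a sketch rather than a confirmed diagnosis: the reset condition tests FIRST.LabDate, but the BY variable is Type. FIRST.LabDate is therefore never equal to 1, CALL MISSING never runs, and the retained Yr2009 value from Aspen (15) carries over into the Maple row, which has no 2009 record of its own. Under that assumption, resetting on FIRST.Type would look like this:

```sas
DATA Work.TreesWide2;
    SET Work.Trees;
    BY Type;
    RETAIN Yr2009 Yr2010 Yr2011 Yr2012;

    /* Clear the retained columns at the start of each Type group */
    IF FIRST.Type THEN CALL MISSING(Yr2009, Yr2010, Yr2011, Yr2012);

    /* Copy the height into the column for this row's year */
    IF Year = 2009 THEN Yr2009 = HtFt;
    ELSE IF Year = 2010 THEN Yr2010 = HtFt;
    ELSE IF Year = 2011 THEN Yr2011 = HtFt;
    ELSE IF Year = 2012 THEN Yr2012 = HtFt;

    /* Output one row per Type, after its last year has been read */
    IF LAST.Type;
    DROP Year HtFt;
RUN;
```

With this change, Maple should show a missing value for Yr2009 and Aspen should show missing values for Yr2011 and Yr2012.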

mlensing · ‎11-08-2020

Thank you so much, this worked perfectly! I really appreciate your help and thorough response!

mlensing · ‎11-08-2020

Hi, yes it is a requirement to sort SSN ascending. Thank you for clarifying.

mlensing · ‎11-08-2020

Hi everyone, As I'm still a bit new to SAS, I wanted to reach out for guidance on sorting my dataset. I have created a master dataset composed of 3 datasets, of which 2 have filled values for SSN (a character variable) while the last does not have any values for SSN. I want to sort my master dataset by SSN so that the missing values appear last, at the end of the dataset. Is this possible, and is there a straightforward way to do it? Thank you in advance!

    DATA HypTabs.Contact;
        LENGTH SSN $11. Inits $3. City $20. StateCd $2. ZipCd $5.;
        SET WORK.Contact_IA WORK.Contact_MS WORK.Contact_UT;
        LABEL SSN     = 'Social Security Number'
              Inits   = 'Subject Initials'
              City    = 'City'
              StateCd = 'State Code'
              ZipCd   = 'Zip Code';
    RUN;

    PROC SORT DATA = HypTabs.Contact;
        BY SSN;
    RUN;

(Note: When I do PROC SORT by SSN, observations 1-195 are blank/missing, which corresponds to the dataset in which those values are missing.)
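
One straightforward approach, sketched here under the assumption that a temporary helper variable is acceptable, is to flag the missing SSNs and sort on the flag first. The names Contact_Flag, Contact_Sorted and SSN_Missing are hypothetical.

```sas
/* Flag rows whose SSN is blank: 0 = present, 1 = missing */
DATA WORK.Contact_Flag;
    SET HypTabs.Contact;
    SSN_Missing = MISSING(SSN);
RUN;

/* Sort on the flag first so blanks fall to the end, then by SSN ascending */
PROC SORT DATA = WORK.Contact_Flag
          OUT  = WORK.Contact_Sorted (DROP = SSN_Missing);
    BY SSN_Missing SSN;
RUN;
```

The helper flag is dropped from the output, so the sorted dataset keeps the original columns.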
