Season
Barite | Level 11
Member since 12-03-2022
- 288 Posts
- 150 Likes Given
- 8 Solutions
- 33 Likes Received
Activity Feed for Season
- Posted Re: PROC CAUSALMED survey data on Statistical Procedures. 16 hours ago
- Posted Re: Appropriate model for non-normal distribution on Statistical Procedures. Tuesday
- Got a Like for Re: Appropriate model for non-normal distribution. Tuesday
- Posted Re: Appropriate model for non-normal distribution on Statistical Procedures. Tuesday
- Got a Like for Re: Appropriate model for non-normal distribution. Monday
- Posted Re: Appropriate model for non-normal distribution on Statistical Procedures. Monday
- Posted Re: ODS RTF Side to Side Graph/Tables on ODS and Base Reporting. Saturday
- Liked Re: ODS RTF Side to Side Graph/Tables for Ksharp. Saturday
- Liked Re: Troubleshooting PROC MI differences when trying to replicate for Ksharp. a week ago
- Posted Re: Joint mean and variance modeling in SAS: any suggestions on which PROCs should I use? on Statistical Procedures. 2 weeks ago
- Liked Re: Joint mean and variance modeling in SAS: any suggestions on which PROCs should I use? for StatDave. 2 weeks ago
- Posted Re: Joint mean and variance modeling in SAS: any suggestions on which PROCs should I use? on Statistical Procedures. 2 weeks ago
- Liked Re: Joint mean and variance modeling in SAS: any suggestions on which PROCs should I use? for PaigeMiller. 2 weeks ago
- Posted Re: Joint mean and variance modeling in SAS: any suggestions on which PROCs should I use? on Statistical Procedures. 2 weeks ago
- Posted Joint mean and variance modeling in SAS: any suggestions on which PROCs should I use? on Statistical Procedures. 2 weeks ago
- Posted Re: Modeling zero-censored semi-continuous data with PROC SEVERITY on Statistical Procedures. 2 weeks ago
- Posted Re: Fisher's - Taking long to run - 4x4 table with greater than 1000 sample size on Statistical Procedures. 2 weeks ago
- Liked Re: Fisher's - Taking long to run - 4x4 table with greater than 1000 sample size for StatDave. 2 weeks ago
- Posted Re: Fisher's - Taking long to run - 4x4 table with greater than 1000 sample size on Statistical Procedures. 2 weeks ago
- Liked Re: PROC PLS, Is W (Xweights) has different calculation than the standard versions (Ex. from R/Pytho for PaigeMiller. 2 weeks ago
16 hours ago
First of all, please make sure which procedure you wish to invoke for your analysis. @kmwats originally raised questions about PROC CAUSALMED but moved on to PROC PSMATCH after finding it more suitable for their analysis.
Second, based on your reply, it seems that you are really looking for information on conducting complex survey data analysis with PROC CAUSALMED. If that is the case, then I regret to inform you that SAS's built-in SURVEY procedures cover only a fraction of complex survey data analysis. In other words, many of the complex-survey versions of advanced statistical methods are not supported by SAS's built-in procedures. Mediation analysis in the complex survey setting is one of the methods not supported to date.
To get around this in SAS, you must either be quite familiar with the algorithms of the method and write a macro from scratch, or search the web for a macro written by someone else, if one exists. However, in my experience with complex survey data analysis, searching the web over and over for such a SAS macro is usually futile: the number of SAS users who are both skilled at complex survey data analysis and willing to tackle the very problem you encounter is too small for a published macro to exist for every particular problem.
Therefore, I suggest that you search the web briefly to see whether anybody has written such a macro. If not, stop searching for SAS macros and look for R packages instead. It is much more likely that an R package exists for a given question than a SAS macro for the same one.
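For reference, a non-survey mediation analysis with PROC CAUSALMED looks roughly like the sketch below. The dataset WORK.STUDY and all variable names are hypothetical, and this is only the basic (non-survey-weighted) usage discussed above:

```sas
/* Minimal sketch of a mediation analysis with PROC CAUSALMED.
   WORK.STUDY and the variable names are hypothetical. */
proc causalmed data=study;
   model outcome = treat | mediator;   /* outcome model with treatment-mediator interaction */
   covar age sex;                      /* baseline covariates */
run;
```

Note that nothing in this sketch accounts for a complex sampling design, which is exactly the gap described above.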
Tuesday
Yes, models for zero-inflated continuous data are indeed complex. Modeling them requires both sound probability theory and statistical programming knowledge, and which model to choose is up to you. However, I am not very skilled at mixed models, so I suggest that you wait for @SteveDenham, who has answered many questions on mixed models in this community and is also in this thread, to address the two questions you raised yesterday. Good luck with your project!
2 weeks ago
Thank you for the additional information! But aren't PROC QLIM and PROC HPQLIM intended for modeling limited dependent variables such as zero-inflated data? Can they build joint mean and variance models for uncensored data?
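For what it is worth, PROC QLIM can model the variance as a function of covariates through its HETERO statement, which is one way to fit a joint mean and variance model. A minimal sketch, in which WORK.MYDATA and all variable names are hypothetical:

```sas
/* Sketch: joint mean and variance modeling via PROC QLIM's HETERO statement.
   WORK.MYDATA and the variable names are hypothetical. */
proc qlim data=mydata;
   model y = x1 x2;     /* mean model */
   hetero y ~ z1 z2;    /* variance modeled as a function of z1 and z2 */
run;
```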
2 weeks ago
@SteveDenham wrote:
You can set the truncation value at a small non-zero value, and all of the estimates are correctly determined. The issue becomes what is the small value to use. I think a good way to choose would be to see to how many decimal places the response is measured, and then set the truncation at half that value. For example, suppose you measure the response to the nearest thousandth (=Y.YYY). Under this scheme, the truncation value of 0.0005 would guarantee that it is greater than zero, and that all observed values are included.
Or am I still missing the point here?
I think your approach of tentatively selecting several thresholds and seeing what happens is a very nice idea. Although the scheme you proposed was built around selecting truncation thresholds, the same attempts carry over easily to the selection of censoring thresholds. Therefore, I tried your approach on my data.
Before I disclose my findings, I would like to reiterate that my original objective was to model the relationship between y and x1, x2, ..., xn. However, the SEVERITY procedure is versatile and can perform multiple tasks. The more basic one is to estimate the parameters of the distribution(s) that y follows. A more advanced one is to build regression models for the scale parameter of the distribution(s) of y, e.g., the parameter μ if y follows a lognormal distribution. The latter is done by adding the SCALEMODEL statement to the SEVERITY procedure.
In line with these capabilities, my efforts to implement your idea went in two directions: (1) estimate the parameters of y in the absence of predictors; (2) estimate both the non-scale parameters of y and the regression coefficients of the model for the scale parameter. To accomplish the two goals, I tried several minuscule yet positive thresholds, each of course smaller than the smallest observed positive value in my dataset.
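The two runs look roughly like the sketch below. WORK.MYDATA, the response Y, the predictors X1-X3, and the threshold 0.0005 are all hypothetical, and I am assuming here that the relevant LOSS statement option accepts a numeric constant:

```sas
/* (1) distribution fitting only, with a tentative small positive threshold */
proc severity data=mydata crit=aicc;
   loss y / lefttruncated=0.0005;
   dist logn gamma weibull;
run;

/* (2) distribution fitting plus a regression model for the scale parameter */
proc severity data=mydata crit=aicc;
   loss y / lefttruncated=0.0005;
   scalemodel x1 x2 x3;
   dist logn;
run;
```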
However, it was disturbing to find that different thresholds did lead to different results. For the first objective, PROC SEVERITY still exhibited some consistency, at least in the estimation of several (but not all!) of the distributions built into the procedure. For the second objective (i.e., in the presence of predictors), the regression coefficient estimates of the scale parameter model deviated from one another to varying degrees, sometimes quite wildly.
Therefore, my conclusion is that PROC SEVERITY is not a good tool for dealing with zero-censored data, as the results depend on the specification of the censoring threshold. One caveat: PROC SEVERITY supports several advanced features related to parameter estimation, including the specification of starting values, which play a role in the maximum likelihood estimation that underlies all of the aforementioned tasks. I am not sure whether careful use of these features could remedy the problems described in the preceding paragraph, but I have no interest in trying.
2 weeks ago
Thank you for your explanation of the theoretical and practical details! I have learned a lot!
2 weeks ago
As @StatDave said, it is a zero-inflation model, which is generally suited to COUNT data, not continuous data.
Check this brand-new session:
https://communities.sas.com/t5/SAS-Communities-Library/Making-Zero-Inflation-Count/ta-p/962019/jump-to/first-unread-message
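As a point of reference, a zero-inflated model for count data can be fitted with PROC GENMOD; the sketch below uses a zero-inflated Poisson. WORK.COUNTS and the variable names are hypothetical:

```sas
/* Sketch of a zero-inflated Poisson model for COUNT data with PROC GENMOD.
   WORK.COUNTS and the variable names are hypothetical. */
proc genmod data=counts;
   model events = x1 x2 / dist=zip;   /* Poisson component of the mixture */
   zeromodel x1;                      /* model for the zero-inflation probability */
run;
```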
2 weeks ago
I continued reading the documentation of PROC SEVERITY yesterday and found that one of its examples (SAS Help Center: Example 29.3 Defining a Model for Mixed-Tail Distributions) explicitly points out the second phenomenon you mentioned, namely the presence of extreme values. However, it is also explicitly stated there that these values should not be regarded as outliers and discarded. So I am afraid that @Ronein should embark on a more complicated analysis instead of simply deleting the extreme values. The good news for @Ronein is that code suitable for this purpose is readily available there, saving a lot of work.
By the way, the documentation also contains an example of building finite mixture models with PROC SEVERITY, so relevant code is readily available there as well.
2 weeks ago
Hello there. I found your post while searching the community for something on the lognormal distribution. I see from your profile that you have not been here for more than four years, and this question was raised more than ten years ago, so I am not sure whether your problem has been solved or whether you still need an answer. But somebody else may need it, so I am offering my viewpoint on your problem.
I think ANOVA is a suitable choice for estimating the group means, but forming confidence intervals is not so intuitive in your setting. Why not try the accelerated failure time (AFT) model, with the group indicators as the only independent variables? The AFT model is very inclusive in the sense that the lognormal distribution is only one of several popular distributions it can accommodate. Inference for the AFT model, including the construction of confidence intervals for the dependent variable, may be more comprehensively studied than the log-transformed ANOVA you employed. I think it is worth a try.
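The AFT suggestion above can be sketched with PROC LIFEREG. WORK.MYDATA, the response Y, and the GROUP variable are hypothetical, and no censoring is assumed here:

```sas
/* Minimal sketch of a lognormal AFT model with PROC LIFEREG.
   WORK.MYDATA, Y, and GROUP are hypothetical; no censoring is modeled. */
proc lifereg data=mydata;
   class group;
   model y = group / dist=lnormal;   /* lognormal accelerated failure time model */
run;
```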
I have also found an article that might solve your problem under another paradigm; you may want to take a look: Inferences on the means of lognormal distributions using generalized p-values and generalized confidence intervals (ScienceDirect).
3 weeks ago
It sounds like there's no way to adjust the style of the output table directly within the PROC SURVEYFREQ step. So I'll do as you suggested: save the PROC SURVEYFREQ output and then use a different procedure to make a more reader-friendly table. Thank you for the help.
3 weeks ago
I saw in your post above that your dataset contained missing data and that you were going to repeat the modeling process on each imputed sample.
An informal approach I can think of is to run both Firth's method and exact logistic regression on one (or several) of the imputed datasets and see how the results differ. If this tentative modeling does not reveal any big difference between them, then we might speculate that the difference within each of the unanalyzed imputed datasets is not big either. Therefore the combined estimate, which, according to Rubin's rules, is in fact the arithmetic mean of the regression coefficients calculated from the imputed samples, should not be significantly affected by the choice of estimation method (i.e., Firth's method versus exact logistic regression).
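On a single imputed dataset, the comparison might look like the sketch below. WORK.IMP1 and the variable names are hypothetical:

```sas
/* Firth's penalized-likelihood logistic regression */
proc logistic data=imp1;
   model y(event='1') = x1 x2 / firth;
run;

/* Exact logistic regression with the same predictors */
proc logistic data=imp1;
   model y(event='1') = x1 x2;
   exact x1 x2 / estimate=both;
run;
```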
3 weeks ago
I do not have a ready answer to your question, but I can tell you what I know about it:
(1) Trees are naturally categorical. When they encounter continuous variables, trees categorize them as they grow (i.e., during model building). After all, trees are chiefly classification tools and are less powerful at predicting a continuous outcome. For instance, in "Example 15.3 Creating a Regression Tree" in the documentation of the HPSPLIT procedure, which I mention again below, the predicted outcomes are simply the arithmetic means of the leaf nodes. Note that while this resembles the rationale of linear regression, the continuous independent variables are subject to categorization in tree-based models, which leads to loss of information.
(2) I happen to be reading a book on credit risk analysis: Credit Risk Analytics: Measurement Techniques, Applications, and Examples in SAS (Wiley and SAS Business Series) by Bart Baesens, Daniel Roesch, and Harald Scheule (ISBN 9781119143987). This book discusses modeling LGD in SAS as well as building tree-based models in SAS Enterprise Miner, so you may want to see whether there is anything useful in it for you. By the way, the book also covers more sophisticated aspects of credit risk modeling and more advanced SAS modules, such as the QLIM procedure, that other books do not usually discuss.
(3) If SAS (rather than only SAS Enterprise Miner) is available to you, you can also take a look at the HPSPLIT procedure, a statistical procedure capable of building tree-based models. More information on PROC HPSPLIT can be found in the SAS Help documentation.
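A regression tree for a continuous target with PROC HPSPLIT might be sketched as follows. WORK.LOANS, the target LGD, and the input variables are hypothetical:

```sas
/* Sketch of a regression tree for a continuous target with PROC HPSPLIT.
   WORK.LOANS, LGD, and the inputs are hypothetical. */
proc hpsplit data=loans maxdepth=5;
   model lgd = ltv rating income;   /* continuous target => regression tree */
   output out=scored;               /* predictions are the leaf-node means */
run;
```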
3 weeks ago
Actually, the STRATA statement in the SURVEY procedures is intended to hold the strata used in the sampling design, not the crosstabs. For your problem, it is appropriate to use TABLES syntax such as gender*outcome or agegroup*gender*outcome, just as you would in the FREQ procedure.
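Putting the design variables and the crosstab in their proper places looks roughly like this. WORK.SVY and all design and analysis variables are hypothetical:

```sas
/* Sketch of a design-based crosstab with PROC SURVEYFREQ.
   WORK.SVY and the variable names are hypothetical. */
proc surveyfreq data=svy;
   strata region;                       /* sampling-design strata, not the crosstab */
   cluster psu;                         /* primary sampling units */
   weight wt;                           /* sampling weights */
   tables gender*outcome / row chisq;   /* the crosstab goes here */
run;
```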
3 weeks ago
Take a look at the STORE statement of the PHREG procedure, as well as the PLM procedure, which consumes the item store that the STORE statement generates. Some statistical graphs can be produced directly from the PLM procedure, but fine-grained modification of them still requires the SG procedures, such as PROC SGPLOT.
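The STORE-then-PLM workflow can be sketched as below. WORK.MYDATA, the model variables, and the item store name are hypothetical:

```sas
/* Sketch of the PHREG STORE -> PLM workflow.
   WORK.MYDATA, the variables, and WORK.COXMOD are hypothetical. */
proc phreg data=mydata;
   class trt;
   model time*status(0) = trt age;
   store work.coxmod;                          /* save the fit as an item store */
run;

proc plm restore=work.coxmod;
   effectplot slicefit(x=age sliceby=trt);     /* graph from the stored model */
run;
```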