- centiles for income data, skewed

Posted 10-03-2018 03:23 PM
(1044 views)

Hello,

I am trying to fit a multiple linear regression with income as the outcome and disability as the main predictor, controlling for confounding by age, education, and gender. The income data come from two different years, and I am trying to see whether there have been any improvements or socioeconomic gains for people with disabilities. To look at this from a more relative angle, it was suggested that I convert the income data into centiles (100 groups of 1%) and use linear regression. However, I am running into issues because the income data include negative and zero income values, which we do not wish to remove but want to analyze together with the rest. These are being grouped into one centile, which makes the data skewed when running the regression analysis. Are there any ideas on how this data should be analyzed instead? There is a weight for the data; however, this is not helping to meet the assumption of normality.

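```
/* Assign each observation to one of 100 income groups within each year.
   BY-group processing expects incdis to be sorted by year. */
proc rank data=incdis groups=100 out=cent_incdis;
    var atinc42;
    ranks atinc42_centile;
    by year;
run;

/* Weighted regression of the income centile on disability and confounders */
proc surveyreg data=cent_incdis;
    class ecsex99_n (ref='0') education (ref='0');
    model atinc42_centile = disabs26_n ecsex99_n education ecage26 / solution;
    weight icswt26;
run;
```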
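
Before changing the model, it may help to confirm how large the pile-up of negative and zero incomes actually is. The following is a minimal sketch, assuming the dataset and variable names from the code above (`incdis`, `cent_incdis`, `atinc42`, `atinc42_centile`, `year`); the format name `incsign` is hypothetical and introduced only for illustration.

```
/* Hypothetical format that buckets income by sign */
proc format;
    value incsign
        low -< 0  = 'Negative'
        0         = 'Zero'
        0 <- high = 'Positive';
run;

/* Share of negative, zero, and positive incomes within each year */
proc freq data=incdis;
    tables year * atinc42 / nocol nopercent;
    format atinc42 incsign.;
run;

/* Number of observations that PROC RANK placed in each centile group */
proc freq data=cent_incdis;
    tables year * atinc42_centile / list;
run;
```

If the second table shows one group holding far more than roughly 1% of the observations in a year, that group is where the tied values have landed.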

1 ACCEPTED SOLUTION

Doh! You have a lot of tied data. Look at the options for handling tied data; I believe they are under the PROC RANK statement documentation (the TIES= option).
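
As a rough illustration of that suggestion, the sketch below reruns the ranking from the question with an explicit `TIES=` setting and then tabulates the group sizes. It assumes the same dataset and variable names as the original post; the output dataset name `cent_low` is hypothetical. Note that tied values always receive the same rank, so they still end up in one group; `TIES=` only changes which rank (and therefore which group) they share.

```
/* BY-group processing needs the data sorted by year */
proc sort data=incdis;
    by year;
run;

/* Same ranking as in the question, with an explicit TIES= setting.
   Valid values are HIGH, LOW, MEAN, and DENSE. */
proc rank data=incdis groups=100 ties=low out=cent_low;
    var atinc42;
    ranks atinc42_centile;
    by year;
run;

/* Inspect how many observations fall into each group under this setting */
proc freq data=cent_low;
    by year;
    tables atinc42_centile;
run;
```

Rerunning with `ties=high`, `ties=mean`, or `ties=dense` and comparing the frequency tables shows where the tied incomes land in each case.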

4 REPLIES

The assumption is that residuals are normally distributed, not necessarily the variables.

I think centiles are a great idea. I'm not sure how they can be skewed when, by definition, 1% is in each group. Do you mean the residuals become skewed?
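
One way to check the residuals rather than the variables is to save them from the regression and plot them. This is only a sketch, assuming the model and names from the original post; the output dataset `reg_out` and the variables `pred` and `resid` are hypothetical names chosen for illustration, and the plots below are unweighted.

```
/* Refit the model from the question and save predicted values and residuals */
proc surveyreg data=cent_incdis;
    class ecsex99_n (ref='0') education (ref='0');
    model atinc42_centile = disabs26_n ecsex99_n education ecage26 / solution;
    weight icswt26;
    output out=reg_out predicted=pred residual=resid;
run;

/* Histogram with a fitted normal curve, plus a normal Q-Q plot of the residuals */
proc univariate data=reg_out;
    var resid;
    histogram resid / normal;
    qqplot resid / normal(mu=est sigma=est);
run;
```

A roughly straight Q-Q plot would suggest that the residuals are close to normal even if the centile variable itself has a spike at one value.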

I’m not sure that I’d be that concerned anyway. The idea behind centiles is to see whether the group’s improvement is occurring at the same rate as the general population’s, so if that group has a larger number, that’s reflective of reality. When this happens, though, you’re more likely to see something like Simpson’s paradox occur due to the imbalance, so be careful if you do subgroup analysis.
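
If a subgroup analysis is attempted, one possible approach is the `DOMAIN` statement in PROC SURVEYREG, which reports subgroup results alongside the full-sample fit. The sketch below assumes the model from the original post and treats `year` as the subgroup variable, which is an assumption made for illustration rather than something the poster specified.

```
/* Fit the model for the full sample and separately within each year */
proc surveyreg data=cent_incdis;
    class ecsex99_n (ref='0') education (ref='0');
    model atinc42_centile = disabs26_n ecsex99_n education ecage26 / solution;
    weight icswt26;
    domain year;
run;
```

Comparing the sign and size of the `disabs26_n` coefficient in the overall fit with the per-year fits is one way to spot a reversal of the kind Simpson's paradox describes.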
