Hello everybody;
I decide to run a regression based on decile of data. suppose that I have firm, date and price variables. So I want to make decile by average price of each firm. For instance, the 10th decile has firms with the highest average price and first decile has firms with the lowest average price and so on.
How can I make this decile?
How can I run this regression?
Thanks in advance.
I sounds like you need these steps:
I imagine then you could run your regressions, etcs, with a BY mydecile statement.
PROC RANK with GROUPS=10.
It supports BY group processing.
Thanks, @Reeza;
But it does not work.
There are two problems;
1- A variable, which I want to rank data based on it, has missing values;
2- This code rank data by firms and price variable, but each firm can be repeated in groups. For example, firms A can be in 5th and second groups.
Please consider I want to make a decile based on average of price for each firm. So, firms are unique in each group and don't repeat in other groups.
Here's an example:
Table one is a sample of my data:
| name | date | price | 
| A | 1 | 6 | 
| A | 1 | . | 
| A | 2 | 7 | 
| A | 4 | 7 | 
| A | 4 | 5 | 
| A | 5 | 7 | 
| A | 6 | 5 | 
| B | 2 | . | 
| B | 3 | . | 
| B | 5 | 2 | 
| B | 5 | 1 | 
| B | 7 | . | 
| C | 1 | 3 | 
| C | 1 | 4 | 
| C | 1 | 5 | 
| C | 3 | 3 | 
| C | 3 | 2 | 
| C | 3 | 1 | 
| C | 5 | . | 
Table two, which is shown below, is what I want:
| name | date | price | mean | rank | 
| B | 2 | . | 0.6 | 1 | 
| B | 3 | . | 0.6 | 1 | 
| B | 5 | 2 | 0.6 | 1 | 
| B | 5 | 1 | 0.6 | 1 | 
| B | 7 | . | 0.6 | 1 | 
| C | 1 | 3 | 2.714286 | 2 | 
| C | 1 | 4 | 2.714286 | 2 | 
| C | 1 | 5 | 2.714286 | 2 | 
| C | 3 | 3 | 2.714286 | 2 | 
| C | 3 | 2 | 2.714286 | 2 | 
| C | 3 | 1 | 2.714286 | 2 | 
| C | 5 | . | 2.714286 | 2 | 
| A | 1 | 6 | 5.285714 | 3 | 
| A | 1 | . | 5.285714 | 3 | 
| A | 2 | 7 | 5.285714 | 3 | 
| A | 4 | 7 | 5.285714 | 3 | 
| A | 4 | 5 | 5.285714 | 3 | 
| A | 5 | 7 | 5.285714 | 3 | 
| A | 6 | 5 | 5.285714 | 3 | 
I sounds like you need these steps:
I imagine then you could run your regressions, etcs, with a BY mydecile statement.
You need one more step, because I don't think PROC SUMMARY/MEANS works here as @aminkarimid is treating missing as 0's, not missing. I guess you could do it 'manually' using SUMMARY/MEANS to calculate the sum and count automatically. The remaining steps are the same, merge in results, use PROC RANK etc.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
