altadata1
Obsidian | Level 7
Hello, 
I need to calculate the number of observations for specific percentiles, using the code below:
 
proc univariate data=myData noprint;
    by year sex;
    weight myWeight;
    var wage;
    output out=myNewData pctlpts=1 5 10 25 75 90 95 99
           pctlpre=PER_;
run;

data Total;
    merge myData myNewData;
    by year sex;
    array pct(*) per:;
    do i=1 to dim(pct);
        if pct(i-1) <= wage <= pct(i) then index = i-1;
    end;
    drop per;
run;

proc freq data=total;
    tables sex*year*index*(var1 var2);
    weight myweight;
run;

But I get 7 index values instead of 8.

 
I appreciate your help. 
 
10 REPLIES
PaigeMiller
Diamond | Level 26

I don't understand the question.

 

Do you want to calculate the percent in percentile 1 and the percent in percentile 5 and the percent in percentile 25 and so on? 

OR

Do you want to calculate the percent between percentile 1 and percentile 5, and the percent between percentile 5 and percentile 25 and so on?

 

Do you have a lot of ties in the variable wage? Could the use of the WEIGHT statement produce lots of ties, leading to an uneven distribution of observations across percentiles?
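
A quick way to check for ties, as a rough sketch: compare the number of non-missing wage values with the number of distinct wage values in each BY group. It assumes the data set and variable names from the original post (myData, year, sex, wage); n_obs and n_distinct are illustrative column names only.

proc sql;
    /* If n_distinct is much smaller than n_obs, wage has many ties */
    select year,
           sex,
           count(wage)          as n_obs,
           count(distinct wage) as n_distinct
    from myData
    group by year, sex;
quit;

A large gap between the two counts in a group would suggest that several percentile cutoffs could land on the same wage value.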

--
Paige Miller
altadata1
Obsidian | Level 7

Thank you for the response, PaigeMiller.

I need to calculate the proportion of observations for each of P1, P5, P10, ..., P99 by sex, year, var1, and var2.

Does it answer your question?

 

 

Reeza
Super User

Your code as shown is calculating the number of observations (not weighted correctly) between the percentiles, i.e., below P1, between P1 and P5, etc.

What do you want to accomplish? Also, wouldn't that DO loop error out, since at i=1 it references pct(i-1), which is undefined?
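
For illustration, here is one possible way to assign each observation to a band without ever referencing pct(0). This is only a sketch under assumptions: both data sets are sorted by year and sex, the percentile variables are named PER_1 through PER_99 (which is what PCTLPRE=PER_ should produce for these PCTLPTS values), and band is a made-up variable name. Note that eight cutoffs delimit nine bands (below P1, seven bands between adjacent cutoffs, and at or above P99).

data total;
    merge myData myNewData;
    by year sex;
    /* List the cutoffs explicitly so the array is in ascending order */
    array pct(8) per_1 per_5 per_10 per_25 per_75 per_90 per_95 per_99;
    if not missing(wage) then do;
        band = 0;                              /* 0 = below P1 */
        do i = 1 to dim(pct);
            /* keep the index of the highest cutoff that wage reaches */
            if wage >= pct(i) then band = i;
        end;
    end;
    drop i per_:;
run;

With this coding, band = 1 means P1 <= wage < P5, and band = 8 means wage >= P99. Observations with a missing wage keep a missing band.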

altadata1
Obsidian | Level 7

Thank you for the reply Reeza. 

 

I have a dataset, myData, that contains sex, year, wage, weight, var1, and var2. I need to calculate the weighted percentiles P1, P5, P10, ..., P99 for wage. Then I would like to estimate the proportion of observations that fall into each percentile, disaggregated by sex, year, var1, and var2, like this:

        sex    year    var1    var2
P1      1      2007    X%      Y%
P5
P10
.
.
P99

X% or Y%: proportion in P1

 

Thank you, 

 

Reeza
Super User

That's kind of a weird request, because 1% fall at or below the 1st percentile, 5% fall at or below P5 (or 4% between P1 and P5), and so on.

That's the definition of percentiles. 

 

 

altadata1
Obsidian | Level 7

That's right, Reeza. Here, P5, for example, is the wage value such that 5% of observations fall at or below it. Now I need to know, of this 5%, what proportion are, for example, women who were paid those wages in year=2000, have a bachelor's degree (var1), and are married (var2). For this, I need to know the number of observations for each of P1, P5, P10, ..., P99, and then cross-tabulate with those variables. Am I right?
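
One way to read this request, as a rough sketch only: keep the observations at or below P5 within each year and sex group, then cross-tabulate that subset. It assumes myNewData contains PER_5 from the PROC UNIVARIATE step and that both data sets are sorted by year and sex; atOrBelowP5 is a made-up data set name.

data atOrBelowP5;
    merge myData myNewData(keep=year sex per_5);
    by year sex;
    /* subsetting IF: keep only wages at or below the group's P5 */
    if not missing(wage) and wage <= per_5;
run;

proc freq data=atOrBelowP5;
    /* weighted composition of the bottom 5% */
    tables sex*year*var1*var2 / list;
    weight myWeight;
run;

The same pattern could be repeated for the other cutoffs by swapping PER_5 for PER_1, PER_10, and so on.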

Thank you, 

 

 

PaigeMiller
Diamond | Level 26

@altadata1 wrote:

Thank you for the response, PaigeMiller.

I need to calculate the proportion of observations for each of P1, P5, P10, ..., P99 by sex, year, var1, and var2.

Does it answer your question?


It seems to answer the question, but it leaves me thinking that this is a relatively meaningless thing to do. I am mystified by the request. Can you tell me why you want the proportion of observations at P1 and proportion at P5 but not at P4? What benefit is there to knowing how many values are at P5?

 

Do you want to know the proportion at exactly percentile 5, what about the proportion at P4.9 and P5.1, are those considered to be P5??

--
Paige Miller
altadata1
Obsidian | Level 7

Thank you, PaigeMiller. Here is what I need to do:

Here, P5, for example, is the wage value such that 5% of observations fall at or below it. Now I need to know, of this 5%, what proportion are, for example, women who were paid those wages in year=2000, have a bachelor's degree (var1), and are married (var2). For this, I need to know the number of observations for each of P1, P5, P10, ..., P99, and then cross-tabulate with those variables. Am I right?

Thank you, 

PaigeMiller
Diamond | Level 26

Another explanation that is inconsistent with earlier explanations. Now you seem to be saying something different. Now it seems you are saying you want the proportion LESS THAN P5 (if I am understanding you properly), and it's still not clear what you want for P10, P25, ...

 

Are the percentiles computed separately for males and females? Or are they computed across the entire population, with that single set of percentiles then applied to the males and to the females?
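
To make the two options concrete, here is a hedged sketch. The first step matches the original post (percentiles within each year and sex group); the second drops the BY statement to get one set of cutoffs for the whole population. The names pctByGroup, pctOverall, and totalOverall are illustrative, and the first step assumes myData is sorted by year and sex.

/* Option 1: cutoffs computed separately within each year*sex group */
proc univariate data=myData noprint;
    by year sex;
    weight myWeight;
    var wage;
    output out=pctByGroup pctlpts=1 5 10 25 75 90 95 99 pctlpre=PER_;
run;

/* Option 2: a single set of cutoffs for the entire population */
proc univariate data=myData noprint;
    weight myWeight;
    var wage;
    output out=pctOverall pctlpts=1 5 10 25 75 90 95 99 pctlpre=PER_;
run;

/* Attach the one-row overall cutoffs to every observation */
data totalOverall;
    if _n_ = 1 then set pctOverall;
    set myData;
run;

Under Option 1, roughly 5% of the weight in every group falls at or below that group's P5 by construction; under Option 2, the share at or below the overall P5 can differ between males and females.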

 

 

--
Paige Miller
altadata1
Obsidian | Level 7

@PaigeMiller, thank you for your time and help, but I don't think I've been inconsistent. Please refer to my first post.

I mentioned from the beginning that I have P1, P5, P10, P25, P75, P90, P95, and P99, and I would like to calculate the number of observations within each percentile.


Discussion stats
  • 10 replies
  • 1719 views
  • 2 likes
  • 3 in conversation