MAC1430
Pyrite | Level 9

Hi everyone,

 

I am trying to calculate percentiles for the following data (the actual data set is much larger). Many observations have the same age, 36.0833, so the 70th, 80th, and 90th percentiles are all identical. Is there a way to calculate percentiles so that the large number of tied observations does not affect them, and I get a different value for each percentile?

 

Thanks in advance for your help.

 

 

data have;
infile cards expandtabs truncover;
input stock date : yymmn6. age;
format date yymmn6.;
cards;
10006 196202 36.0833
14656 196202 36.0833
14664 196202 36.0833
14699 196202 36.0833
14701 196202 36.0833
14728 196202 36.0833
14736 196202 36.0833
14760 196202 36.0833
14779 196202 36.0833
14795 196202 36.0833
14816 196202 36.0833
14824 196202 36.0833
14859 196202 36.0833
14867 196202 36.0833
14875 196202 36.0833
14883 196202 36.0833
14891 196202 36.0833
14904 196202 36.0833
14912 196202 36.0833
14920 196202 36.0833
14955 196202 36.0833
15034 196202 36.0833
15499 196202 36.0833
15528 196202 36.0833
15560 196202 36.0833
15755 196202 36.0833
16029 196202 36.0833
16109 196202 36.0833
16117 196202 36.0833
16280 196202 36.0833
19334 196202 36.0833
25486 196202 36.0833
27561 196202 36.0833
27692 196202 36.0833
28513 196202 36.0833
75471 196202 36.0833
10014 196202 36
12298 196202 36
15536 196202 35.9167
15544 196202 35.9167
16985 196202 33.3333
17005 196202 33.3333
17013 196202 33.3333
17056 196202 33.25
17072 196202 33.25
17099 196202 33.25
17101 196202 33.25
17128 196202 33.1667
17144 196202 33.1667
17160 196202 33.1667
21573 196202 33.1667
17224 196202 33.0833
17232 196202 33.0833
17240 196202 33.0833
17267 196202 33.0833
17291 196202 33.0833
17304 196202 33
17312 196202 33
17320 196202 33
17339 196202 33
17347 196202 33
17398 196202 33
17400 196202 32.9167
17435 196202 32.9167
17443 196202 32.9167
17451 196202 32.9167
17478 196202 32.9167
17515 196202 32.8333
17523 196202 32.8333
17558 196202 32.75
17566 196202 32.75
17582 196202 32.75
17590 196202 32.75
17646 196202 32.75
17654 196202 32.75
17830 196202 32.75
17670 196202 32.6667
17689 196202 32.6667
17718 196202 32.6667
17726 196202 32.6667
17734 196202 32.6667
17865 196202 32.5833
17881 196202 32.5833
17910 196202 32.5833
17929 196202 32.5833
17945 196202 32.5
17953 196202 32.5
17961 196202 32.5
18016 196202 32.5
18032 196202 32.5
18040 196202 32.5
18067 196202 32.5
18075 196202 32.4167
18091 196202 32.4167
18112 196202 32.4167
18147 196202 32.4167
;run;
proc univariate data=HAVE noprint;
var age;
by date;
output out=WANT pctlpts = 10 20 30 40 50 60 70 80 90 pctlpre=GR;
run;
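
To see why the upper percentiles coincide, it can help to look at how the tied values are distributed. In the sample above, 36 of the 96 rows share age 36.0833 (37.5%), so every requested percentile above roughly the 62.5th falls on that value. A minimal sketch, assuming the HAVE data set from the question:

```sas
/* Frequency table of AGE within each DATE.                      */
/* The default output includes cumulative percent, which shows   */
/* where each distinct age sits in the distribution.             */
proc freq data=have;
  by date;
  tables age;
run;
```

Any percentile point that lies inside the cumulative-percent range covered by a single age value will return that same age.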

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
On second thought, be careful: if you remove the duplicates, the results no longer meet the definition of a percentile. Is that really what you want?

5 REPLIES
Ksharp
Super User
Then remove those duplicated values:




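/* Keep one row per distinct age within each date */
proc sort data=have out=nodup nodupkey;
  by date age;
run;

/* Percentiles of the distinct age values only */
proc univariate data=nodup noprint;
  var age;
  by date;
  output out=WANT pctlpts = 10 20 30 40 50 60 70 80 90 pctlpre=GR;
run;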

MAC1430
Pyrite | Level 9

Thanks a lot, Ksharp, it's really helpful. I am using it just as a cutoff point, but I will look into the data again. Have a good day 🙂

Reeza
Super User

Look at proc rank instead and how it deals with ties. 

As KSharp mentioned, these are not true percentiles, so be careful not to refer to them as such when describing your analysis.
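
A minimal sketch of the PROC RANK suggestion, assuming the HAVE data set from the question. The names RANKED and AGE_GROUP are hypothetical, and TIES=LOW is only one possible choice; which tie rule fits is an assumption to check against your own analysis.

```sas
/* BY-group processing needs the data sorted by DATE */
proc sort data=have;
  by date;
run;

/* Assign each observation to one of 10 groups (0 to 9) by AGE. */
/* TIES= controls which rank tied values receive:               */
/* LOW, HIGH, MEAN, or DENSE.                                   */
proc rank data=have out=ranked groups=10 ties=low;
  by date;
  var age;
  ranks age_group;
run;

/* Count how many observations land in each group */
proc freq data=ranked;
  by date;
  tables age_group;
run;
```

Because tied values always receive the same rank, they all land in the same group, so some groups may come out empty when one value is heavily repeated.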

MAC1430
Pyrite | Level 9

Thanks Reeza. I guess proc rank will not work for me because I need cutoff points. Many firms have the same beginning date, so a large number of firms have the same age. As a result, the top few percentiles end up with the same age.
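
If cutoff points are the goal, one possible approach is to derive them from the rank groups rather than from PROC UNIVARIATE. This is a sketch under stated assumptions, not a method confirmed in the thread: the names GROUPED, AGE_GROUP, CUTOFFS, LOWER_AGE, and UPPER_AGE are hypothetical, and the results are group boundaries, not percentiles in the strict sense.

```sas
proc sort data=have;
  by date;
run;

/* Step 1: place each observation in one of 10 groups by AGE */
proc rank data=have out=grouped groups=10 ties=low;
  by date;
  var age;
  ranks age_group;
run;

/* Step 2: the smallest and largest AGE in each group serve as  */
/* that group's cutoff points                                   */
proc means data=grouped noprint nway;
  by date;
  class age_group;
  var age;
  output out=cutoffs min=lower_age max=upper_age;
run;
```

Groups that receive no observations because of ties do not appear in CUTOFFS, so there may be fewer than 10 rows per date.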
