Percentile in SAS

01-31-2015 01:33 PM

Hi everybody,

Could you please help me solve a problem with percentiles? I have many companies for each month of a year. For the first month, I need to take the 20th percentile of those companies' prices and exclude the companies that lie below it. Then I need to go to the next month, take the 20th percentile of that month's prices, remove the firms that lie below it, and so on. Each month will therefore have a different 20th percentile, since prices differ from month to month. I also need to keep the 20th percentile threshold value itself (say, in an extra column that repeats the month's threshold for every firm left after the removal).

I know this is not a very difficult problem, but any help would be hugely appreciated.

Kind regards,

Ruslan

All Replies

01-31-2015 01:49 PM

You should post some sample data.

Proc univariate or proc means can calculate the 20th percentile, and you'll then have to merge that into your original data set to apply the cutoff.

Or you could use proc rank but that won't give you the percentile value.

Here's how to calculate the 20th percentile.

proc means data=sashelp.stocks noprint nway;
    class date;
    format date monyy5.;
    var open;
    output out=step1 p20=p20;
run;

or proc rank example:

proc sort data=sashelp.stocks out=stocks; by date; run;

/* groups=5 assigns ranks 0-4; rank 0 is the lowest fifth, so keep ranks above 0 */
proc rank data=stocks out=ranked(where=(open_rank>0)) groups=5;
    by date;
    format date monyy5.;
    var open;
    ranks open_rank;
run;

01-31-2015 02:06 PM

Thank you very much, Reeza, for your answer.

Here is the sample data:

Date (Month)  Company  Price
1             Firm 1   3
1             Firm 2   7
1             Firm 3   5
1             Firm 4   9
1             Firm 5   2
1             Firm 6   7.5
1             Firm 7   10
2             Firm 1   6
2             Firm 2   2
2             Firm 4   9
2             Firm 5   3
2             Firm 6   8
2             Firm 7   1
2             Firm 8   10

Now, based on "Price", I need to get the 20th percentile value for each month and then remove the firms that lie below it. Additionally, I need to keep this 20th percentile value, either in a separate file or in an extra column "threshold" that repeats the value for all stocks left after the removal.

I would be grateful if you could please provide the code for the above sample data.

Kind regards,

Ruslan

01-31-2015 04:45 PM

The structure is pretty much identical to sashelp.stocks, so the code I posted should convert.

02-02-2015 10:39 AM

Are your "months" ever going to cross years such that you have data from Jan 2014 and Jan 2015? Do you want to treat both of those as the same or different "months"?

02-02-2015 05:47 PM

Year matters.

So the 20th percentile based on companies in January 2014 is different from the percentile based on firms in January 2015 (of course if prices are different).

02-02-2015 05:56 PM

How do you identify what year a month belongs to? Add that variable to your class statement to ensure that you get data by month/year.
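A minimal sketch of that suggestion, assuming the data carries separate numeric `year` and `month` variables. The data set `prices_all`, its variable names, and its values are hypothetical and are not taken from the poster's files.

```sas
/* Hypothetical data: the same month number appears in two different years */
data prices_all;
    input year month price;
    datalines;
2014 1 3
2014 1 7
2014 1 5
2014 1 9
2014 1 2
2015 1 6
2015 1 4
2015 1 9
2015 1 3
2015 1 8
;
run;

/* Listing both variables in CLASS gives one 20th percentile per year/month pair */
proc univariate data=prices_all noprint;
    class year month;
    var price;
    output out=thresholds pctlpts=20 pctlpre=p;
run;
```

The output data set `thresholds` should then hold one row per year and month, so January 2014 and January 2015 get separate cutoffs. PROC UNIVARIATE accepts at most two CLASS variables, so a finer grouping would need BY processing on sorted data instead.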

02-02-2015 06:01 PM

Thanks, Reeza, for your reply.

I am working with yearly files, so within one file the year will be the same and only the months will differ (that's why I put only months in the sample data).

I have tried your code to get the 20th percentile, and SAS gives me an error at "p20=p20". I assume this is due to my old version of SAS (I am using SAS 9.2). Is there a way to solve this problem in my version of SAS?

02-02-2015 06:05 PM

Show the full code and error message. The base procedures generating percentiles have been around since version 6 (at least), so it is very unlikely to be the SAS version.

02-02-2015 06:10 PM

proc means data=t_nyse noprint nway;
    class date;
    format date ddmmyy10.;
    var cap;
    output out=step1 p20=p20;
run;

This is the code. SAS displays an error at "p20=p20". Any help would be hugely appreciated.

02-02-2015 06:13 PM

ERROR 22-322: Syntax error, expecting one of the following: ;, (, /, CSS, CV, IDGROUP, IDGRP, KURTOSIS, LCLM, MAX, MAXID, MEAN, MEDIAN, MIN, MINID, MODE, N, NMISS, OUT, P1, P10, P25, P5, P50, P75, P90, P95, P99, PROBT, Q1, Q3, QRANGE, RANGE, SKEWNESS, STDDEV, STDERR, SUM, SUMWGT, T, UCLM, USS, VAR.

ERROR 76-322: Syntax error, statement will be ignored

This is the error message.

Solution

02-02-2015 06:29 PM

SAS 9.2 is ancient, and P20 is not among the PROC MEANS keywords listed in that error message.

Switch to PROC UNIVARIATE with PCTLPTS:

proc univariate data=sashelp.stocks noprint;
    class date;
    format date monyy5.;
    var open;
    output out=step1 pctlpts=20 pctlpre=p;
run;
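The code above stops at the percentile itself. As a rough sketch of the remaining steps (merging the threshold back and applying the cutoff), here is the same approach applied to the sample data posted earlier in the thread. The data set names `prices`, `thresholds`, and `kept` and the column name `threshold` are illustrative assumptions, not names from the original posts.

```sas
/* Sample data from the thread. Two blanks follow each company name so the
   & modifier can read a name that contains a single blank. */
data prices;
    input month company & $10. price;
    datalines;
1 Firm 1  3
1 Firm 2  7
1 Firm 3  5
1 Firm 4  9
1 Firm 5  2
1 Firm 6  7.5
1 Firm 7  10
2 Firm 1  6
2 Firm 2  2
2 Firm 4  9
2 Firm 5  3
2 Firm 6  8
2 Firm 7  1
2 Firm 8  10
;
run;

/* One row per month; pctlpre=p combined with pctlpts=20 names the variable p20 */
proc univariate data=prices noprint;
    class month;
    var price;
    output out=thresholds pctlpts=20 pctlpre=p;
run;

/* Both inputs must be sorted by the merge key */
proc sort data=prices; by month; run;
proc sort data=thresholds; by month; run;

/* Repeat each month's threshold on every firm, then drop firms priced below it */
data kept;
    merge prices thresholds(rename=(p20=threshold));
    by month;
    if price >= threshold;
run;
```

With the default percentile definition (PCTLDEF=5), the thresholds should come out as 3 for month 1 and 2 for month 2, so Firm 5 is dropped from month 1 and Firm 7 from month 2. Firms priced exactly at the threshold are kept, because the request was to remove only those below it.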

02-02-2015 06:34 PM

Yes, this code works perfectly. Thank you very much, Reeza, for all your help!