03-01-2016 03:11 PM

Hi all,

I'm wondering whether there is a way to calculate a percentile across variables for each observation while excluding missing values from the calculation.

The percentile function seems to work, but if there is even a single missing value among the variables, the result is a missing value.

The only workaround I found was to do the calculation in Excel, as my dataset fortunately wasn't too large.

Thanks for your help!

All Replies

03-01-2016 03:19 PM

I would recommend a transpose of your data and then using a more robust procedure such as proc univariate or proc means to calculate the percentile.

However, the behaviour you're describing is not consistent with the PCTL documentation, which states:

The PCTL function returns the percentile of the __nonmissing values__ corresponding to the percentage.

I can't replicate that behaviour either. Can you post a sample of your data where this was occurring?

```
data test;
array pct(100) pct1-pct100 (1:100);
x=pctl(50, of pct(*));
do i=20 to 30;
pct(i)=.;
end;
y=pctl(50, of pct(*));
keep x y;
run;
```
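
As a rough sketch of the transpose approach recommended above: the dataset `wide`, the key `id`, and the columns `MID1` and `MID2` are hypothetical stand-ins, and only `A` and `VLU` echo names used in this thread. It assumes one row per `id`, sorted by `id`.

```
/* Hypothetical sample: one row per id, one numeric column per ticker */
data wide;
  input id A MID1 MID2 VLU;
  datalines;
1 10 . 30 40
2 5 15 . .
;
run;

/* Reshape to one row per (id, ticker) pair */
proc transpose data=wide out=long(rename=(col1=value)) name=ticker;
  by id;
  var A--VLU;
run;

/* PROC MEANS computes statistics from the nonmissing values only */
proc means data=long noprint;
  by id;
  var value;
  output out=pctls p95=p95;
run;
```

The output dataset `pctls` should then hold one 95th percentile per `id`. Other percentile keywords such as `p50=` can be added to the same OUTPUT statement.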

Posted in reply to Reeza

03-01-2016 05:52 PM

My data is structured similarly to your test dataset, but my variable names are not standardized: they are the tickers of various companies on NASDAQ.

The way you define the variables to calculate percentile across would not work in my instance, correct?

What I tried was:

`percentile= PCTL(95,A--VLU);` where A and VLU are the first and last variables I want to calculate the percentile across.

Thanks!

03-01-2016 06:04 PM

If you can define it that way, you could also define your array similarly.

`array stocks(*) A--VLU;`

At any rate, I still can't replicate your issue. Please post your code and sample data that replicates your problem. I'm guessing you actually have some other issue.

```
data test;
array pct(100) pct1-pct100 (1:100);
x=pctl(50, of pct(*));
do i=20 to 30;
pct(i)=.;
end;
y=pctl(50, of pct(*));
z=pctl(95, pct1--pct100);
keep x y z;
run;
proc print;run;
```
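
For variables without numbered names, a minimal sketch of the array idea might look like the following. The dataset `prices` and the columns `MID1` and `MID2` are hypothetical; `A` and `VLU` are the first and last variables named in the question, and all columns are assumed to be numeric.

```
/* Hypothetical sample: ticker-named columns, some values missing */
data prices;
  input A MID1 MID2 VLU;
  datalines;
10 . 30 40
5 15 . .
;
run;

data want;
  set prices;
  /* A--VLU is a name range list: every variable positioned from A through VLU */
  array stocks(*) A--VLU;
  /* OF passes the array elements as separate arguments; PCTL uses the nonmissing ones */
  percentile = pctl(95, of stocks(*));
run;
```

If character variables sit between the first and last ticker, the list `A-numeric-VLU` restricts the range to numeric variables.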

Posted in reply to Reeza

03-02-2016 04:15 AM

Hi @Reeza,

I think you should insert an "of" into your definition of z. Otherwise, `pct1--pct100` is read as an arithmetic expression, pct1 - (-pct100) = pct1 + pct100 = 101, and the 95th percentile of that single value (a singleton set) will be calculated.
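
A small sketch of the difference, reusing the test array from earlier in the thread (the variable names `no_of` and `with_of` are made up for illustration):

```
data _null_;
  array pct(100) pct1-pct100 (1:100);
  /* Without OF: one numeric argument, pct1 - (-pct100) = 101 */
  no_of = pctl(95, pct1--pct100);
  /* With OF: the name range list expands to all 100 variables */
  with_of = pctl(95, of pct1--pct100);
  put no_of= with_of=;
run;
```

The log should show 101 for `no_of`, while `with_of` is computed across all 100 values.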

Solution

03-07-2016 01:07 PM

Posted in reply to FreelanceReinhard

03-02-2016 09:17 PM

Thanks!

The corrected code is below and still does not replicate the problem:

```
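data test;
/* pct1-pct100 initialised to the values 1 through 100 */
array pct(100) pct1-pct100 (1:100);
/* median with no missing values */
x=pctl(50, of pct(*));
/* set pct20-pct30 to missing */
do i=20 to 30;
pct(i)=.;
end;
/* same call via the array reference, now with missing values present */
y=pctl(50, of pct(*));
/* name range list, with OF */
z=pctl(50, of pct1--pct100);
/* explicit argument list, four of which are missing */
q=pctl(50, pct1, pct2, pct3, pct21, pct25, pct29, pct30, pct50, pct99, pct100);
keep x y z q;
run;
proc print;run;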
```

@tts Have you been able to replicate your issue?