
# Calculate 2.5 percentile and 97.5 percentile

Hello:

I have a relatively large data set (around 100,000 rows). I used the following code to calculate the 5th and 95th percentiles:

proc means data=fulldata3;
VAR DV;
CLASS GROUP TREA CMT STIM;
output out=percentiles2 mean=mean P5=P5 P95=P95;
run;

The data have 30 GROUPs; each GROUP has 4 TREAs, each TREA has 4 CMTs, and every CMT has the same STIM (scheduled time). However, the code above cannot calculate the 2.5th and 97.5th percentiles, because PROC MEANS has no P2.5 or P97.5 statistic keywords.

I was trying to use
proc univariate data=fulldata3;
VAR DV;
CLASS GROUP TREA CMT STIM;
output out=percentiles1 mean=mean pctlpts=2.5 97.5 pctlpre=P;
run;

The error message is "ERROR: Cannot specify more than two CLASS variables"

Then, I tried to use
proc univariate data=fulldata3;
VAR DV;
By GROUP TREA CMT STIM;
output out=percentiles1 mean=mean pctlpts=2.5 97.5 pctlpre=P;
run;

SAS kept prompting "windows is full and must be cleared select", and after I made a selection it prompted again. I had to stay there and keep clicking "C to clear windows without saving". After a while I got the output, but the column title looked like this:
"the 2.5000 percentile, DV". This name is not recognized as a variable, and I need to process it further.

Thank you

Accepted Solutions
Solution
‎07-03-2017 02:15 PM

## Re: Calculate 2.5 percentile and 97.5 percentile

The code by data _null_ should work. I just wrote a demonstration program (with a few class factors), and you definitely get a variable called P2_5. Everything works fine.

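data a;
  /* 5 x 3 x 7 factor combinations, 200 replicates each */
  do A = 1 to 5;
    do B = 1 to 3;
      do C = 1 to 7;
        do rep = 1 to 200;
          y = rannor(1); /* standard normal draw, seed 1 */
          output;
        end;
      end;
    end;
  end;
run;

proc print data=a(obs=300);
run;

/* Data set a is created in A B C order, so BY works without PROC SORT */
proc univariate data=a;
  by A B C;
  var y;
  output out=percentiles1 mean=mean pctlpts=2.5 97.5 pctlpre=P;
run;

proc print data=percentiles1;
run;

/* P2_5 is the variable NAME built from pctlpre=P and pctlpts=2.5 */
proc means data=percentiles1;
  var P2_5;
run;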

All Replies

## Re: Calculate 2.5 percentile and 97.5 percentile

Hello:

I think the NOPRINT option should stop the repeated "windows is full and must be cleared select" prompts. The remaining question is how to get a variable name in the PROC UNIVARIATE output data set that I can use in later processing. The column heading right now looks like "the 2.5000 percentile, DV".
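
For reference, a minimal sketch of the BY-group approach with NOPRINT, using the data set and variable names from the first post. The sorted copy `fulldata3_sorted` is a hypothetical name introduced here; the sort step is included on the assumption that `fulldata3` is not already ordered by the BY variables.

```sas
/* BY-group processing expects the input sorted by the BY variables. */
/* fulldata3_sorted is a hypothetical name for the sorted copy.      */
proc sort data=fulldata3 out=fulldata3_sorted;
  by GROUP TREA CMT STIM;
run;

/* NOPRINT suppresses the printed tables for every BY group, */
/* so only the OUT= data set is produced.                    */
proc univariate data=fulldata3_sorted noprint;
  by GROUP TREA CMT STIM;
  var DV;
  output out=percentiles1 mean=mean pctlpts=2.5 97.5 pctlpre=P;
run;

/* List the variable names and labels in the output data set. */
proc contents data=percentiles1;
run;
```

The PROC CONTENTS listing is the quickest way to see which names the OUTPUT statement actually created.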

## Re: Calculate 2.5 percentile and 97.5 percentile

I could be missing something. Do you want the percentiles across all data, or the percentiles within each sub-sub-subclass? I assume the latter.

I ran this code (with the NOPRINT option) against a dataset with several clinical chemistry endpoints and groups. The output dataset contained the following (output from PROC CONTENTS):

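Alphabetic List of Variables and Attributes

| # | Variable | Type | Len | Format | Informat | Label |
|---|----------|------|-----|--------|----------|-------|
| 4 | GRP_NO | Num | 8 | | | |
| 2 | MEAS_CD | Num | 8 | 11. | 11. | MEAS_CD |
| 6 | P2_5 | Num | 8 | | | the 2.5000 percentile, value |
| 7 | P97_5 | Num | 8 | | | the 97.5000 percentile, value |
| 1 | insum | Num | 8 | | | |
| 5 | mean | Num | 8 | | | the mean, value |
| 3 | measureM | Char | 91 | | | |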

(Sorry about the wrapping, and this may not look so good in a proportional font)

Anyway, the variables you need would all be there, with proper values for each of your by variables.

Steve Denham


## Re: Calculate 2.5 percentile and 97.5 percentile

Hi SteveDenham:

Thanks for your reply. The issue is with the label for the percentile, e.g. "the 2.5000 percentile, value". I need to use this variable later, but because the label contains a comma (and maybe other things I am not aware of), I cannot access the 2.5th percentile in the PROC UNIVARIATE output data set, as in the following example.

proc means data=output; /* output is the data set produced by PROC UNIVARIATE */
var the 2.5000 percentile, value;
run;

This code did not work: "the 2.5000 percentile, value" is not recognized.

## Re: Calculate 2.5 percentile and 97.5 percentile

You are confusing NAMES and LABELS.

[pre]
proc sort data=sashelp.class out=class;
by sex;
proc univariate data=class;
by sex;
var height;
output out=percentiles1 mean=mean pctlpts=2.5 97.5 pctlpre=P;
run;

Obs Sex mean P2_5 P97_5

1 F 60.5889 51.3 66.5
2 M 63.9100 57.3 72.0
[/pre]
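
To make the name/label distinction concrete, here is a short sketch that assumes the `percentiles1` data set created by the PROC UNIVARIATE step above. It prints the same data twice, once with variable names as column headers and once with labels, and then refers to the percentile variables by name.

```sas
/* Default PROC PRINT: column headers are variable NAMES (P2_5, P97_5). */
proc print data=percentiles1;
run;

/* With the LABEL option: column headers are the LABELS */
/* (for example "the 2.5000 percentile, height").       */
proc print data=percentiles1 label;
run;

/* Statements such as VAR take names, not labels. */
proc means data=percentiles1;
  var P2_5 P97_5;
run;
```

The names come from the PCTLPRE= prefix followed by the PCTLPTS= value, with the decimal point replaced by an underscore.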

## Re: Calculate 2.5 percentile and 97.5 percentile

Hi data _null_;

I think there is a misunderstanding between us.

Based on your code, what I intend to do is the following:

proc sort data=sashelp.class out=class;
by sex;
proc univariate data=class;
by sex;
var height;
output out=percentiles1 mean=mean pctlpts=2.5 97.5 pctlpre=P;
run;

proc means data=percentiles1;
var P2_5;
run;

/*the last piece of code will not work because there is no P2_5 in percentiles1 */

## Re: Calculate 2.5 percentile and 97.5 percentile

You are going to have to show your work. When I run my program there IS a variable P2_5.

## Re: Calculate 2.5 percentile and 97.5 percentile

Hello:

Thanks for your help, guys. P2_5 works. I did not know that P2_5 is the variable name.
I was indeed "confusing NAMES and LABELS".

Thanks again

## Re: Calculate 2.5 percentile and 97.5 percentile

Hi.
Another workaround is to use PROC RANK with GROUPS=1000; the values that land in rank group 25 then sit at approximately
the 2.5th percentile.

Ksharp
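
A rough sketch of that PROC RANK idea, assuming the demonstration data set `a` (variable `y`) from the accepted solution and ignoring the BY groups for simplicity. The names `ranked` and `y_grp` are hypothetical. The result is only an approximation and need not match PROC UNIVARIATE's percentile definition exactly; with small data sets some rank groups can be empty, which is why the WHERE clause uses `>=` rather than `=`.

```sas
/* GROUPS=1000 assigns each observation a group number from 0 to 999. */
proc rank data=a out=ranked groups=1000;
  var y;
  ranks y_grp;   /* hypothetical name for the group-number variable */
run;

/* The smallest y at or above group 25 approximates the 2.5th percentile. */
proc means data=ranked min;
  where y_grp >= 25;
  var y;
run;
```

For exact percentiles at arbitrary points, the PCTLPTS= approach shown earlier in the thread is the more direct route.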
🔒 This topic is solved and locked.