BookmarkSubscribeRSS Feed
varun1220
Fluorite | Level 6

I have a table of 6 columns and 8760 rows filled with numerical values. I need to find 99 percentiles for each row which has 6 values. 

 

Below is the gist of the table:

 

 

Load_04 Load_05 Load_06 Load_07 Load_08 Load_09
2433.896 2323.323 2720.163 2708.686 2742.605 2925.509
2381.1 2299.548 2649.212 2576.808 2638.918 2717.989
2421.829 2376.882 2595.548 2474.217 2604.292 2749.657
2485.082 2487.375 2595.651 2434.356 2579.163 2728.158
2559.993 2573.468 2607.336 2459.65 2602.624

2795.224

 

I want to calculate all the percentiles for each row for combined six values. 

Ex: For first row, I want the 99 percentiles of the values 

2433.896 2323.323 2720.163 2708.686 2742.605 2925.509

  together. 

Similarly for each row. 

 

I have tried proc univariate for this purpose, the result is not for every row but for every column. Hence I tried transposing the table but SAS can't take 8760 variables at a time. 

 

Kindly suggest a solution. 

 

Thanks a lot in advance. 

 

Varun. 

14 REPLIES 14
Jagadishkatam
Amethyst | Level 16

i believe we need  to transpose data to columns and then use proc means to calculate the percentile

data have;
input Load_04 	Load_05 	Load_06 	Load_07 	Load_08 	Load_09;
cards;
2433.896 	2323.323 	2720.163 	2708.686 	2742.605 	2925.509
2381.1 	2299.548 	2649.212 	2576.808 	     2638.918 	    2717.989
2421.829 	2376.882 	2595.548 	2474.217 	2604.292 	2749.657
2485.082 	2487.375 	2595.651 	2434.356 	2579.163 	2728.158
2559.993 	2573.468 	2607.336 	2459.65 	2602.624 	2795.224
;

proc transpose data=have out=want;
var  Load_04 	Load_05 	Load_06 	Load_07 	Load_08 	Load_09;
run;

proc means data= want ;
var  col1-col5;
output out=p99 p99=/autoname;
run;

Thanks,

Jag

Thanks,
Jag
stat_sas
Ammonite | Level 13

Use pctl function to get percentile for each row:

 

data want(drop=i);
set have;
array load(*) load:;
do i=1 to dim(load);
p_99=pctl(99,of load(*));
end;
run;

varun1220
Fluorite | Level 6

Thank you for the asnwer. This seems very helpful. 

 

Could you please tell me what does the function

 

array load(*)load: 

 

do?

 

What does load(*) stand for? 

 

Kindly reply as per your convenience. 

stat_sas
Ammonite | Level 13

array load(*) load:;

 

load is name of array and (*) indicates the dimension of array, represents number of elements in the array which is 6 in this case. If we indicate dimension of the array using (*) then SAS determines demension of the array by counting number of elements.

 

 

load: represents 6 elements in the array (load1 load2 load3 load4 load5 load6) 

 

array load(*) load:;

 

can also be written as

 

array load(6) load1 load2 load3 load4 load5 load6;

varun1220
Fluorite | Level 6

Great. Thank you so much. 

varun1220
Fluorite | Level 6

Thank you Jagdish. This is a helpful answer. 

 

Varun. 

Reeza
Super User
The 99th percentile will either be the maximum or the second largest value, when you have only 6 values. Look at using the largest or max function as it may be more efficient than the percentile function.
Rick_SAS
SAS Super FREQ

As Reeza pointed out, this exercise doesn't make sense. In general, you need more than 100 values before the 99th percentile becomes meaningful. It looks like your data does not contain duplicate or rounded values, so the only relevant percentiles are 16.6th, 33.3th,...,and 87.6th.  If you want the max of each row, that is easy to do with the DATA step and an array.

varun1220
Fluorite | Level 6

I need to calculate all the 99 percentiles for my assignment as I'm performing probablistic forecasting. 

Reeza
Super User
Sometimes its good to think about why you do what you're doing. Would your answer be different in this case if you used the 95th percentile? Or 90th percentile? Most likely not because you only have 6 variables so they would all end up being the same number.
varun1220
Fluorite | Level 6

Sincere apologies for a delayed reply. 

 

I was implemented this scenario in excel, the values weren't similar. I almost got 99 different values when I used the PERCENTILE.INC function over the 6 variables I mentioned earlier in the post. 

 

Another point, would the logic be same if I want to get 100 percentiles of 20 variables instead of 6? 6 was just an example. 

 

Thank you,

Varun. 

 

  

Reeza
Super User
100 percentiles is the maximum.

Percentiles have various definitions on how to calculate them, mostly around how to handle ties.

SAS supports 5 definitions through proc univariate and I don't think any align to Excel definitions.

https://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_univari...

varun1220
Fluorite | Level 6

I am in situation where I have more than 99 values and I need calculate the 99 percentiles of each of set of values. The table consists of 8760 rows and more than 100 columns. 

 

I need to calculate 99 percentiles for each row. 

 

Kindly revert as per your convenience. 

 

Reeza
Super User
The answers from Nov 2 are correct. Either transpose your data and use proc means or use the percentile function. If those dont work you need to explain why so we can help further.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 3455 views
  • 2 likes
  • 5 in conversation