Tanguy
Calcite | Level 5

Dear SAS users,

 

I am struggling a bit with calculating percentiles across observations using PROC UNIVARIATE.

 

My dataset is as follows:

 

         V1   V2   V3   PercentileV1   PercentileV2   PercentileV3
Bank1
Bank2

 

I need to create the variables PercentileV1, PercentileV2 and PercentileV3 for a defined population of banks.

 

Has anyone used PROC UNIVARIATE to do this?

 

In advance, thanks a lot,

Best regards,

 

 

6 REPLIES
Rick_SAS
SAS Super FREQ

For data in wide form (each bank in its own variable), the article "Output percentiles of multiple variables in a tabular format" contains your answer. It shows how to use PROC MEANS (near top) and PROC UNIVARIATE (scroll down).

 

For data in long form (banks identified by the value of a categorical variable), you can use the CLASS statement as follows:

 

proc means data=sashelp.class p25 p50 p75;
class sex;
var height;
run;
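
If the percentiles are needed in a data set rather than in printed output, a minimal sketch along the same lines might look like this. It assumes the sashelp.class sample data; the output data set names Pctl and UniPctl and the prefixes Height_ and Weight_ are only illustrative choices.

```sas
/* PROC MEANS: one output row per level of sex;               */
/* AUTONAME creates names such as Height_P25 and Weight_P50   */
proc means data=sashelp.class noprint nway;
  class sex;
  var height weight;
  output out=Pctl p25= p50= p75= / autoname;
run;

/* PROC UNIVARIATE: PCTLPRE= takes one prefix per VAR variable, */
/* giving names such as Height_25 and Weight_75                 */
proc univariate data=sashelp.class noprint;
  class sex;
  var height weight;
  output out=UniPctl pctlpre=Height_ Weight_ pctlpts=25 50 75;
run;

proc print data=Pctl noobs; run;
proc print data=UniPctl noobs; run;
```

PROC MEANS only offers a fixed set of percentile keywords (P25, P50, P75 and a few others), whereas PCTLPTS= in PROC UNIVARIATE accepts arbitrary percentile points.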

 

 

Tanguy
Calcite | Level 5
Dear,
That is interesting indeed, thanks a lot!
Actually, the value of PercentileV1, V2 and V3 should be the percentile interval in which the bank falls, e.g. Bank1 lies between the 56th and 57th percentiles.
Do you have any idea how to tackle this?
user24feb
Barite | Level 11

You could search for the pctlpts-option: http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_univaria...

 

And I think the result should be like this:

 

Data A;
  Do ID='BANK1','BANK2','BANK3';
    Do j=0 To 100;
      Output;
    End;
  End;
Run;

Proc Univariate Data=A NoPrint;
  Var j;
  By ID;
  Output Out=Result PctlPre=PERC_ PctlPts=30, 70 to 85 by 5; 
Run;
Tanguy
Calcite | Level 5
Dear,
That is interesting indeed, thanks a lot!
Actually, the value of PercentileV1, V2 and V3 should be the percentile interval in which the bank falls, e.g. Bank1 lies between the 56th and 57th percentiles.
I was not very clear in my question.
Do you have any idea how to tackle this?

user24feb
Barite | Level 11
Accepted solution

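* Simulate 1000 banks, each with one random integer value between 0 and 999;
Data A;
  Length ID $10.;
  Do i=1 To 1000;
    ID=Catt('Bank_',Put(i,Z4.));
    Value=Int(Ranuni(1)*1000);
    Output;
  End;
  Drop i;
Run;

* Write the 1st to 100th percentiles of Value to one row: PERC_1 to PERC_100;
Proc Univariate Data=A NoPrint;
  Var Value;
  Output Out=Result PctlPre=PERC_ PctlPts=1 to 100 by 1;
Run;

* Attach the percentile row to every bank and find the first percentile above its value;
Data B (Keep=ID Value Info);
  Length Info $20.;
  Set A;
  If _N_ eq 1 Then Do;
    Set Result; * read once, the values are retained for all later rows;
  End;
  Array P Perc_:;
  * Info=PERC_k means Value is below the k-th percentile and not below the (k-1)-th;
  * Info stays blank when Value equals the maximum (PERC_100);
  Do i=1 To Dim(P);
    If Missing (Info) & Value lt P[i] Then Info=VName(P[i]);
  End;
Run;

* .. only to check;
Proc Sort Data=B;
  By Value;
Run;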
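
For the layout in the original question (one row per bank, variables V1 to V3), a different technique, PROC RANK with GROUPS=100, might be a shorter route. This is only a sketch under assumptions: the data set Banks, its simulated values, and the variable IntervalV1 are hypothetical, and the groups PROC RANK forms are based on ranks, so they can differ slightly from the percentile definitions PROC UNIVARIATE uses.

```sas
/* Hypothetical input: one row per bank with numeric V1-V3 */
data Banks;
  length Bank $10;
  call streaminit(1);
  do i = 1 to 1000;
    Bank = cats('Bank', i);
    V1 = rand('uniform') * 1000;
    V2 = rand('normal', 50, 10);
    V3 = rand('exponential');
    output;
  end;
  drop i;
run;

/* GROUPS=100 assigns each value a group number from 0 to 99 */
proc rank data=Banks groups=100 out=Ranked;
  var V1 V2 V3;
  ranks PercentileV1 PercentileV2 PercentileV3;
run;

/* Optional: turn group k into an interval label such as 56-57 */
data Ranked;
  set Ranked;
  length IntervalV1 $12;
  IntervalV1 = catx('-', PercentileV1, PercentileV1 + 1);
run;
```

A group number of 56 can then be read as "between the 56th and 57th percentile". Adding a BY statement to PROC RANK (on sorted data) would rank banks within sub-populations instead of across the whole data set.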

 

 

Tanguy
Calcite | Level 5
That is absolutely great, thank you!
