BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ronein
Onyx | Level 15

Hello

I want to divide population of variable invoice in data set sashelp.cars into 4 groups by variable  invoice.

In each group there will be equal number of rows .

So, the criteria is using percentile .

Then I need to put the values (range of values) in proc format.

So, in proc format there will be defined 5 groups 

 

How should we do it pls?

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Here's a try with UNIVARIATE:

proc univariate data=sashelp.cars;
var invoice;
output
  out=Pctls
  pctlpts = 10 20 30 40 50 60 70 80 90 100
  pctlpre = invoice
  pctlname = pct10 pct20 pct30 pct40 pct50 pct60 pct70 pct80 pct90 pct100
;
run;

proc sort
  data=sashelp.cars (keep=invoice)
  out=cars
;
by invoice;
run;

proc transpose
  data=pctls
  out=p_long
;
var _all_;
run;

data cntlin;
set p_long;
fmtname = "carspct";
type = "N";
start = ifn(_n_ = 1,0,lag(col1));
end = col1;
label = catx(" ","Percentile",_n_);
keep fmtname type start end label;
run;

proc format cntlin=cntlin;
run;

data carsfmt;
set cars;
format invoice carspct.;
run;

View solution in original post

8 REPLIES 8
PeterClemmensen
Tourmaline | Level 20

Percentile of what?

 

Here is something to get you started

 

proc rank data=sashelp.cars groups=10 descending out=cars;
   var MPG_City;
   ranks rank;
run;
Ronein
Onyx | Level 15

Hello,

I don't understand why the ranges are overlapped???

rank 0 is valued from 35777  till 173560

rank 1 is valued from 25371  till 35688

rank 2 is valued from 18881  till 25218

rank 3 is valued from 9875 till 18821

 

proc sort data=sashelp.cars out=cars;by  invoice ;run;

proc rank data=cars groups=4 descending out=cars;
   var invoice;
   ranks rank;
run;

PROC SQL;
	create table tbl  as
	select  0 as rank,
            min(invoice) as rank_min,
		    max(invoice) as rank_max
	from  cars
	where rank=0
	union all
  select  1 as rank,
            min(invoice) as rank_min,
		    max(invoice) as rank_max
	from  cars
	where rank=1
	union all
  select  2 as rank,
            min(invoice) as rank_min,
		    max(invoice) as rank_max
	from  cars
	where rank=2
		union all
  select  3 as rank,
            min(invoice) as rank_min,
		    max(invoice) as rank_max
	from  cars
	where rank=3

;
QUIT;

  

ballardw
Super User

@Ronein wrote:

Hello,

I don't understand why the ranges are overlapped???

rank 0 is valued from 35777  till 173560

rank 1 is valued from 25371  till 35688

rank 2 is valued from 18881  till 25218

rank 3 is valued from 9875 till 18821

  


What overlap? There is no overlap there. 35777 is larger than 35688, 25371 is larger than 25218, 18881 is larger than 18821.

 

Kurt_Bremser
Super User

Here's a try with UNIVARIATE:

proc univariate data=sashelp.cars;
var invoice;
output
  out=Pctls
  pctlpts = 10 20 30 40 50 60 70 80 90 100
  pctlpre = invoice
  pctlname = pct10 pct20 pct30 pct40 pct50 pct60 pct70 pct80 pct90 pct100
;
run;

proc sort
  data=sashelp.cars (keep=invoice)
  out=cars
;
by invoice;
run;

proc transpose
  data=pctls
  out=p_long
;
var _all_;
run;

data cntlin;
set p_long;
fmtname = "carspct";
type = "N";
start = ifn(_n_ = 1,0,lag(col1));
end = col1;
label = catx(" ","Percentile",_n_);
keep fmtname type start end label;
run;

proc format cntlin=cntlin;
run;

data carsfmt;
set cars;
format invoice carspct.;
run;
Ronein
Onyx | Level 15

May you please explain why proc rand is not working well?

I expect that in each group there will be non-overlapped values.

proc sort data=sashelp.cars out=cars;by  invoice ;run;
proc rank data=cars groups=10 descending out=cars;
   var invoice;
   ranks rank;
run;

 

PaigeMiller
Diamond | Level 26

@Ronein wrote:

May you please explain why proc rand is not working well?

I expect that in each group there will be non-overlapped values.

proc sort data=sashelp.cars out=cars;by  invoice ;run;
proc rank data=cars groups=10 descending out=cars;
   var invoice;
   ranks rank;
run;

 


What do you mean by "not working well"? PROC RANK will not produce overlapping groupings, except in the case of ties in which case there can be (and should be) some overlap. Please show us what you are seeing that doesn't work.

--
Paige Miller
Ronein
Onyx | Level 15

 

proc sort data=sashelp.cars out=cars;by  invoice ;run;
proc rank data=cars groups=4 descending out=cars;
   var invoice;
   ranks rank;
run;

 

How can it happen that invoice value in first group is between 35777  till 173560

and invoice value in second  group is between 25371  till 35688

I expect that minimum value in second group will be 173560 !!!

 

PaigeMiller
Diamond | Level 26

@Ronein wrote:

 

I expect that minimum value in second group will be 173560 !!!


I don't understand this, why do you expect that? I expect the MAXIMUM value in the FIRST group to be 173560, and that the MINIMUM value in the first group to be less than 173560; and that the maximum value in the other groups would be less than 173560, and also less than or equal to the minimum of the previous group, and that is what I see.

 

proc rank data=sashelp.cars groups=4 descending out=cars;
    var invoice;
    ranks rank;
run;
proc summary nway data=cars;
    class rank;
    var invoice;
    output out=stats min=min_invoice max=max_invoice;
run;

 

--
Paige Miller

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 3638 views
  • 1 like
  • 5 in conversation