BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ronein
Meteorite | Level 14

Hello

I want to divide population of variable invoice in data set sashelp.cars into 4 groups by variable  invoice.

In each group there will be equal number of rows .

So, the criteria is using percentile .

Then I need to put the values (range of values) in proc format.

So, in proc format there will be defined 5 groups 

 

How should we do it pls?

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Here's a try with UNIVARIATE:

proc univariate data=sashelp.cars;
var invoice;
output
  out=Pctls
  pctlpts = 10 20 30 40 50 60 70 80 90 100
  pctlpre = invoice
  pctlname = pct10 pct20 pct30 pct40 pct50 pct60 pct70 pct80 pct90 pct100
;
run;

proc sort
  data=sashelp.cars (keep=invoice)
  out=cars
;
by invoice;
run;

proc transpose
  data=pctls
  out=p_long
;
var _all_;
run;

data cntlin;
set p_long;
fmtname = "carspct";
type = "N";
start = ifn(_n_ = 1,0,lag(col1));
end = col1;
label = catx(" ","Percentile",_n_);
keep fmtname type start end label;
run;

proc format cntlin=cntlin;
run;

data carsfmt;
set cars;
format invoice carspct.;
run;

View solution in original post

8 REPLIES 8
PeterClemmensen
Tourmaline | Level 20

Percentile of what?

 

Here is something to get you started

 

proc rank data=sashelp.cars groups=10 descending out=cars;
   var MPG_City;
   ranks rank;
run;
Ronein
Meteorite | Level 14

Hello,

I don't understand why the ranges are overlapped???

rank 0 is valued from 35777  till 173560

rank 1 is valued from 25371  till 35688

rank 2 is valued from 18881  till 25218

rank 3 is valued from 9875 till 18821

 

proc sort data=sashelp.cars out=cars;by  invoice ;run;

proc rank data=cars groups=4 descending out=cars;
   var invoice;
   ranks rank;
run;

PROC SQL;
	create table tbl  as
	select  0 as rank,
            min(invoice) as rank_min,
		    max(invoice) as rank_max
	from  cars
	where rank=0
	union all
  select  1 as rank,
            min(invoice) as rank_min,
		    max(invoice) as rank_max
	from  cars
	where rank=1
	union all
  select  2 as rank,
            min(invoice) as rank_min,
		    max(invoice) as rank_max
	from  cars
	where rank=2
		union all
  select  3 as rank,
            min(invoice) as rank_min,
		    max(invoice) as rank_max
	from  cars
	where rank=3

;
QUIT;

  

ballardw
Super User

@Ronein wrote:

Hello,

I don't understand why the ranges are overlapped???

rank 0 is valued from 35777  till 173560

rank 1 is valued from 25371  till 35688

rank 2 is valued from 18881  till 25218

rank 3 is valued from 9875 till 18821

  


What overlap? There is no overlap there. 35777 is larger than 35688, 25371 is larger than 25218, 18881 is larger than 18821.

 

Kurt_Bremser
Super User

Here's a try with UNIVARIATE:

proc univariate data=sashelp.cars;
var invoice;
output
  out=Pctls
  pctlpts = 10 20 30 40 50 60 70 80 90 100
  pctlpre = invoice
  pctlname = pct10 pct20 pct30 pct40 pct50 pct60 pct70 pct80 pct90 pct100
;
run;

proc sort
  data=sashelp.cars (keep=invoice)
  out=cars
;
by invoice;
run;

proc transpose
  data=pctls
  out=p_long
;
var _all_;
run;

data cntlin;
set p_long;
fmtname = "carspct";
type = "N";
start = ifn(_n_ = 1,0,lag(col1));
end = col1;
label = catx(" ","Percentile",_n_);
keep fmtname type start end label;
run;

proc format cntlin=cntlin;
run;

data carsfmt;
set cars;
format invoice carspct.;
run;
Ronein
Meteorite | Level 14

May you please explain why proc rand is not working well?

I expect that in each group there will be non-overlapped values.

proc sort data=sashelp.cars out=cars;by  invoice ;run;
proc rank data=cars groups=10 descending out=cars;
   var invoice;
   ranks rank;
run;

 

PaigeMiller
Diamond | Level 26

@Ronein wrote:

May you please explain why proc rand is not working well?

I expect that in each group there will be non-overlapped values.

proc sort data=sashelp.cars out=cars;by  invoice ;run;
proc rank data=cars groups=10 descending out=cars;
   var invoice;
   ranks rank;
run;

 


What do you mean by "not working well"? PROC RANK will not produce overlapping groupings, except in the case of ties in which case there can be (and should be) some overlap. Please show us what you are seeing that doesn't work.

--
Paige Miller
Ronein
Meteorite | Level 14

 

proc sort data=sashelp.cars out=cars;by  invoice ;run;
proc rank data=cars groups=4 descending out=cars;
   var invoice;
   ranks rank;
run;

 

How can it happen that invoice value in first group is between 35777  till 173560

and invoice value in second  group is between 25371  till 35688

I expect that minimum value in second group will be 173560 !!!

 

PaigeMiller
Diamond | Level 26

@Ronein wrote:

 

I expect that minimum value in second group will be 173560 !!!


I don't understand this, why do you expect that? I expect the MAXIMUM value in the FIRST group to be 173560, and that the MINIMUM value in the first group to be less than 173560; and that the maximum value in the other groups would be less than 173560, and also less than or equal to the minimum of the previous group, and that is what I see.

 

proc rank data=sashelp.cars groups=4 descending out=cars;
    var invoice;
    ranks rank;
run;
proc summary nway data=cars;
    class rank;
    var invoice;
    output out=stats min=min_invoice max=max_invoice;
run;

 

--
Paige Miller

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 2001 views
  • 1 like
  • 5 in conversation