Hello
I want to divide population of variable invoice in data set sashelp.cars into 4 groups by variable invoice.
In each group there will be equal number of rows .
So, the criteria is using percentile .
Then I need to put the values (range of values) in proc format.
So, in proc format there will be defined 5 groups
How should we do it pls?
Here's a try with UNIVARIATE:
proc univariate data=sashelp.cars;
var invoice;
output
out=Pctls
pctlpts = 10 20 30 40 50 60 70 80 90 100
pctlpre = invoice
pctlname = pct10 pct20 pct30 pct40 pct50 pct60 pct70 pct80 pct90 pct100
;
run;
proc sort
data=sashelp.cars (keep=invoice)
out=cars
;
by invoice;
run;
proc transpose
data=pctls
out=p_long
;
var _all_;
run;
data cntlin;
set p_long;
fmtname = "carspct";
type = "N";
start = ifn(_n_ = 1,0,lag(col1));
end = col1;
label = catx(" ","Percentile",_n_);
keep fmtname type start end label;
run;
proc format cntlin=cntlin;
run;
data carsfmt;
set cars;
format invoice carspct.;
run;
Percentile of what?
Here is something to get you started
proc rank data=sashelp.cars groups=10 descending out=cars;
var MPG_City;
ranks rank;
run;
Hello,
I don't understand why the ranges are overlapped???
rank 0 is valued from 35777 till 173560
rank 1 is valued from 25371 till 35688
rank 2 is valued from 18881 till 25218
rank 3 is valued from 9875 till 18821
proc sort data=sashelp.cars out=cars;by invoice ;run;
proc rank data=cars groups=4 descending out=cars;
var invoice;
ranks rank;
run;
PROC SQL;
create table tbl as
select 0 as rank,
min(invoice) as rank_min,
max(invoice) as rank_max
from cars
where rank=0
union all
select 1 as rank,
min(invoice) as rank_min,
max(invoice) as rank_max
from cars
where rank=1
union all
select 2 as rank,
min(invoice) as rank_min,
max(invoice) as rank_max
from cars
where rank=2
union all
select 3 as rank,
min(invoice) as rank_min,
max(invoice) as rank_max
from cars
where rank=3
;
QUIT;
@Ronein wrote:
Hello,
I don't understand why the ranges are overlapped???
rank 0 is valued from 35777 till 173560
rank 1 is valued from 25371 till 35688
rank 2 is valued from 18881 till 25218
rank 3 is valued from 9875 till 18821
What overlap? There is no overlap there. 35777 is larger than 35688, 25371 is larger than 25218, 18881 is larger than 18821.
Here's a try with UNIVARIATE:
proc univariate data=sashelp.cars;
var invoice;
output
out=Pctls
pctlpts = 10 20 30 40 50 60 70 80 90 100
pctlpre = invoice
pctlname = pct10 pct20 pct30 pct40 pct50 pct60 pct70 pct80 pct90 pct100
;
run;
proc sort
data=sashelp.cars (keep=invoice)
out=cars
;
by invoice;
run;
proc transpose
data=pctls
out=p_long
;
var _all_;
run;
data cntlin;
set p_long;
fmtname = "carspct";
type = "N";
start = ifn(_n_ = 1,0,lag(col1));
end = col1;
label = catx(" ","Percentile",_n_);
keep fmtname type start end label;
run;
proc format cntlin=cntlin;
run;
data carsfmt;
set cars;
format invoice carspct.;
run;
May you please explain why proc rand is not working well?
I expect that in each group there will be non-overlapped values.
proc sort data=sashelp.cars out=cars;by invoice ;run;
proc rank data=cars groups=10 descending out=cars;
var invoice;
ranks rank;
run;
@Ronein wrote:
May you please explain why proc rand is not working well?
I expect that in each group there will be non-overlapped values.
proc sort data=sashelp.cars out=cars;by invoice ;run; proc rank data=cars groups=10 descending out=cars; var invoice; ranks rank; run;
What do you mean by "not working well"? PROC RANK will not produce overlapping groupings, except in the case of ties in which case there can be (and should be) some overlap. Please show us what you are seeing that doesn't work.
proc sort data=sashelp.cars out=cars;by invoice ;run;
proc rank data=cars groups=4 descending out=cars;
var invoice;
ranks rank;
run;
How can it happen that invoice value in first group is between 35777 till 173560
and invoice value in second group is between 25371 till 35688
I expect that minimum value in second group will be 173560 !!!
@Ronein wrote:
I expect that minimum value in second group will be 173560 !!!
I don't understand this, why do you expect that? I expect the MAXIMUM value in the FIRST group to be 173560, and that the MINIMUM value in the first group to be less than 173560; and that the maximum value in the other groups would be less than 173560, and also less than or equal to the minimum of the previous group, and that is what I see.
proc rank data=sashelp.cars groups=4 descending out=cars;
var invoice;
ranks rank;
run;
proc summary nway data=cars;
class rank;
var invoice;
output out=stats min=min_invoice max=max_invoice;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.