Dear all,
I am trying to create customized bins for my unevenly distributed data, where most values are piled up on the left side. I'd like to create more bins on the left side and fewer on the right side to visualize this distribution.
I have tried PROC HPBIN, but it doesn't allow uneven bins. I also tried the BIN function, which doesn't work (I'm not sure why), and numerous other approaches (PROC UNIVARIATE, PROC FORMAT, etc.), but nothing did what I want. Below is a hypothetical example of the bin cutoffs that I would like. Thank you very much.
Cutoff values |
1500 |
1000 |
500 |
250 |
100 |
50 |
30 |
15 |
5 |
3 |
2 |
1 |
So your bin ranges here are 1500-1000, 1000-500, and so on?
Can you show us your PROC FORMAT code? This sounds to me like a job for PROC FORMAT.
Ranges in PROC FORMAT should look roughly like this:
1 <- 5 = "1"
5 <- 10 = "2"
10 <- 20 = "3"
20 <- 30 = "4"
30 <- 50 = "5"
50 <- 100 = "6"
100 <- 200 = "7"
200 <- 500 = "8"
500 <- 1000 = "9"
1000 <- 1500 = "10"
Ideally, I would like to sort the values of variable x into uneven bins, using cutoff points of my choice, and then assign those bin levels to another variable, 'id'. An example is below:
id | x | bins of x |
….. | ….. | 1 |
….. | ….. | 1 |
….. | ….. | 2 |
….. | ….. | 2 |
….. | ….. | 2 |
….. | ….. | 3 |
….. | ….. | 3 |
….. | ….. | 3 |
….. | ….. | 3 |
….. | ….. | 4 |
….. | ….. | 4 |
….. | ….. | 4 |
….. | ….. | 4 |
….. | ….. | 4 |
Thanks,
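How about this?
/* Format that maps each value of x to a bin label */
proc format;
   value bin
      1    -  5    = "1"    /* closed range, so x = 1 is not left unmatched */
      5   <-  10   = "2"    /* <- excludes the lower bound: 5 < x <= 10 */
      10  <-  20   = "3"
      20  <-  30   = "4"
      30  <-  50   = "5"
      50  <-  100  = "6"
      100 <-  200  = "7"
      200 <-  500  = "8"
      500 <-  1000 = "9"
      1000 <- 1500 = "10"
   ;
run;
/* Example data: 100 random integers between 1 and 1500 */
data have(drop=i);
   do i=1 to 100;
      x=ceil(rand('uniform')*1500);
      output;
   end;
run;
/* id holds the bin label as a character value */
data want;
   set have;
   id=put(x, bin.);
run;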
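If the cutoffs live in a data set rather than being typed into PROC FORMAT by hand, the format could also be generated from them. The following is only a sketch under assumptions: the cutoff list is the hypothetical one from the question, and the names cutoffs, ctrl, cutbin, sample and binned are made up for illustration. It relies on the CNTLIN= option of PROC FORMAT.

```sas
/* Hypothetical cutoff values, copied from the question */
data cutoffs;
   input cutoff @@;
   datalines;
1 2 3 5 15 30 50 100 250 500 1000 1500
;
run;

/* Control data set: one row per bin, from the previous cutoff to the current one */
data ctrl;
   set cutoffs;
   length fmtname $8 type sexcl eexcl $1 label $2;
   retain fmtname 'cutbin' type 'N' eexcl 'N';
   start = lag(cutoff);              /* previous cutoff = lower bound */
   end   = cutoff;                   /* current cutoff  = upper bound */
   if _n_ = 1 then delete;           /* the first cutoff has no lower bound */
   sexcl = ifc(_n_ = 2, 'N', 'Y');   /* only the first bin includes its lower bound */
   label = put(_n_ - 1, z2.);        /* zero-padded labels: 01, 02, ... */
   drop cutoff;
run;

/* Create the numeric format cutbin. from the control data set */
proc format cntlin=ctrl;
run;

/* Made-up sample values, for illustration only */
data sample;
   input x @@;
   datalines;
1 2 4 17 99 250 251 1500
;
run;

data binned;
   set sample;
   id = put(x, cutbin.);
run;
```

With these assumed cutoffs, x = 250 should land in bin 08 and x = 251 in bin 09. Values outside 1 to 1500 match no range and are written with the format's default width, so they may need separate handling.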
To maintain sort order with a format, I would suggest using two-character values (for example "01", "02", ..., "10") so that the order doesn't become 1, 10, 2 for most purposes.
Or use
id = input(put(x,bin.), 2.);
to create a numeric value.
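Both suggestions can be sketched together. This is a minimal illustration rather than anyone's exact code: the format name bin2c, the data set names and the sample values are assumptions, and the ranges are the ones from the reply above with zero-padded labels.

```sas
/* Same ranges as above, but with two-character labels */
proc format;
   value bin2c
      1    -  5    = "01"
      5   <-  10   = "02"
      10  <-  20   = "03"
      20  <-  30   = "04"
      30  <-  50   = "05"
      50  <-  100  = "06"
      100 <-  200  = "07"
      200 <-  500  = "08"
      500 <-  1000 = "09"
      1000 <- 1500 = "10"
   ;
run;

/* Made-up sample values, for illustration only */
data sample;
   input x @@;
   datalines;
3 7 1200 15 700
;
run;

data want_sorted;
   set sample;
   id_char = put(x, bin2c.);              /* character label; text order matches bin order */
   id_num  = input(put(x, bin2c.), 2.);   /* numeric bin number */
run;

/* Sorting by the padded character label keeps bin 10 after bin 09 */
proc sort data=want_sorted;
   by id_char;
run;
```

Either variable then sorts in bin order; the numeric version is the more convenient one if the bin number is needed in calculations.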
Thank you @ballardw
Thank you very much @PeterClemmensen for your quick help.
In fact, I did not even need the second step; there was a mistake in my PUT code after assigning the bins. The documentation was not clear to me.
Anytime 🙂