If you don't have ties and aren't generating quantiles for many variables (so you aren't spending a lot of time and memory on sorting), you could sort your data and pick out the records where the quantiles occur simply by taking the observations at the first, 25%, 50%, 75%, and last positions in the sorted order. For example, if N=30,000,000 and there are no ties, you can use the following subsetting IF in a data step that reads the sorted data to get the quantiles:

    if _N_=1 or _N_=30000000 or _N_=ceil(.25*30000000) or _N_=ceil(.50*30000000) or _N_=ceil(.75*30000000);

If you have ties, the issue is much more complicated, and I'd go with the HPBIN approach or one of the other recommendations.

Also, creating quantiles by levels of other variables complicates things, because you first need to determine the number of observations in each class, and that adds time to the process. You could do that with some BY processing, which shouldn't be too time intensive: a sort followed by two data steps. The first data step uses BY processing to figure out the number of observations in each class, and the second data step selects the observations. I guess some bookkeeping would be needed to know the _N_ values that delineate the beginning and end of each set of class records.
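A minimal sketch of that sort-plus-two-data-steps idea, assuming no ties within a class. The names HAVE, GRP, and X are hypothetical placeholders for your dataset, class variable, and analysis variable. Instead of tracking the absolute _N_ boundaries of each class, this version keeps a within-class position counter, which is a swapped-in simplification of the bookkeeping and not the only way to do it:

```sas
/* Hypothetical names: dataset HAVE, class variable GRP, analysis variable X */
proc sort data=have out=sorted;
  by grp x;
run;

/* Pass 1: count the observations in each class */
data counts(keep=grp n_grp);
  set sorted;
  by grp;
  if first.grp then n_grp = 0;
  n_grp + 1;                    /* sum statement: n_grp is retained across rows */
  if last.grp then output;      /* one row per class with its final count */
run;

/* Pass 2: keep the records at the min, quartile, and max positions per class */
data quantiles;
  merge sorted counts;          /* n_grp carries across every row of its class */
  by grp;
  if first.grp then pos = 0;
  pos + 1;                      /* position of this record within its class */
  if pos = 1
     or pos = ceil(.25 * n_grp)
     or pos = ceil(.50 * n_grp)
     or pos = ceil(.75 * n_grp)
     or pos = n_grp;
run;
```

For very small classes, several of these positions can coincide, in which case the record is written only once.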