I have a dataset with a variable X whose values theoretically range between 0 and 1. In practice, not every value is observed in the dataset, and the theoretical min and max may not occur either.
I want to create N groups of equal interval length and assign each observation to one of those groups in a variable Y. Some of these groups may be empty if no observations fall within the corresponding interval.
For example, if N=100 then the groups would look like:
0-0.01
0.01-0.02
...
0.99-1
PROC RANK seems to work only with percentiles or with the values observed in the data. Another option would be PROC FORMAT; however, it would be very inefficient to write out the boundaries of each of the N groups explicitly.
Hello @KonstantinVasil,
Do you mean that Y should contain the group number (e.g. 1, 2, ..., 100 if N=100)?
Then simply use something like this:
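%let N=100;   /* number of equal-width groups */
data want;
  set have;
  /* x>.z is false for all missing values (., ._, .a-.z), so y stays missing for them */
  if x>.z then y=ceil(&N*x);
run;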
This would assign all values x in (0, 0.01] to y=1, all values x in (0.01, 0.02] to y=2, ..., and all values x in (0.99, 1] to y=100. Only the special case x=0 would be assigned to y=0. If you want to assign 0 to the first interval (y=1) instead, you can change the formula to y=(x=0)+ceil(&N*x), since the comparison (x=0) evaluates to 1 when true and 0 otherwise.
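To check the boundary behavior, here is a minimal sketch using the variant that puts x=0 into the first group. The dataset HAVE and its test values are made up for illustration only; they are not from your data.

```sas
%let N=100;

/* Hypothetical test data covering the boundary cases */
data have;
  input x;
  datalines;
0
0.005
0.01
0.015
0.99
1
.
;
run;

data want;
  set have;
  /* (x=0) adds 1 only when x is exactly 0, moving it from y=0 to y=1 */
  if x>.z then y=(x=0)+ceil(&N*x);
run;

proc print data=want;
run;
```

With these values, y should come out as 1, 1, 1, 2, 99, 100 and missing, respectively. Note that CEIL returns the integer itself when its argument is within 1E-12 of an integer, which helps when &N*x is not represented exactly at an interval boundary.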