I have a sample like this. In the real situation, it will contain more observations per day.
data have;input
date id volume; datalines;
20100101 01 11
20100101 02 -5
20100101 03 10
20100101 04 -8
20100101 05 14
20100101 06 1
20100102 01 -17
20100102 02 20
20100102 03 -10
20100102 04 25
20100102 05 -39
20100102 06 20
run;
Based on the absolute value of volume, I want to allocate observations into groups with a constant amount of absolute volume. In particular, this constant amount of absolute volume is the one-third of daily absolute volume.
I apply the following code to estimate the one-third of daily absolute volume.
data have;
set have;
abs_volume=abs(volume);
run;
proc means data=have noprint;
by date;
var abs_volume;
output out=total sum=daily_volume;
run;
data total;
set total;
criteria_value=floor(daily_volume/3);
run;
This is tricky part. If the volume is larger than the group size, the extra volume will put into the next group until all of the volume has been categorised.
The expected result will be following.
date id volume group
20100101 01 11 1
20100101 02 -5 1
20100101 03 10 2
20100101 04 -6 2 *This one was -8, but now it splits into -6 and -2
20100101 04 -2 3
20100101 05 14 3
20100101 06 1 4
20100102 01 -17 1
20100102 02 20 1
20100102 03 -6 1 *This one was -10, but now it splits into -6 and -4
20100102 03 -4 2
20100102 04 25 2
20100102 05 -14 2 *This one was -39, but now it splits into -14 and -25
20100102 05 -25 3
20100102 06 18 3 *This one was 20, but now it splits into 18 and 2
20100102 06 2 4
run;
Can this be achieved by SAS? Many thanks!
I update a sample as an example. The time variable in the uploaded sample can be viewed the ID in above.
... View more