Can this be achieved by SAS? Many thanks!
I have a sample like this. In the real situation, it will contain more observations per day.
data have;input
date id volume; datalines;
20100101 01 11
20100101 02 -5
20100101 03 10
20100101 04 -8
20100101 05 14
20100101 06 1
20100102 01 -17
20100102 02 20
20100102 03 -10
20100102 04 25
20100102 05 -39
20100102 06 20
run;
Based on the absolute value of volume, I want to allocate observations into groups with a constant amount of absolute volume. In particular, this constant amount of absolute volume is the one-third of daily absolute volume.
I apply the following code to estimate the one-third of daily absolute volume.
data have;
set have;
abs_volume=abs(volume);
run;
proc means data=have noprint;
by date;
var abs_volume;
output out=total sum=daily_volume;
run;
data total;
set total;
criteria_value=floor(daily_volume/3);
run;
This is tricky part. If the volume is larger than the group size, the extra volume will put into the next group until all of the volume has been categorised.
The expected result will be following.
date id volume group
20100101 01 11 1
20100101 02 -5 1
20100101 03 10 2
20100101 04 -6 2 *This one was -8, but now it splits into -6 and -2
20100101 04 -2 3
20100101 05 14 3
20100101 06 1 4
20100102 01 -17 1
20100102 02 20 1
20100102 03 -6 1 *This one was -10, but now it splits into -6 and -4
20100102 03 -4 2
20100102 04 25 2
20100102 05 -14 2 *This one was -39, but now it splits into -14 and -25
20100102 05 -25 3
20100102 06 18 3 *This one was 20, but now it splits into 18 and 2
20100102 06 2 4
run;
Can this be achieved by SAS? Many thanks!
Have you looked at proc rank?
Hi, this this not just ranked by value. It should also maintain the order of ID.
I see. Sorry I read too quickly. Why is -17 in group 1? [Got it, group by increasing ID] Can you rewrite your comment to use the sample provided instead of 49 observations?
What if you only have 2 records for a given date?
What if you have 3 records with volumes, 1, 1, and 100?
Not sure that I understand what is meant by "with a constant amount of absolute volume. "
"constant amount" is not a standard statistic. So some rules, not one or two examples, but actual rules how to derive the "constant amount" should be provided.
Like this?
data SIZE(keep=DATE SIZE );
set HAVE ;
by DATE;
if first.DATE then SIZE=0;
SIZE+abs(VOLUME);
if last.DATE then output ;
run;
data WANT;
set HAVE;
by DATE;
if first.DATE then do;
set SIZE;
TALLY=0;
GRP=1;
end;
TALLY+abs(VOLUME);
LMT=int(SIZE*GRP/3) ;
if TALLY>LMT then do;
TMP=VOLUME;
VOLUME=sign(VOLUME)*(abs(VOLUME)-TALLY+LMT);
if VOLUME then output;
GRP+1;
VOLUME=TMP-VOLUME;
output;
end;
if TALLY=LMT then do;
output;
GRP+1;
end;
if TALLY<LMT then output;
run;
DATE | ID | VOLUME | GRP |
---|---|---|---|
20100101 | 1 | 11 | 1 |
20100101 | 2 | -5 | 1 |
20100101 | 3 | 10 | 2 |
20100101 | 4 | -6 | 2 |
20100101 | 4 | -2 | 3 |
20100101 | 5 | 14 | 3 |
20100101 | 6 | 1 | 3 |
20100102 | 1 | -17 | 1 |
20100102 | 2 | 20 | 1 |
20100102 | 3 | -6 | 1 |
20100102 | 3 | -4 | 2 |
20100102 | 4 | 25 | 2 |
20100102 | 5 | -15 | 2 |
20100102 | 5 | -24 | 3 |
20100102 | 6 | 20 | 3 |
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.