This is somewhat complex (well to me at least).
Here is what I have to do: Say that I have the following dataset:
date price volume
02-Sep 40 100
03-Sep 45 200
04-Sep 46 150
05-Sep 43 300
Say that I have a breakpoint where I wish to create an interval in my dataset. For instance, let my breakpoint = 200 volume transaction.
What I want is to create an ID column and record an ID variable =1,2,3,... for every breakpoint = 200. When you sum all the volume per ID, the value must be constant across all ID variables.
So using my example above, my final dataset should look like the following:
date price volume id
02-Sep 40 100 1
03-Sep 45 100 1
03-Sep 45 100 2
04-Sep 46 100 2
04-Sep 46 50 3
05-Sep 43 150
305-Sep 43 150 4 (last row can miss some value but that is fine. I will kick out the last id)
As you can see, I had to "decompose" some rows (like the second row for instance, I break the 200 into two 100 volume) in order to have constant value of the sum, 200, of volume across all ID.
Can anyone help me out?
Hoooo, That is really not easy.
I only consider the scenario that the first obs's volume less than 200 .
data have; input date $ price volume ; datalines; 02-Sep 40 100 03-Sep 45 200 04-Sep 46 150 05-Sep 43 300 ; run; data want(keep=date price volume ); set have(obs=1); start=1; if volume lt 200 then do; output; do until(i=nobs); remain=200-volume; do i=start+1 to nobs; set have nobs=nobs point=i; if remain le volume then do; _volume=volume; do until(remain lt 200); volume=ifn((remain lt 200),remain,200); output; remain=_volume-volume; _volume=remain; end; volume=remain;output; start=i; leave; end; else do;output;remain=remain-volume; end; end; end; end; run; data want(drop=sum yes); set want; retain yes 0 id 1; sum+volume; if yes then do;id+1;yes=0;end; if sum ge 200 then do;yes=1;sum=0;end; run;
Ksharp
Message was edited by: xia keshan
Hoooo, That is really not easy.
I only consider the scenario that the first obs's volume less than 200 .
data have; input date $ price volume ; datalines; 02-Sep 40 100 03-Sep 45 200 04-Sep 46 150 05-Sep 43 300 ; run; data want(keep=date price volume ); set have(obs=1); start=1; if volume lt 200 then do; output; do until(i=nobs); remain=200-volume; do i=start+1 to nobs; set have nobs=nobs point=i; if remain le volume then do; _volume=volume; do until(remain lt 200); volume=ifn((remain lt 200),remain,200); output; remain=_volume-volume; _volume=remain; end; volume=remain;output; start=i; leave; end; else do;output;remain=remain-volume; end; end; end; end; run; data want(drop=sum yes); set want; retain yes 0 id 1; sum+volume; if yes then do;id+1;yes=0;end; if sum ge 200 then do;yes=1;sum=0;end; run;
Ksharp
Message was edited by: xia keshan
The problem you face is the complicated logic. you need to consider every situation you will take on .
I don't know how to explain it. you need to digest it on your own which is the only way you can enhance your skill .
The more you learned from others , The more power you will have.
"Also what happens if my first observation is greater than 200?"
That will render more code and more comlicated logic. But that is easy ,once you can understand what I code.
Good Luck.
Ksharp
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.