Hello experts.
Please help me how can I run this formula;
Thanks for your attention.
Assuming Volum_Sum is per TRD_STCK_CD in a TRD_STCK_CD (i.e. - per date)
you shall achieve it by next code:
data sample_data;
infile datalines;
input TRD_EVENT_DT mmddyy10.
TRD_EVENT_TM time8.
TRD_STCK_CD $5.
TRD_PR 4.
TRD_TUROVR 5.
TRD_EVENT_ROUFOR timr5.
VOLUME 8.
;
datalines;
03/24/2008 12:28:01 ALBZ1 1537 10000 12:30 15370000
03/24/2008 13:13:44 ALBZ1 1567 10 13:00 15670
03/24/2008 12:20:38 AZAB1 683 10000 12:30 6830000
03/24/2008 13:13:44 AZAB1 695 10 13:00 6950
03/24/2008 09:14:57 BALI1 850 9260 09:00 7871000
03/24/2008 09:15:06 BALI1 850 2000 09:30 1700000
03/24/2008 09:15:14 BALI1 850 10000 09:30 8500000
03/24/2008 09:15:24 BALI1 850 6000 09:30 5100000
03/24/2008 09:29:27 BALI1 850 10000 09:30 8500000
03/24/2008 12:28:00 BALI1 850 10000 12:30 8500000
03/24/2008 12:28:07 BALI1 850 10000 12:30 8500000
03/24/2008 13:13:44 BALI1 865 10 13:00 8650
; run;
data want;
set sample_data;
by TRD_STCK_CD TRD_STCK_CD ;
retain vol_sum;
drop_vol_sum;
if first.TRD_STCK_CD then vol_sum=0;
vol_sum = sum(of vol_sum, volume);
if last.TRD_STCK_CD then do;
Volume_sum = vol_sum;
output;
end;
run;
Please, next time attach the sample dataset as shown in first step.
Can you show that with a data sample instead and expected output?
This is based on the assumption you understand the formula and are looking for a technical answer on how to implement it given data. Since it partly depends on your data structure how it's achieved you should show your data.
The proper way to post SAS datasets as data step and a macro for conversion is found here. To apply your formula to data, knowledge of the data is essential.
Can you describe your data?
I guess:
- VOLUME is the amount of some material in a given time
- The Volume changes along time and measured at every half an hour
Is your data given as an array of the measured Volume ?
or as a list of observations? The SAS code will differ in those two situations,
though it is possible to convert each to the other.
Please correct my assumptions and add a sample of your data.
Hello;
This is a sample of my data:
TRD_EVENT_DT | TRD_EVENT_TM | TRD_EVENT_MS | TRD_STCK_ID | TRD_STCK_CD | TRD_INSTR_NAME | TRD_STCK_GRP_ID_CD | TRD_PR | TRD_TUROVR | TRD_TRA_NR |
3/24/2008 | 12:19:35 | IRB5IKCO8751 | IKCQ1 | 187191011 | N2 | 1 | 100 | 122197901 | |
3/24/2008 | 12:28:01 | IRO1ALBZ0001 | ALBZ1 | والبر1 | N1 | 1537 | 10000 | 122198037 | |
3/24/2008 | 13:13:44 | IRO1ALBZ0001 | ALBZ1 | والبر1 | N1 | 1567 | 10 | 9999198173 | |
3/24/2008 | 12:20:38 | IRO1AZAB0001 | AZAB1 | فاذر1 | N2 | 683 | 10000 | 122197911 | |
3/24/2008 | 13:13:44 | IRO1AZAB0001 | AZAB1 | فاذر1 | N2 | 695 | 10 | 9999198167 | |
3/24/2008 | 09:14:57 | IRO1BALI0001 | BALI1 | وبوعلي1 | N2 | 850 | 9260 | 122197242 | |
3/24/2008 | 09:15:06 | IRO1BALI0001 | BALI1 | وبوعلي1 | N2 | 850 | 2000 | 122197245 | |
3/24/2008 | 09:15:14 | IRO1BALI0001 | BALI1 | وبوعلي1 | N2 | 850 | 10000 | 122197247 | |
3/24/2008 | 09:15:24 | IRO1BALI0001 | BALI1 | وبوعلي1 | N2 | 850 | 6000 | 122197249 | |
3/24/2008 | 09:29:27 | IRO1BALI0001 | BALI1 | وبوعلي1 | N2 | 850 | 10000 | 122197296 | |
3/24/2008 | 12:28:00 | IRO1BALI0001 | BALI1 | وبوعلي1 | N2 | 850 | 10000 | 122198036 | |
3/24/2008 | 12:28:07 | IRO1BALI0001 | BALI1 | وبوعلي1 | N2 | 850 | 10000 | 122198041 | |
3/24/2008 | 13:13:44 | IRO1BALI0001 | BALI1 | وبوعلي1 | N2 | 865 | 10 | 9999198133 |
Explanation of headers of some columns:
[TRD_EVENT_DT] Month / Day / Year
,[TRD_EVENT_TM] Time: hh:mm:ss
,[TRD_EVENT_MS] Millisecond
,[TRD_STCK_ID]
,[TRD_STCK_CD] Stock ID
,[TRD_INSTR_NAME]
,[TRD_STCK_GRP_ID_CD]
,[TRD_PR] Stock Price
,[TRD_TUROVR] Turnover
,[TRD_TRA_NR]
I know how can I round time; my code is below:
*Step1: Rounding Time; DATA Sampledata87_New; SET Sampledata87; TRD_EVENT_TIME = INPUT(TRD_EVENT_TM,time16.); TRD_EVENT_ROUNDED = ROUND(TRD_EVENT_TIME,'00:30't); TRD_EVENT_ROUFOR = PUT(TRD_EVENT_ROUNDED,hhmm.);
And volume is:
*Step2: Calculation Of Volume; VOLUME = TRD_PR*TRD_TUROVR;
Now, I have this data;
SAS Output
03/24/2008 | 12:19:35 | IRB5IKCO8751 | IKCQ1 | 187191011 | N2 | 1 | 100 | 122197901 | 44375 | 45000 | 12:30 | 100 | |
03/24/2008 | 12:28:01 | IRO1ALBZ0001 | ALBZ1 | H'D(11 | N1 | 1537 | 10000 | 122198037 | 44881 | 45000 | 12:30 | 15370000 | |
03/24/2008 | 13:13:44 | IRO1ALBZ0001 | ALBZ1 | H'D(11 | N1 | 1567 | 10 | 9999198173 | 47624 | 46800 | 13:00 | 15670 | |
03/24/2008 | 12:20:38 | IRO1AZAB0001 | AZAB1 | A'011 | N2 | 683 | 10000 | 122197911 | 44438 | 45000 | 12:30 | 6830000 | |
03/24/2008 | 13:13:44 | IRO1AZAB0001 | AZAB1 | A'011 | N2 | 695 | 10 | 9999198167 | 47624 | 46800 | 13:00 | 6950 | |
03/24/2008 | 09:14:57 | IRO1BALI0001 | BALI1 | H(H9DJ1 | N2 | 850 | 9260 | 122197242 | 33297 | 32400 | 9:00 | 7871000 | |
03/24/2008 | 09:15:06 | IRO1BALI0001 | BALI1 | H(H9DJ1 | N2 | 850 | 2000 | 122197245 | 33306 | 34200 | 9:30 | 1700000 | |
03/24/2008 | 09:15:14 | IRO1BALI0001 | BALI1 | H(H9DJ1 | N2 | 850 | 10000 | 122197247 | 33314 | 34200 | 9:30 | 8500000 | |
03/24/2008 | 09:15:24 | IRO1BALI0001 | BALI1 | H(H9DJ1 | N2 | 850 | 6000 | 122197249 | 33324 | 34200 | 9:30 | 5100000 | |
03/24/2008 | 09:29:27 | IRO1BALI0001 | BALI1 | H(H9DJ1 | N2 | 850 | 10000 | 122197296 | 34167 | 34200 | 9:30 | 8500000 | |
03/24/2008 | 12:28:00 | IRO1BALI0001 | BALI1 | H(H9DJ1 | N2 | 850 | 10000 | 122198036 | 44880 | 45000 | 12:30 | 8500000 | |
03/24/2008 | 12:28:07 | IRO1BALI0001 | BALI1 | H(H9DJ1 | N2 | 850 | 10000 | 122198041 | 44887 | 45000 | 12:30 | 8500000 | |
03/24/2008 | 13:13:44 | IRO1BALI0001 | BALI1 | H(H9DJ1 | N2 | 865 | 10 | 9999198133 | 47624 | 46800 | 13:00 | 8650 |
But for volume for one day and one stock, I need sum all volume data of one day. It must Be distinguished by TRD_STCK_CD var. .
I hate to repeat myself, but:
A MACRO TO CONVERT DATASETS TO DATASTEPS FOR POSTING CAN BE FOUND HERE:
Please help the ones that are supposed to help you.
You gave more variable names then data displayed.
Anyhow, it seems to me that only next variables are used in your formula:
- TRD_EVENT_DT - The date, as you want sum volume per day
- TRD_STCK_ID or TRD_STCK_CD - One of then is enough as they relate 1:1
- TRD_PR - The price
- TRD_TUROVR - The turnover
Am I right, your code should look like:
data temp;
set have;
by TRD_EVENT_DT TRD_STCK_CD ;
retain volume;
if first.TRD_STCK_CD then volume=0;
volume = sum(of volume, TRD_PR * TRD_TUROVR);
if last.TRD_STCK_CD then output;
run;
Then you may need second step to compute the division.
I don't see the whole picture. Probably there is a procedur to it in one step.
Thank you;
But There is still a problem: I have shown it below:
432 data sampledata_new; 433 set sampledata87_New; 434 by TRD_EVENT_DT TRD_STCK_CD; 435 retain volume; 436 if first.TRD_STCK_CD then volume=0; 437 volume = sum(of volume, TRD_PR * TRD_TUROVR); 438 if last.TRD_STCK_CD then output; 439 run; NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column). 437:37 437:47 ERROR: BY variables are not properly sorted on data set WORK.SAMPLEDATA87_NEW. TRD_EVENT_DT=03/24/2008 TRD_EVENT_TM=12:19:35 TRD_EVENT_MS= TRD_STCK_ID=IRB5IKCO8751 TRD_STCK_CD=IKCQ1 TRD_INSTR_NAME=187191011 TRD_STCK_GRP_ID_CD=N2 TRD_PR=1 TRD_TUROVR=100 TRD_TRA_NR=122197901 TRD_EVENT_TIME=44375 TRD_EVENT_ROUNDED=45000 TRD_EVENT_ROUFOR=12:30 VOLUME=100 FIRST.TRD_EVENT_DT=1 LAST.TRD_EVENT_DT=1 FIRST.TRD_STCK_CD=1 LAST.TRD_STCK_CD=1 _ERROR_=1 _N_=1 NOTE: The SAS System stopped processing this step because of errors. NOTE: There were 2 observations read from the data set WORK.SAMPLEDATA87_NEW. WARNING: The data set WORK.SAMPLEDATA_NEW may be incomplete. When this step was stopped there were 0 observations and 14 variables. WARNING: Data set WORK.SAMPLEDATA_NEW was not replaced because this step was stopped. NOTE: DATA statement used (Total process time): real time 0.04 seconds cpu time 0.03 seconds
Thanks for your attention.
All you need is to preceed the datastep with sorting:
proc sort data=sampledata87_New ; by TRD_EVENT_DT TRD_STCK_CD; run;
Hello everybody;
there is a sample of my data:
TRD_EVENT_DT | TRD_EVENT_TM | TRD_STCK_CD | TRD_PR | TRD_TUROVR | TRD_EVENT_ROUFOR | VOLUME | |||||||
3/24/2008 | 12:28:01 | ALBZ1 | 1537 | 10000 | 12:30 | 15370000 | |||||||
3/24/2008 | 13:13:44 | ALBZ1 | 1567 | 10 | 13:00 | 15670 | |||||||
3/24/2008 | 12:20:38 | AZAB1 | 683 | 10000 | 12:30 | 6830000 | |||||||
3/24/2008 | 13:13:44 | AZAB1 | 695 | 10 | 13:00 | 6950 | |||||||
3/24/2008 | 09:14:57 | BALI1 | 850 | 9260 | 9:00 | 7871000 | |||||||
3/24/2008 | 09:15:06 | BALI1 | 850 | 2000 | 9:30 | 1700000 | |||||||
3/24/2008 | 09:15:14 | BALI1 | 850 | 10000 | 9:30 | 8500000 | |||||||
3/24/2008 | 09:15:24 | BALI1 | 850 | 6000 | 9:30 | 5100000 | |||||||
3/24/2008 | 09:29:27 | BALI1 | 850 | 10000 | 9:30 | 8500000 | |||||||
3/24/2008 | 12:28:00 | BALI1 | 850 | 10000 | 12:30 | 8500000 | |||||||
3/24/2008 | 12:28:07 | BALI1 | 850 | 10000 | 12:30 | 8500000 | |||||||
3/24/2008 | 13:13:44 | BALI1 | 865 | 10 | 13:00 | 8650 |
I have deleted some col. for simplicity. In next step, I want a table such as below:
TRD_EVENT_DT | TRD_EVENT_TM | TRD_STCK_CD | TRD_PR | TRD_TUROVR | TRD_EVENT_ROUFOR | VOLUME | Volume_Sum | |||||||
3/24/2008 | 12:28:01 | ALBZ1 | 1537 | 10000 | 12:30 | 15370000 | ||||||||
3/24/2008 | 13:13:44 | ALBZ1 | 1567 | 10 | 13:00 | 15670 | 15385670 | |||||||
3/24/2008 | 12:20:38 | AZAB1 | 683 | 10000 | 12:30 | 6830000 | ||||||||
3/24/2008 | 13:13:44 | AZAB1 | 695 | 10 | 13:00 | 6950 | 6836950 | |||||||
3/24/2008 | 09:14:57 | BALI1 | 850 | 9260 | 9:00 | 7871000 | ||||||||
3/24/2008 | 09:15:06 | BALI1 | 850 | 2000 | 9:30 | 1700000 | ||||||||
3/24/2008 | 09:15:14 | BALI1 | 850 | 10000 | 9:30 | 8500000 | ||||||||
3/24/2008 | 09:15:24 | BALI1 | 850 | 6000 | 9:30 | 5100000 | ||||||||
3/24/2008 | 09:29:27 | BALI1 | 850 | 10000 | 9:30 | 8500000 | ||||||||
3/24/2008 | 12:28:00 | BALI1 | 850 | 10000 | 12:30 | 8500000 | ||||||||
3/24/2008 | 12:28:07 | BALI1 | 850 | 10000 | 12:30 | 8500000 | ||||||||
3/24/2008 | 13:13:44 | BALI1 | 865 | 10 | 13:00 | 8650 | 48679650 |
Please pay attention to last col. It has been generated by summing volumes that have same TRD_STCK_CD var. .So each TRD_STCK_CD obs. has just one Volume_Sum data.
Thank you.
Thank you Shmuel; But this code gives the last volume number ,and then It is multiplied by 2.
Thanks for your attention.
Assuming Volum_Sum is per TRD_STCK_CD in a TRD_STCK_CD (i.e. - per date)
you shall achieve it by next code:
data sample_data;
infile datalines;
input TRD_EVENT_DT mmddyy10.
TRD_EVENT_TM time8.
TRD_STCK_CD $5.
TRD_PR 4.
TRD_TUROVR 5.
TRD_EVENT_ROUFOR timr5.
VOLUME 8.
;
datalines;
03/24/2008 12:28:01 ALBZ1 1537 10000 12:30 15370000
03/24/2008 13:13:44 ALBZ1 1567 10 13:00 15670
03/24/2008 12:20:38 AZAB1 683 10000 12:30 6830000
03/24/2008 13:13:44 AZAB1 695 10 13:00 6950
03/24/2008 09:14:57 BALI1 850 9260 09:00 7871000
03/24/2008 09:15:06 BALI1 850 2000 09:30 1700000
03/24/2008 09:15:14 BALI1 850 10000 09:30 8500000
03/24/2008 09:15:24 BALI1 850 6000 09:30 5100000
03/24/2008 09:29:27 BALI1 850 10000 09:30 8500000
03/24/2008 12:28:00 BALI1 850 10000 12:30 8500000
03/24/2008 12:28:07 BALI1 850 10000 12:30 8500000
03/24/2008 13:13:44 BALI1 865 10 13:00 8650
; run;
data want;
set sample_data;
by TRD_STCK_CD TRD_STCK_CD ;
retain vol_sum;
drop_vol_sum;
if first.TRD_STCK_CD then vol_sum=0;
vol_sum = sum(of vol_sum, volume);
if last.TRD_STCK_CD then do;
Volume_sum = vol_sum;
output;
end;
run;
Please, next time attach the sample dataset as shown in first step.
Here is another solution:
http://stackoverflow.com/a/44034569/7996949
/*Sort by TRD_STCK_CD and temporal variables.*/ proc sort data=have out=have_sorted; by TRD_STCK_CD TRD_EVENT_DT TRD_EVENT_TM; run; /*Sum VOLUME until the last of each TRD_STCK_CD is reached.*/ data want; set have_sorted; by TRD_STCK_CD TRD_EVENT_DT TRD_EVENT_TM; retain tmp_volume_sum; tmp_volume_sum + VOLUME; if last.TRD_STCK_CD then do; Volume_Sum = tmp_volume_sum; call missing(tmp_volume_sum); end; drop tmp_:; run;
Thank you.
Here another solution!
https://stackoverflow.com/a/44031791/7996949
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.