BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
aminkarimid
Lapis Lazuli | Level 10

Hello experts.

Please help me how can I run this formula;

NormalizedVolume.PNG

Thanks for your attention.

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

Assuming Volum_Sum is per TRD_STCK_CD in a TRD_STCK_CD  (i.e. - per date)

you shall achieve it by next code:

data sample_data;
 infile datalines;
 input TRD_EVENT_DT mmddyy10.
       TRD_EVENT_TM time8. 
	   TRD_STCK_CD  $5. 
	   TRD_PR       4.
       TRD_TUROVR   5. 
	   TRD_EVENT_ROUFOR timr5. 
	   VOLUME       8.
   ;
	   
datalines;
03/24/2008 12:28:01 ALBZ1 1537 10000 12:30 15370000
03/24/2008 13:13:44 ALBZ1 1567 10	 13:00 15670
03/24/2008 12:20:38 AZAB1 683  10000 12:30 6830000
03/24/2008 13:13:44 AZAB1 695  10	 13:00 6950
03/24/2008 09:14:57 BALI1 850  9260	 09:00 7871000
03/24/2008 09:15:06 BALI1 850  2000	 09:30 1700000
03/24/2008 09:15:14 BALI1 850  10000 09:30 8500000
03/24/2008 09:15:24 BALI1 850  6000	 09:30 5100000
03/24/2008 09:29:27 BALI1 850  10000 09:30 8500000
03/24/2008 12:28:00 BALI1 850  10000 12:30 8500000
03/24/2008 12:28:07 BALI1 850  10000 12:30 8500000
03/24/2008 13:13:44 BALI1 865  10	 13:00 8650
; run;
 
data want;
  set sample_data;
        by TRD_STCK_CD  TRD_STCK_CD  ;
            retain vol_sum;
			drop_vol_sum;
            if first.TRD_STCK_CD    then vol_sum=0;
            vol_sum = sum(of vol_sum, volume);
            if last.TRD_STCK_CD then do;
			   Volume_sum = vol_sum;
			   output;
			end;
run;

Please, next time attach the sample dataset as shown in first step.

View solution in original post

14 REPLIES 14
Reeza
Super User

Can you show that with a data sample instead and expected output? 

This is based on the assumption you understand the formula and are looking for a technical answer on how to implement it given data. Since it partly depends on your data structure how it's achieved you should show your data. 

Kurt_Bremser
Super User

The proper way to post SAS datasets as data step and a macro for conversion is found here. To apply your formula to data, knowledge of the data is essential.

Shmuel
Garnet | Level 18

Can you describe your data?

I guess:

   - VOLUME is the amount of some material  in a given time

   - The Volume changes along time and measured at every half an hour

   Is your data given as an array of the measured Volume ? 

   or as a list of observations? The SAS code will differ in those two situations,

   though it is possible to convert each to the other.

 

Please correct my assumptions and add a sample of your data.

aminkarimid
Lapis Lazuli | Level 10

Hello;

This is a sample of my data:

TRD_EVENT_DTTRD_EVENT_TMTRD_EVENT_MSTRD_STCK_IDTRD_STCK_CDTRD_INSTR_NAMETRD_STCK_GRP_ID_CDTRD_PRTRD_TUROVRTRD_TRA_NR
3/24/200812:19:35 IRB5IKCO8751IKCQ1187191011N21100122197901
3/24/200812:28:01 IRO1ALBZ0001ALBZ1والبر1N1153710000122198037
3/24/200813:13:44 IRO1ALBZ0001ALBZ1والبر1N11567109999198173
3/24/200812:20:38 IRO1AZAB0001AZAB1فاذر1N268310000122197911
3/24/200813:13:44 IRO1AZAB0001AZAB1فاذر1N2695109999198167
3/24/200809:14:57 IRO1BALI0001BALI1وبوعلي1N28509260122197242
3/24/200809:15:06 IRO1BALI0001BALI1وبوعلي1N28502000122197245
3/24/200809:15:14 IRO1BALI0001BALI1وبوعلي1N285010000122197247
3/24/200809:15:24 IRO1BALI0001BALI1وبوعلي1N28506000122197249
3/24/200809:29:27 IRO1BALI0001BALI1وبوعلي1N285010000122197296
3/24/200812:28:00 IRO1BALI0001BALI1وبوعلي1N285010000122198036
3/24/200812:28:07 IRO1BALI0001BALI1وبوعلي1N285010000122198041
3/24/200813:13:44 IRO1BALI0001BALI1وبوعلي1N2865109999198133

Explanation of headers of some columns:

[TRD_EVENT_DT] Month / Day / Year
,[TRD_EVENT_TM] Time: hh:mm:ss
,[TRD_EVENT_MS] Millisecond
,[TRD_STCK_ID] 
,[TRD_STCK_CD] Stock ID 
,[TRD_INSTR_NAME] 
,[TRD_STCK_GRP_ID_CD] 
,[TRD_PR] Stock Price
,[TRD_TUROVR] Turnover
,[TRD_TRA_NR] 

 

I know how can I round time; my code is below:

 

*Step1: Rounding Time;
DATA Sampledata87_New;
SET Sampledata87;
TRD_EVENT_TIME = INPUT(TRD_EVENT_TM,time16.);
TRD_EVENT_ROUNDED = ROUND(TRD_EVENT_TIME,'00:30't);
TRD_EVENT_ROUFOR = PUT(TRD_EVENT_ROUNDED,hhmm.);

And volume is:

 

*Step2: Calculation Of Volume;
VOLUME = TRD_PR*TRD_TUROVR;

Now, I have this data;

SAS Output

Obs TRD_EVENT_DT TRD_EVENT_TM TRD_EVENT_MS TRD_STCK_ID TRD_STCK_CD TRD_INSTR_NAME TRD_STCK_GRP_ID_CD TRD_PR TRD_TUROVR TRD_TRA_NR TRD_EVENT_TIME TRD_EVENT_ROUNDED TRD_EVENT_ROUFOR VOLUME 1 2 3 4 5 6 7 8 9 10 11 12 13
03/24/200812:19:35 IRB5IKCO8751IKCQ1187191011N21100122197901443754500012:30100
03/24/200812:28:01 IRO1ALBZ0001ALBZ1H'D(11N1153710000122198037448814500012:3015370000
03/24/200813:13:44 IRO1ALBZ0001ALBZ1H'D(11N11567109999198173476244680013:0015670
03/24/200812:20:38 IRO1AZAB0001AZAB1A'011N268310000122197911444384500012:306830000
03/24/200813:13:44 IRO1AZAB0001AZAB1A'011N2695109999198167476244680013:006950
03/24/200809:14:57 IRO1BALI0001BALI1H(H9DJ1N2850926012219724233297324009:007871000
03/24/200809:15:06 IRO1BALI0001BALI1H(H9DJ1N2850200012219724533306342009:301700000
03/24/200809:15:14 IRO1BALI0001BALI1H(H9DJ1N28501000012219724733314342009:308500000
03/24/200809:15:24 IRO1BALI0001BALI1H(H9DJ1N2850600012219724933324342009:305100000
03/24/200809:29:27 IRO1BALI0001BALI1H(H9DJ1N28501000012219729634167342009:308500000
03/24/200812:28:00 IRO1BALI0001BALI1H(H9DJ1N285010000122198036448804500012:308500000
03/24/200812:28:07 IRO1BALI0001BALI1H(H9DJ1N285010000122198041448874500012:308500000
03/24/200813:13:44 IRO1BALI0001BALI1H(H9DJ1N2865109999198133476244680013:008650

 

But for volume for one day and one stock, I need sum all volume data of one day. It must Be distinguished by  TRD_STCK_CD var. .

 

 

 

Shmuel
Garnet | Level 18

You gave more variable names then data displayed.

Anyhow, it seems to me that only next variables are used in your formula:

- TRD_EVENT_DT                                 - The date, as you want sum volume per day

- TRD_STCK_ID  or TRD_STCK_CD     - One of then is enough as they relate 1:1

- TRD_PR                                              - The price

- TRD_TUROVR                                    - The turnover

 

Am I right, your code should look like:

 

data temp;

  set have;

        by TRD_EVENT_DT  TRD_STCK_CD  ;

             retain volume;

            if first.TRD_STCK_CD    then volume=0;

            volume = sum(of volume, TRD_PR *  TRD_TUROVR);

            if last.TRD_STCK_CD then output;

run;

 

Then you may need second step to compute the division.

I don't see the whole picture. Probably there is a procedur to it in one step.

 

aminkarimid
Lapis Lazuli | Level 10

Thank you;

But There is still a problem: I have shown it below:

 

 

432  data sampledata_new;
433    set sampledata87_New;
434          by TRD_EVENT_DT TRD_STCK_CD;
435               retain volume;
436              if first.TRD_STCK_CD    then volume=0;
437              volume = sum(of volume, TRD_PR *  TRD_TUROVR);
438              if last.TRD_STCK_CD then output;
439  run;

NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
      437:37   437:47
ERROR: BY variables are not properly sorted on data set WORK.SAMPLEDATA87_NEW.
TRD_EVENT_DT=03/24/2008 TRD_EVENT_TM=12:19:35 TRD_EVENT_MS=  TRD_STCK_ID=IRB5IKCO8751
TRD_STCK_CD=IKCQ1 TRD_INSTR_NAME=187191011 TRD_STCK_GRP_ID_CD=N2 TRD_PR=1 TRD_TUROVR=100
TRD_TRA_NR=122197901 TRD_EVENT_TIME=44375 TRD_EVENT_ROUNDED=45000 TRD_EVENT_ROUFOR=12:30 VOLUME=100
FIRST.TRD_EVENT_DT=1 LAST.TRD_EVENT_DT=1 FIRST.TRD_STCK_CD=1 LAST.TRD_STCK_CD=1 _ERROR_=1 _N_=1
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 2 observations read from the data set WORK.SAMPLEDATA87_NEW.
WARNING: The data set WORK.SAMPLEDATA_NEW may be incomplete.  When this step was stopped there were 0
         observations and 14 variables.
WARNING: Data set WORK.SAMPLEDATA_NEW was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.03 seconds

 

 

Thanks for your attention.

Shmuel
Garnet | Level 18

All you need is to preceed the datastep with sorting:

 

proc sort data=sampledata87_New ; by TRD_EVENT_DT TRD_STCK_CD;  run;

aminkarimid
Lapis Lazuli | Level 10
Thank you so much Shmuel, But the final results is wrong. I have described it below, in next post.
Thanks for your attention.
aminkarimid
Lapis Lazuli | Level 10

Hello everybody;

there is a sample of my data:

TRD_EVENT_DTTRD_EVENT_TM  TRD_STCK_CD  TRD_PRTRD_TUROVR   TRD_EVENT_ROUFORVOLUME
3/24/200812:28:01  ALBZ1  153710000   12:3015370000
3/24/200813:13:44  ALBZ1  156710   13:0015670
3/24/200812:20:38  AZAB1  68310000   12:306830000
3/24/200813:13:44  AZAB1  69510   13:006950
3/24/200809:14:57  BALI1  8509260    9:007871000
3/24/200809:15:06  BALI1  8502000    9:301700000
3/24/200809:15:14  BALI1  85010000    9:308500000
3/24/200809:15:24  BALI1  8506000    9:305100000
3/24/200809:29:27  BALI1  85010000    9:308500000
3/24/200812:28:00  BALI1  85010000   12:308500000
3/24/200812:28:07  BALI1  85010000   12:308500000
3/24/200813:13:44  BALI1  86510   13:008650

 

 

I have deleted some col. for simplicity. In next step, I want a table such as below:

TRD_EVENT_DTTRD_EVENT_TM  TRD_STCK_CD  TRD_PRTRD_TUROVR   TRD_EVENT_ROUFORVOLUMEVolume_Sum
3/24/200812:28:01  ALBZ1  153710000   12:3015370000 
3/24/200813:13:44  ALBZ1  156710   13:001567015385670
3/24/200812:20:38  AZAB1  68310000   12:306830000 
3/24/200813:13:44  AZAB1  69510   13:0069506836950
3/24/200809:14:57  BALI1  8509260    9:007871000 
3/24/200809:15:06  BALI1  8502000    9:301700000 
3/24/200809:15:14  BALI1  85010000    9:308500000 
3/24/200809:15:24  BALI1  8506000    9:305100000 
3/24/200809:29:27  BALI1  85010000    9:308500000 
3/24/200812:28:00  BALI1  85010000   12:308500000 
3/24/200812:28:07  BALI1  85010000   12:308500000 
3/24/200813:13:44  BALI1  86510   13:00865048679650

 

 

Please pay attention to last col. It has been generated by summing volumes that have same TRD_STCK_CD var. .So each TRD_STCK_CD obs. has just one Volume_Sum data.

Thank you.

aminkarimid
Lapis Lazuli | Level 10

Thank you Shmuel; But this code gives the last volume number ,and then It is multiplied by 2.
Thanks for your attention.

Shmuel
Garnet | Level 18

Assuming Volum_Sum is per TRD_STCK_CD in a TRD_STCK_CD  (i.e. - per date)

you shall achieve it by next code:

data sample_data;
 infile datalines;
 input TRD_EVENT_DT mmddyy10.
       TRD_EVENT_TM time8. 
	   TRD_STCK_CD  $5. 
	   TRD_PR       4.
       TRD_TUROVR   5. 
	   TRD_EVENT_ROUFOR timr5. 
	   VOLUME       8.
   ;
	   
datalines;
03/24/2008 12:28:01 ALBZ1 1537 10000 12:30 15370000
03/24/2008 13:13:44 ALBZ1 1567 10	 13:00 15670
03/24/2008 12:20:38 AZAB1 683  10000 12:30 6830000
03/24/2008 13:13:44 AZAB1 695  10	 13:00 6950
03/24/2008 09:14:57 BALI1 850  9260	 09:00 7871000
03/24/2008 09:15:06 BALI1 850  2000	 09:30 1700000
03/24/2008 09:15:14 BALI1 850  10000 09:30 8500000
03/24/2008 09:15:24 BALI1 850  6000	 09:30 5100000
03/24/2008 09:29:27 BALI1 850  10000 09:30 8500000
03/24/2008 12:28:00 BALI1 850  10000 12:30 8500000
03/24/2008 12:28:07 BALI1 850  10000 12:30 8500000
03/24/2008 13:13:44 BALI1 865  10	 13:00 8650
; run;
 
data want;
  set sample_data;
        by TRD_STCK_CD  TRD_STCK_CD  ;
            retain vol_sum;
			drop_vol_sum;
            if first.TRD_STCK_CD    then vol_sum=0;
            vol_sum = sum(of vol_sum, volume);
            if last.TRD_STCK_CD then do;
			   Volume_sum = vol_sum;
			   output;
			end;
run;

Please, next time attach the sample dataset as shown in first step.

aminkarimid
Lapis Lazuli | Level 10

Here is another solution:

http://stackoverflow.com/a/44034569/7996949

 

/*Sort by TRD_STCK_CD and temporal variables.*/
proc sort data=have out=have_sorted;
    by  TRD_STCK_CD
        TRD_EVENT_DT
        TRD_EVENT_TM;
run;

/*Sum VOLUME until the last of each TRD_STCK_CD is reached.*/
data want;
    set have_sorted;
    by  TRD_STCK_CD
        TRD_EVENT_DT
        TRD_EVENT_TM;
    retain tmp_volume_sum;
    tmp_volume_sum + VOLUME;
    if last.TRD_STCK_CD then do;
        Volume_Sum = tmp_volume_sum;
        call missing(tmp_volume_sum);
    end;
    drop tmp_:;
run;

 

 

Thank you.

aminkarimid
Lapis Lazuli | Level 10

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 1926 views
  • 2 likes
  • 4 in conversation