BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
aminkarimid
Lapis Lazuli | Level 10

Hello experts.

Please help me how can I run this formula;

NormalizedVolume.PNG

Thanks for your attention.

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

Assuming Volum_Sum is per TRD_STCK_CD in a TRD_STCK_CD  (i.e. - per date)

you shall achieve it by next code:

data sample_data;
 infile datalines;
 input TRD_EVENT_DT mmddyy10.
       TRD_EVENT_TM time8. 
	   TRD_STCK_CD  $5. 
	   TRD_PR       4.
       TRD_TUROVR   5. 
	   TRD_EVENT_ROUFOR timr5. 
	   VOLUME       8.
   ;
	   
datalines;
03/24/2008 12:28:01 ALBZ1 1537 10000 12:30 15370000
03/24/2008 13:13:44 ALBZ1 1567 10	 13:00 15670
03/24/2008 12:20:38 AZAB1 683  10000 12:30 6830000
03/24/2008 13:13:44 AZAB1 695  10	 13:00 6950
03/24/2008 09:14:57 BALI1 850  9260	 09:00 7871000
03/24/2008 09:15:06 BALI1 850  2000	 09:30 1700000
03/24/2008 09:15:14 BALI1 850  10000 09:30 8500000
03/24/2008 09:15:24 BALI1 850  6000	 09:30 5100000
03/24/2008 09:29:27 BALI1 850  10000 09:30 8500000
03/24/2008 12:28:00 BALI1 850  10000 12:30 8500000
03/24/2008 12:28:07 BALI1 850  10000 12:30 8500000
03/24/2008 13:13:44 BALI1 865  10	 13:00 8650
; run;
 
data want;
  set sample_data;
        by TRD_STCK_CD  TRD_STCK_CD  ;
            retain vol_sum;
			drop_vol_sum;
            if first.TRD_STCK_CD    then vol_sum=0;
            vol_sum = sum(of vol_sum, volume);
            if last.TRD_STCK_CD then do;
			   Volume_sum = vol_sum;
			   output;
			end;
run;

Please, next time attach the sample dataset as shown in first step.

View solution in original post

14 REPLIES 14
Reeza
Super User

Can you show that with a data sample instead and expected output? 

This is based on the assumption you understand the formula and are looking for a technical answer on how to implement it given data. Since it partly depends on your data structure how it's achieved you should show your data. 

Kurt_Bremser
Super User

The proper way to post SAS datasets as data step and a macro for conversion is found here. To apply your formula to data, knowledge of the data is essential.

Shmuel
Garnet | Level 18

Can you describe your data?

I guess:

   - VOLUME is the amount of some material  in a given time

   - The Volume changes along time and measured at every half an hour

   Is your data given as an array of the measured Volume ? 

   or as a list of observations? The SAS code will differ in those two situations,

   though it is possible to convert each to the other.

 

Please correct my assumptions and add a sample of your data.

aminkarimid
Lapis Lazuli | Level 10

Hello;

This is a sample of my data:

TRD_EVENT_DTTRD_EVENT_TMTRD_EVENT_MSTRD_STCK_IDTRD_STCK_CDTRD_INSTR_NAMETRD_STCK_GRP_ID_CDTRD_PRTRD_TUROVRTRD_TRA_NR
3/24/200812:19:35 IRB5IKCO8751IKCQ1187191011N21100122197901
3/24/200812:28:01 IRO1ALBZ0001ALBZ1والبر1N1153710000122198037
3/24/200813:13:44 IRO1ALBZ0001ALBZ1والبر1N11567109999198173
3/24/200812:20:38 IRO1AZAB0001AZAB1فاذر1N268310000122197911
3/24/200813:13:44 IRO1AZAB0001AZAB1فاذر1N2695109999198167
3/24/200809:14:57 IRO1BALI0001BALI1وبوعلي1N28509260122197242
3/24/200809:15:06 IRO1BALI0001BALI1وبوعلي1N28502000122197245
3/24/200809:15:14 IRO1BALI0001BALI1وبوعلي1N285010000122197247
3/24/200809:15:24 IRO1BALI0001BALI1وبوعلي1N28506000122197249
3/24/200809:29:27 IRO1BALI0001BALI1وبوعلي1N285010000122197296
3/24/200812:28:00 IRO1BALI0001BALI1وبوعلي1N285010000122198036
3/24/200812:28:07 IRO1BALI0001BALI1وبوعلي1N285010000122198041
3/24/200813:13:44 IRO1BALI0001BALI1وبوعلي1N2865109999198133

Explanation of headers of some columns:

[TRD_EVENT_DT] Month / Day / Year
,[TRD_EVENT_TM] Time: hh:mm:ss
,[TRD_EVENT_MS] Millisecond
,[TRD_STCK_ID] 
,[TRD_STCK_CD] Stock ID 
,[TRD_INSTR_NAME] 
,[TRD_STCK_GRP_ID_CD] 
,[TRD_PR] Stock Price
,[TRD_TUROVR] Turnover
,[TRD_TRA_NR] 

 

I know how can I round time; my code is below:

 

*Step1: Rounding Time;
DATA Sampledata87_New;
SET Sampledata87;
TRD_EVENT_TIME = INPUT(TRD_EVENT_TM,time16.);
TRD_EVENT_ROUNDED = ROUND(TRD_EVENT_TIME,'00:30't);
TRD_EVENT_ROUFOR = PUT(TRD_EVENT_ROUNDED,hhmm.);

And volume is:

 

*Step2: Calculation Of Volume;
VOLUME = TRD_PR*TRD_TUROVR;

Now, I have this data;

SAS Output

Obs TRD_EVENT_DT TRD_EVENT_TM TRD_EVENT_MS TRD_STCK_ID TRD_STCK_CD TRD_INSTR_NAME TRD_STCK_GRP_ID_CD TRD_PR TRD_TUROVR TRD_TRA_NR TRD_EVENT_TIME TRD_EVENT_ROUNDED TRD_EVENT_ROUFOR VOLUME 1 2 3 4 5 6 7 8 9 10 11 12 13
03/24/200812:19:35 IRB5IKCO8751IKCQ1187191011N21100122197901443754500012:30100
03/24/200812:28:01 IRO1ALBZ0001ALBZ1H'D(11N1153710000122198037448814500012:3015370000
03/24/200813:13:44 IRO1ALBZ0001ALBZ1H'D(11N11567109999198173476244680013:0015670
03/24/200812:20:38 IRO1AZAB0001AZAB1A'011N268310000122197911444384500012:306830000
03/24/200813:13:44 IRO1AZAB0001AZAB1A'011N2695109999198167476244680013:006950
03/24/200809:14:57 IRO1BALI0001BALI1H(H9DJ1N2850926012219724233297324009:007871000
03/24/200809:15:06 IRO1BALI0001BALI1H(H9DJ1N2850200012219724533306342009:301700000
03/24/200809:15:14 IRO1BALI0001BALI1H(H9DJ1N28501000012219724733314342009:308500000
03/24/200809:15:24 IRO1BALI0001BALI1H(H9DJ1N2850600012219724933324342009:305100000
03/24/200809:29:27 IRO1BALI0001BALI1H(H9DJ1N28501000012219729634167342009:308500000
03/24/200812:28:00 IRO1BALI0001BALI1H(H9DJ1N285010000122198036448804500012:308500000
03/24/200812:28:07 IRO1BALI0001BALI1H(H9DJ1N285010000122198041448874500012:308500000
03/24/200813:13:44 IRO1BALI0001BALI1H(H9DJ1N2865109999198133476244680013:008650

 

But for volume for one day and one stock, I need sum all volume data of one day. It must Be distinguished by  TRD_STCK_CD var. .

 

 

 

Shmuel
Garnet | Level 18

You gave more variable names then data displayed.

Anyhow, it seems to me that only next variables are used in your formula:

- TRD_EVENT_DT                                 - The date, as you want sum volume per day

- TRD_STCK_ID  or TRD_STCK_CD     - One of then is enough as they relate 1:1

- TRD_PR                                              - The price

- TRD_TUROVR                                    - The turnover

 

Am I right, your code should look like:

 

data temp;

  set have;

        by TRD_EVENT_DT  TRD_STCK_CD  ;

             retain volume;

            if first.TRD_STCK_CD    then volume=0;

            volume = sum(of volume, TRD_PR *  TRD_TUROVR);

            if last.TRD_STCK_CD then output;

run;

 

Then you may need second step to compute the division.

I don't see the whole picture. Probably there is a procedur to it in one step.

 

aminkarimid
Lapis Lazuli | Level 10

Thank you;

But There is still a problem: I have shown it below:

 

 

432  data sampledata_new;
433    set sampledata87_New;
434          by TRD_EVENT_DT TRD_STCK_CD;
435               retain volume;
436              if first.TRD_STCK_CD    then volume=0;
437              volume = sum(of volume, TRD_PR *  TRD_TUROVR);
438              if last.TRD_STCK_CD then output;
439  run;

NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
      437:37   437:47
ERROR: BY variables are not properly sorted on data set WORK.SAMPLEDATA87_NEW.
TRD_EVENT_DT=03/24/2008 TRD_EVENT_TM=12:19:35 TRD_EVENT_MS=  TRD_STCK_ID=IRB5IKCO8751
TRD_STCK_CD=IKCQ1 TRD_INSTR_NAME=187191011 TRD_STCK_GRP_ID_CD=N2 TRD_PR=1 TRD_TUROVR=100
TRD_TRA_NR=122197901 TRD_EVENT_TIME=44375 TRD_EVENT_ROUNDED=45000 TRD_EVENT_ROUFOR=12:30 VOLUME=100
FIRST.TRD_EVENT_DT=1 LAST.TRD_EVENT_DT=1 FIRST.TRD_STCK_CD=1 LAST.TRD_STCK_CD=1 _ERROR_=1 _N_=1
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 2 observations read from the data set WORK.SAMPLEDATA87_NEW.
WARNING: The data set WORK.SAMPLEDATA_NEW may be incomplete.  When this step was stopped there were 0
         observations and 14 variables.
WARNING: Data set WORK.SAMPLEDATA_NEW was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.03 seconds

 

 

Thanks for your attention.

Shmuel
Garnet | Level 18

All you need is to preceed the datastep with sorting:

 

proc sort data=sampledata87_New ; by TRD_EVENT_DT TRD_STCK_CD;  run;

aminkarimid
Lapis Lazuli | Level 10
Thank you so much Shmuel, But the final results is wrong. I have described it below, in next post.
Thanks for your attention.
aminkarimid
Lapis Lazuli | Level 10

Hello everybody;

there is a sample of my data:

TRD_EVENT_DTTRD_EVENT_TM  TRD_STCK_CD  TRD_PRTRD_TUROVR   TRD_EVENT_ROUFORVOLUME
3/24/200812:28:01  ALBZ1  153710000   12:3015370000
3/24/200813:13:44  ALBZ1  156710   13:0015670
3/24/200812:20:38  AZAB1  68310000   12:306830000
3/24/200813:13:44  AZAB1  69510   13:006950
3/24/200809:14:57  BALI1  8509260    9:007871000
3/24/200809:15:06  BALI1  8502000    9:301700000
3/24/200809:15:14  BALI1  85010000    9:308500000
3/24/200809:15:24  BALI1  8506000    9:305100000
3/24/200809:29:27  BALI1  85010000    9:308500000
3/24/200812:28:00  BALI1  85010000   12:308500000
3/24/200812:28:07  BALI1  85010000   12:308500000
3/24/200813:13:44  BALI1  86510   13:008650

 

 

I have deleted some col. for simplicity. In next step, I want a table such as below:

TRD_EVENT_DTTRD_EVENT_TM  TRD_STCK_CD  TRD_PRTRD_TUROVR   TRD_EVENT_ROUFORVOLUMEVolume_Sum
3/24/200812:28:01  ALBZ1  153710000   12:3015370000 
3/24/200813:13:44  ALBZ1  156710   13:001567015385670
3/24/200812:20:38  AZAB1  68310000   12:306830000 
3/24/200813:13:44  AZAB1  69510   13:0069506836950
3/24/200809:14:57  BALI1  8509260    9:007871000 
3/24/200809:15:06  BALI1  8502000    9:301700000 
3/24/200809:15:14  BALI1  85010000    9:308500000 
3/24/200809:15:24  BALI1  8506000    9:305100000 
3/24/200809:29:27  BALI1  85010000    9:308500000 
3/24/200812:28:00  BALI1  85010000   12:308500000 
3/24/200812:28:07  BALI1  85010000   12:308500000 
3/24/200813:13:44  BALI1  86510   13:00865048679650

 

 

Please pay attention to last col. It has been generated by summing volumes that have same TRD_STCK_CD var. .So each TRD_STCK_CD obs. has just one Volume_Sum data.

Thank you.

aminkarimid
Lapis Lazuli | Level 10

Thank you Shmuel; But this code gives the last volume number ,and then It is multiplied by 2.
Thanks for your attention.

Shmuel
Garnet | Level 18

Assuming Volum_Sum is per TRD_STCK_CD in a TRD_STCK_CD  (i.e. - per date)

you shall achieve it by next code:

data sample_data;
 infile datalines;
 input TRD_EVENT_DT mmddyy10.
       TRD_EVENT_TM time8. 
	   TRD_STCK_CD  $5. 
	   TRD_PR       4.
       TRD_TUROVR   5. 
	   TRD_EVENT_ROUFOR timr5. 
	   VOLUME       8.
   ;
	   
datalines;
03/24/2008 12:28:01 ALBZ1 1537 10000 12:30 15370000
03/24/2008 13:13:44 ALBZ1 1567 10	 13:00 15670
03/24/2008 12:20:38 AZAB1 683  10000 12:30 6830000
03/24/2008 13:13:44 AZAB1 695  10	 13:00 6950
03/24/2008 09:14:57 BALI1 850  9260	 09:00 7871000
03/24/2008 09:15:06 BALI1 850  2000	 09:30 1700000
03/24/2008 09:15:14 BALI1 850  10000 09:30 8500000
03/24/2008 09:15:24 BALI1 850  6000	 09:30 5100000
03/24/2008 09:29:27 BALI1 850  10000 09:30 8500000
03/24/2008 12:28:00 BALI1 850  10000 12:30 8500000
03/24/2008 12:28:07 BALI1 850  10000 12:30 8500000
03/24/2008 13:13:44 BALI1 865  10	 13:00 8650
; run;
 
data want;
  set sample_data;
        by TRD_STCK_CD  TRD_STCK_CD  ;
            retain vol_sum;
			drop_vol_sum;
            if first.TRD_STCK_CD    then vol_sum=0;
            vol_sum = sum(of vol_sum, volume);
            if last.TRD_STCK_CD then do;
			   Volume_sum = vol_sum;
			   output;
			end;
run;

Please, next time attach the sample dataset as shown in first step.

aminkarimid
Lapis Lazuli | Level 10

Here is another solution:

http://stackoverflow.com/a/44034569/7996949

 

/*Sort by TRD_STCK_CD and temporal variables.*/
proc sort data=have out=have_sorted;
    by  TRD_STCK_CD
        TRD_EVENT_DT
        TRD_EVENT_TM;
run;

/*Sum VOLUME until the last of each TRD_STCK_CD is reached.*/
data want;
    set have_sorted;
    by  TRD_STCK_CD
        TRD_EVENT_DT
        TRD_EVENT_TM;
    retain tmp_volume_sum;
    tmp_volume_sum + VOLUME;
    if last.TRD_STCK_CD then do;
        Volume_Sum = tmp_volume_sum;
        call missing(tmp_volume_sum);
    end;
    drop tmp_:;
run;

 

 

Thank you.

aminkarimid
Lapis Lazuli | Level 10

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 3184 views
  • 2 likes
  • 4 in conversation