Hi Astounding,
1) Thanks very much for your reply. I apologize for my unclear statement. Actually, there are some data missing within the 100 stocks, most likely for different one month. So, the date will not be continuously from 01-01-2015 to 31-12-2017, instead, some stocks will miss data for Jan 2017 or Nov 2017, etc. The interval will be continuously repeated from 00:00:00 to 23:59:00 for each date. Obviously, not every minute has a trading record. In this case, there will be an issue arise since I want to calculate the market return based on the same one-minute interval (i.e. by date interval). In this case, I have to exclude the stock which missing data for the specific month and include it back for later month calculation.
2) For example, let's say there are 5 stocks (A B C D and E). If there is no missing data, the one minute market return of A equals the accumulate same one minute return of BCDE divided by 4. However, if there is missing data on JAN 2017 for stock B, then for this month the market return for A should equal accumulate return of CDE divided by 3 (given stock B has no available data in this month).
3) The good thing is I already calculate return by using the following code ('midquote' is a variable that I use to calculate return but I did not show it in the following pics):
%macro flower(sourcelib=,from=,going=);
proc sql noprint; /*read datasets in a library*/
create table mytables as
select *
from dictionary.tables
where libname = &sourcelib
order by memname ;
select count(memname)
into:obs
from mytables;
%let obs=&obs.;
select memname
into : memname1-:memname&obs.
from mytables;
quit;
%do i=1 %to &obs.;
data
&going.&&memname&i;
set
&from.&&memname&i;
by date_g_ interval;
if last.interval then midquote = (Avg_BP+Avg_AP)/2;
run;
data
&going.&&memname&i;
set
&from.&&memname&i;
by date_g_ interval;
if last.interval then output;
run;
data
&going.&&memname&i;
set
&going.&&memname&i;
return = log(midquote/lag1(midquote));
run;
data
&going.&&memname&i;
set
&from.&&memname&i;
merge &going.&&memname&i &from.&&memname&i;
by date_g_ interval;
if last.interval then stock_return = return;
run;
%end;
%mend;
%flower(sourcelib='BEE',from=BEE.,going=LEAF.);
4) Here is the selected partial data sample for the ' last.interval ' in the attachment, data of each stock looks pretty much the same, just has different '_RIC'. As I mentioned before, I have 100 datasets for 100 stocks. The biggest issue for me is how to calculate the market return for each stock if there is the issue arise as I described in part 2).
I know I have to use %macro and %do loop to do this. But I don't know how to accumulate the value 'by date_g_ interval' from the rest different datasets when I calculate the market return for stock A at e.g. 01-01-2015 14:00:00, it means I should only accumulate the rest of stocks which have data for this specific time.
I hope I describe it better, and I sincerely appreciate your help.