@phillylee wrote:
Thank you for your feedback. To answer some of your issues raised
1. multiple dates: I didn't know adding ":" would do the trick! thank you!
I should have mentioned: you are ignoring the variable SYM_SUFFIX at your peril. If a company has multiple share classes trading, then a the corresponding SYM_ROOT will have multiple SYM_SUFFIX values. By sorting your data by DATE RTIME SYM_ROOT, you risk interleaving multiple share classes for a given SYM_ROOT, thereby conflating PRICE and SIZE values for two different equities. Now it might be that your two sym_root's (AA and AAL) only have one share class, so you'd be lucky. But, for any sorting or use of BY statements, I'd use DATE SYM_ROOT SYM_SUFFIX DATE, and I'd include SYM_SUFFIX in any class statements. And for any sql statements you should match on sym_suffix as well as sym_root.
The CTM datasets (one dataset per date) are sorted by SYM_ROOT SYM_SUFFIX TIME. So I would suggest dispensing with your own sorts. In your first set statement you read multiple dates, so data set dailytrade is defacto sorted by DATE SYM_ROOT SYM_SUFFIX TIME.
Also in your by statements, use BY DATE SYM_ROOT SYM_SUFFIX TIME. Each CTM dataset is sorted by those three variables. I suspect in the case of your companies, there is probably only a blank SYM_SUFFIX (i.e. only one share class trading per SYM_ROOT). But if a given SYM_ROOT has multiple suffixes, you have a problem, because later on you sort by DATE RTIME SYM_ROOT, which means that multiple share classes can be interleaved, thereby contaminating your PRICE and SIZE values.
Also
3. event time: stocks I am studying are mostly large firms. There will most likely be trade at a given time. I will have to see whether there is any incidence where there is no trade.
You assumption is probably true for market trading hours, but you asked for time stamps well outside those hours, so be aware.
... View more