turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- General Programming
- /
- How to calculate a ratio using time series data

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-28-2013 12:39 PM

Hi good day everyone,

I am at present, a complete novice at using SAS, and I am tasked with having to utilize this software to conduct statistical analysis for a project. Thus far using ‘proc tabulate’, I have been able to garner the basic statistics necessary for describing the financial data. The SAS data file consists of the variables: stock symbol (ticker), the volume traded (tvol), trade price (tprice), the trade value (tval), the client who initiated the trade (client), trade date (tdate), the time of the trade (rt), the sign indicating whether the trade initiated was a buy/sell (sign) as well as industry sector (sector) and the market the stock traded belongs to (market). However, the major part of the data analysis requires me to calculate the Sias (2004) financial herding measure.

The calculation process for the measure consists of multiple steps, at the moment though, I am having trouble getting past step (1):

In this scenario a client/institution is considered a buyer if their overall activity in stock *i*, during the time period t (measured in weeks) results in a cumulative buy.

If I can determine the number of clients that are buying, that would put me a long way ahead to completing the calculation for the measure.

Thus far I have attempted to clean the data as well as calculate the cumulative position (stvol) from the trades in hopes that it might enable me to determine Step (1):

data ttsetrades;

set TMP1.ttsetrades;

run;

data a (drop = market ticket buyer seller);

set work.ttsetrades ( where = (market = 'First Tier'));

run;

proc sort;

by client ticker tdate;

run;

data b;

do until (last.tdate);

set a;

by client ticker tdate;

stvol = sum(stvol, tvol*sign);

end;

run;

Firstly am I heading in the right direction? And if I am how do I move on from here to find the ratio?

Any help given will be so much appreciated.

Thanks.

Aleres.

Accepted Solutions

Solution

06-29-2013
10:40 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-29-2013 10:40 PM

Hi Aleres, you were getting pretty close. The main problem was the reference to the wrong variable in the DO UNTIL statement.This is what I woulld do to get the *financial herding measures* (untested):

**data a (drop = market buyer seller tdate);****set TMP1.ttsetrades(where = (market = 'First Tier'));****timePeriod = intnx("WEEK", tdate, 0);****format timePeriod date9.;****run;**

** **

**proc sort data=a; **

**by ticker timePeriod client;****run;**

** **

**data b(drop=tvol sign);****do until (last.client);**** set a; **** by ticker timePeriod client;**** stvol = sum(stvol, tvol*sign);**** end;****run;**

** **

**data herding(keep=timePeriod ticker buyingClients totalClients herding);****buyingClients = 0; totalClients = 0;****do until(last.timePeriod);**** set b; **** by ticker timePeriod;**** buyingClients + (stvol > 0);**** totalClients + 1;**** end;****herding = buyingClients/totalClients;****run;**

PG

Message was edited by: PG Added a weekly time period.

PG

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-29-2013 10:15 PM

It's great that you explain the data and posted a formula. It's still kind of hard for me to fully understand your data though.

I believe you would get the best answers if you can post some data sample (eg. a data step creating such a sample and/or a csv attachment).

Then explain what you have and how the desired end result should look like. Give us the expected measure as a number for a specific case for which you also provide source data. This will allow us to test our code proposals.

Please give us also an idea about the volumes you're dealing with (thousands or hundred of millions of rows?) and where your source data is stored (a SAS table or a data base table).

You explain how a buyer is defined. What's the definition of a seller?

Below some code creating a table with buyer sums per week - but as I don't understand your data it won't be correct.

data work.ttsetrades;

infile datalines truncover dlm=',' dsd;

input client $ ticker $ tdate:date9. tvol sign;

format tdate date9.;

market='First Tier';

datalines;

A,X,01jun2013,5,1

A,X,02jun2013,4,-1

A,X,03jun2013,9,-1

A,X,04jun2013,6,1

B,X,01jun2013,5,1

B,X,02jun2013,4,-1

B,X,03jun2013,9,1

B,X,04jun2013,6,1

;

run;

proc sql;

create table buyer as

select

client

,ticker

,intnx('week',tdate,0,'b') as Week_Commencing_Date format=date9.

,sum(tvol*sign) as stvol

,sum(tvol*sign)>0 as buyer_flag

from work.ttsetrades

where market='First Tier'

group by client ,ticker ,calculated Week_Commencing_Date

;

create table buyer_sum as

select ticker ,Week_Commencing_Date ,sum(buyer_flag) as buyer_sum

from buyer

group by ticker, Week_Commencing_Date

;

quit;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-01-2013 07:52 AM

Hi Patrick

Thanks for responding.

Regarding your comments, I am dealing with about 205,00 trades/observations before selecting only those in the first tier market, concerning the source in which my data is stored it’s a SAS data set.

Concerning how the seller would be defined, I determined that once the number of buyers was determined then # of sellers = total - # of buyers…. however it just occurred to me that there may be clients that are neither sellers nor buyers.

As I have confirmed that that is the case, I define that a seller is a client with an stvol <0 and a buyer is a client with stvol >0?

Solution

06-29-2013
10:40 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-29-2013 10:40 PM

Hi Aleres, you were getting pretty close. The main problem was the reference to the wrong variable in the DO UNTIL statement.This is what I woulld do to get the *financial herding measures* (untested):

**data a (drop = market buyer seller tdate);****set TMP1.ttsetrades(where = (market = 'First Tier'));****timePeriod = intnx("WEEK", tdate, 0);****format timePeriod date9.;****run;**

** **

**proc sort data=a; **

**by ticker timePeriod client;****run;**

** **

**data b(drop=tvol sign);****do until (last.client);**** set a; **** by ticker timePeriod client;**** stvol = sum(stvol, tvol*sign);**** end;****run;**

** **

**data herding(keep=timePeriod ticker buyingClients totalClients herding);****buyingClients = 0; totalClients = 0;****do until(last.timePeriod);**** set b; **** by ticker timePeriod;**** buyingClients + (stvol > 0);**** totalClients + 1;**** end;****herding = buyingClients/totalClients;****run;**

PG

Message was edited by: PG Added a weekly time period.

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-01-2013 07:55 AM

Hi PG

Thank you, I ran through the code you sent and it looks great thus far, I do have some questions though….

If I want to aggregate the herding values of each ticker to yearly using the weekly values calculated from the code you sent, how would I go about doing so?

Also regarding my concern with there being clients that have stvol = 0, i.e. neither the buyers nor sellers, can I do the following?

--Between the ‘data b’ step and the ‘data herding' step from your code…

data b(drop=tvol sign);

do until (last.client);

set a;

by ticker timePeriod client;

stvol = sum(stvol, tvol*sign);

end;

run;

data c11;

set b ;

where stvol <0;

run;

data c12;

set b1;

where stvol >0;

run;

data c;

set c11 c12;

by ticker timePeriod client;

run;

data herding(keep=timePeriod ticker buyingClients totalClients herding);

buyingClients = 0; totalClients = 0;

do until(last.timePeriod);

set c;

by ticker timePeriod;

buyingClients + (stvol > 0);

totalClients + 1;

end;

herding = buyingClients/totalClients;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-01-2013 11:43 AM

You would have to be more precise about what kind of aggregation you need.

To exclude stvol=0 cases from the calculations, note that they are already excluded in the statement **buyingClients + (stvol > 0);** all you need, is to change **totalClients + 1;** to **totalClients + (stvol ne 0);** This might generate some warnings when dividing by zero, so you might want to test for that and generate a missing herding value or skip the time period.

PG

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-01-2013 06:12 PM

Regarding the aggregation, I would like to be able to report for a particular stock(s), an annual herding value. For the particular stock/ticker, can I take all the weekly herding values calculated, for a specific year, and average them to create the annual herding value that I'm seeking? Hope that I explained that better...

Also can you advise me in step 2 & 3 shown below...

for step (2) I did:

**data herdratio;**

** set herding;**

** zherding = herding;**

**run;**

**proc standard data = herdratio mean = 0 std = 1 out = zherdingratio**

** var zherding;**

**run;**

step (3) on the other hand completely befuddles me... I've come across multiple procedures including **proc reg **which I tried and didn't work, for I believe I was missing information.

**proc reg data = zherdingratio;**

** model zherding = beta*lag(****herding) + e; *----- problem defining beta--------; **

**run;**

I also looked at a sample for how to estimate an AR(1) Process (as step (3) is such a process) but they referred to the system of equations below.

Aleres

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-01-2013 09:53 PM

To get yearly average of weekly herding, replace the last step by :

**data herding(keep=year timePeriod ticker buyingClients totalClients herding);****buyingClients = 0; totalClients = 0;****do until(last.timePeriod);**** set b; **** by ticker timePeriod;**** buyingClients + (stvol > 0);**** totalClients + (stvol ne 0);**** end;****herding = buyingClients/totalClients;****year = intnx("YEAR",timePeriod,0);****format year year4.;****run;**

**proc means data=herding noprint nway;****by ticker year;****var herding buyingClients totalClients;****output out=yearlyHerding **** mean(herding)=meanHerding **** sum(buyingClients totalClients)=yearlyBuying yearlyClients;****run;**

Note that weeks without activity for a given stock will not be included in the yearly average and that all weeks with some activity will contribute with the same weight to the yearly average.

PG

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-05-2013 11:41 AM

Hi PG,

I have a question, is it possible to conduct Binomial Distribution testing on the herding ratio values that were initially calculated? I came across an article that utilised the binomial testing to indicate whether the herding ratios found were statistically significant i n indicating buy/sell herding within a stock.

The authors of the article infer that herding was "associated to the observation of a high value or low value of *h* by evaluating the probability to observe a number of buying firms equal or larger than the empirically detected one under the binomial null hypothesis." (Lillo et al. 2008)

They tested using a significance level of 5%, ratios that generated probability values less that 5% were inferred to indicate herding.

My research thus far in binomial testing has led me to the following code, if binomial testing on the ratios is possible can you tell me if the code below is correct, thanks.

**proc freq data=herding;**

** tables Herding / Binomial (p=0.50) alpha=0.05;**

** exact binomial;**

** weight freq;**

**run;**

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-05-2013 04:09 PM

You might benefit from reposting this question to the Statistical Procedures Community (with some contextual introduction).

The binomial test reports the probability of observing a proportion further away from expected, given the number of INDEPENDENT observations performed.

PG

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-06-2013 09:52 AM

Those ratios look bounded between zero and one, so they probably follow some sort of beta distribution. A generalized linear model (GENMOD or GLIMMIX) could answer the OP's question.

Steve Denham

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-01-2013 10:21 PM

In step 2, you are standardizing by the average and standard deviation of ALL herding values (all stocks, all years), is that what's required?

In step 3, you need a regression coefficient for each year, the same for all stocks. Is that right?

PG

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-04-2013 09:27 PM

Hi, sorry for the delay, however after a meeting it was decided that the Sias method would be somewhat inappropriate for the research.

I just want to say thank you for all your help for it was much appreciated.

Aleres