BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
pakms
Calcite | Level 5

Good days to all,

I have a SAS dataset (qtrforecast). I wish to calculate the frequency of meeting analysts' earnings forecast (MEET) in the past 4 quarters.

MEET = 1 if meeting forecast; else 0.

datadate = financial year-end.

fpedats = forecast date

cusip6 = companies' identification code

For example, for the company with cusip6 of "000360", its financial year-end = 31DEC2003, then earnings forecast for the past 4 quarters will be on 31MAR2003, 30JUN2003, 30SEP2003 and 31DEC2003. As only 1x of meeting forecast.

The "FREQ" = 1 and PERCENT_MEET = 1/4 = 0.25.

I have a problem to calculate this variable as datadate is not available for quarter 1 - 3. For observations without any datadate will be deleted.

I wish to get any useful help on this problem.

Thanks in advance for helps.

Regards,

pakms

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

My bad, I took your data for granted, not checking the details, the month of your 'datadate' actually have been seen in every possible month of a year, from Jan to Dec, all of them. Here is another attempt, let me know how it works:

proc sql noprint;

create table want as

select  a.cusip6, sum(a.meet) as FREQ,

                calculated freq/count(distinct a.FPEDATS) as  PERCENT_MEET format=percent6.2,

                b.datadate

from  (select distinct cusip6, fpedats, meet,datadate from qtrforecast)    a

inner join

(select distinct cusip6, datadate from qtrforecast(where=(not missing(datadate)))) b

on a.cusip6=b.cusip6 and intnx('qtr',b.datadate,-3,'b') <= a.fpedats <= b.datadate

group by a.cusip6, b.datadate

;

quit;

Haikuo

Updated using quarter instead of year.

View solution in original post

22 REPLIES 22
Amir
PROC Star

Hi,

I am not sure I understand your requirements. Please provide some sample output data based on the input data you have provided.

Regards,

Amir.

pakms
Calcite | Level 5

Hi Amir,

I would like to have an output data (with the following 4 variables) as follows:

cusip6    FREQ    PERCENT_MEET    DATADATE

Thank you.

Regards,

pakms

Amir
PROC Star

Hi,

For the example you were working through for a cusip6 of "000360", please give the values you would like to see for the four columns you have specified.

Regards,

Amir.

pakms
Calcite | Level 5

Hi again,

For the cusip6 of "000360", the following should be shown:

cusip6    FREQ    PERCENT_MEET    DATADATE

000360      1               0.25                 31DEC2003


Thank you.


Regards,

Pakms

pakms
Calcite | Level 5

Hi again,

For the cusip6 of "000360", the following should be shown:

cusip6    FREQ    PERCENT_MEET    DATADATE

000360      1               0.25                 31DEC2003

000360      1               0.25                 31DEC2004

000360      1               0.25                 31DEC2005

000360      3               0.75                 31DEC2006

000360      3               0.75                 31DEC2007

000360      4               1.00                 31DEC2008

000360      3               0.75                 31DEC2009

000360      1               0.25                 31DEC2010


Any duplicate, for example the qtr4 of year ended 31 Dec 2004 and 2005 should be ignored.

 

This example is clearer.

Thank you.


Regards,

Pakms

Amir
PROC Star

Hi,

The sample output data has helped and I came up with the below code, but when I checked the results they did not all look right. Then I noticed that the input data was not entirely consistent as Art has pointed out.

data want(keep=cusip6 freq percent_meet datadate);

  set tmp1.Qtrforecast;

  retain freq         0

         old_datadate 0

  ;

  freq+sum(0,meet);

  if datadate ne . then

    if datadate=old_datadate then

    do;

      link reset;

      delete;

    end;

    else

    do;

      percent_meet=freq/4;

      output;

      link reset;

    end;

  return;

  reset:

    freq=0;

    old_datadate=datadate;

  return;

run;

Regards,

Amir.

pakms
Calcite | Level 5

Thanks Amir for suggestion.

But this output did not work well as it contains the percentage of meeting earnings' forecasts > 100%.

Amir
PROC Star

Hi,

Yes, that was one of the issues I noticed. How about the following?

data want(keep=cusip6 freq percent_meet datadate);

  set tmp1.Qtrforecast;

  by cusip6;

  retain freq         0

         old_datadate 0

  ;

  if datadate ne . then

    if datadate=old_datadate then

  do;

    link reset;

    delete;

  end;

  else

  do;

    freq+sum(0,meet);

    percent_meet=freq/4;

    output;

    link reset;

  end;

  if last.cusip6 then

    link reset;

  return;

  reset:

    freq=0;

    old_datadate=datadate;

  return;

run;

I essentially moved the freq calculation.

I also added some by processing to output data for records such as cusip6=02365V, which didn't have 4 quarters of data and so would otherwise not be output. I was not sure if you wanted these out or not.

Regards,

Amir.

art297
Opal | Level 21

Pakms,

There are a couple of discrepancies that you will have to decide how you want/should approach the problem.

First, there is one duplicate cusip, namely 000000, that is shared by more than one company.

Second, there are some companies in your data that changed their fiscal year within the span of time that your data covers.  How do you want to deal with those companies?

Third, there are some companies in your data that don't have any datadate for any of their records.

Art

pakms
Calcite | Level 5

Hi Art,

1. The duplicate cusip, with more than more than one companies' name - it assumes that the company changed its company's name. - same company assumed. 

2. there are some companies in your data that changed their fiscal year within the span of time that your data covers.  How do you want to deal with those companies? I have to calculate the frequency of meeting forecast within the last 4 quarters based on the datadate. There should be 4 times of estimates

3. there are some companies in your data that don't have any datadate for any of their records. - Companies without datdate will be ignored or deleted.

Thank you for your advise and concerns.

Regards,

pakms

Haikuo
Onyx | Level 15

Hi,

Like others have pointed out, your data itself needs some work. Here is a SQL approach:

proc sql noprint;

create table want as

select  cusip6,

                intnx('year',FPEDATS,0,'end') as datadate format=date9.,

                sum(meet) as FREQ,

                calculated freq/count(distinct FPEDATS) as  PERCENT_MEET format=percent6.2

        

from  (select distinct cusip6, fpedats, meet from qtrforecast)

group by cusip6, calculated datadate ;

quit;

Haikuo

pakms
Calcite | Level 5

Thanks Haikuo for your program. But I don't think this is the answer too.

If the datadate is missing for any firm, this firm should not be included in output. For example, if the datadate is 31MAR2005. Then the past 4 quarters forecast will be on 30JUN2004, 30SEP2004, 31DEC2004 and 31MAR2005.

cusip6 with "000000" should not be included as the datadate is not available. the "distinct" function for variables of  "cusip6","fpedats" and "meet" from qtrforecas do help in data cleaning as this reduces duplication of forecast for the same period.

Thank you.

Regards,

pakms

Haikuo
Onyx | Level 15

Oops, did not notice you have that requirement. There is an easy fix on my current code, you sacrifice some performance, but it gets the job done.

proc sql noprint;

create table want as

select  cusip6,

                intnx('year',FPEDATS,0,'end') as datadate format=date9.,

                sum(meet) as FREQ,

                calculated freq/count(distinct FPEDATS) as  PERCENT_MEET format=percent6.2

from  (select distinct cusip6, fpedats, meet from qtrforecast)

group by cusip6, calculated datadate

having cats(cusip6,datadate) in (select cats(cusip6,datadate) from qtrforecast);

quit;

Haikuo

pakms
Calcite | Level 5

Hi Haikuo,

I provided the datadate in the table as follows (only cusip6 of 000360 is shown):

FPEDATScusip6datadateMEET
31-Mar-03000360.0
30-Jun-03000360.0
30-Sep-03000360.0
31-Dec-0300036031-Dec-031
31-Mar-04000360.0
30-Jun-04000360.0
30-Sep-04000360.0
31-Dec-0400036031-Dec-041
31-Dec-0400036031-Dec-041
31-Mar-05000360.1
30-Jun-05000360.0
30-Sep-05000360.0
31-Dec-0500036031-Dec-050
31-Dec-0500036031-Dec-050
31-Mar-06000360.1
30-Jun-06000360.0
30-Sep-06000360.1
31-Dec-0600036031-Dec-061
31-Mar-07000360.1
30-Jun-07000360.1
30-Sep-07000360.0
31-Dec-0700036031-Dec-071
31-Mar-08000360.1
30-Jun-08000360.1
30-Sep-08000360.1
31-Dec-080036031-Dec-081
31-Mar-0900360.1
30-Jun-0900360.0
30-Sep-0900360.1
31-Dec-090036031-Dec-091
31-Mar-1000360.0
30-Jun-1000360.0
30-Sep-1000360.0
31-Dec-100036031-Dec-101
31-Mar-1100360.0
30-Jun-1100360.0
30-Sep-1100360.0
31-Dec-1100360.0

 

FPEDATS of 31MAR2003, 30JUN2003, 30SEP2003 and 31DEC2003 are 4 quarters within the same fiscal year-end of 31DEC2003 (datadate). So, the FREQ =  0+0+0+1 = 1; PERCENT_FREQ = 1/4 = 0.25.

The output for your suggested program - all the observations have the same financial year-end which is incorrect. Some firms might not have the same fiscal year-end of 31DEC.

If firms with year-end of 30SEP2004, then 31DEC2003, 31MAR2004, 30JUN2004 and 30SEP2004 should be within the same datadate of 30SEP2004.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 22 replies
  • 1652 views
  • 6 likes
  • 4 in conversation