BookmarkSubscribeRSS Feed
mspak
Quartz | Level 8

Dear all,

I wish match each firm-year (tic-fyear) observation with another from the same two-digit SIC code (sic2) and year (fyear) with the closest return on assets in the current year, ROA1. Then, I wish to obtain a performance-matched AEM (ie Jones_PDA) for all firm-years,  by subtracting the AEM of the firm with the closest ROA that is in the same industry as the firm.

The closest ROA = min(abs(a.ROA1, -b.ROA1)). The lowest difference between ROA1 would be defined as closest ROA1.

I ran a sql using the following program, however, it provides Jones_PDA for all the matched firms. I only wish to have the Jones_PDA for the matched firm with closest ROA instead. It does mean that I wish to have Jones_PDA for only ONE matched firm.

proc sql;

create table Jones_matchedROA1 as

select a.tic,a.fyear,a.sic2,

min(abs(a.ROA1, -b.ROA1)) as closest_ROA1,

sum(a.aem,-b.aem) as Jones_PDA

from geog.data a, geog.data b

where a.tic ne b.tic and a.fyear=b.fyear and a.sic2=b.sic2

group by a.tic,a.fyear,a.sic2

order by a.tic,a.fyear;

quit;

Could anyone guide me on how to modify the program/code to achieve my desired output?

Thank you in advance for advices.

Regards,

mspak

11 REPLIES 11
Ksharp
Super User

I can't open your attachment. Maybe it is corroded.

And What outpu do you want ?

If you only need to remove replicated observations. Proc sort + nodupkey is a good tool.

Ksharp

mspak
Quartz | Level 8

Thanks Ksharp,

I changed the file name as "mydata" for this new file. I have the following variables:

1. tic - company identification code

2. fyear - financial year

3. ROA1 - performance measure, returns on assets

4. SIC2 - 2-digit SIC industry classification code

5. AEM - it is a residual as an output from proc reg

I wish to have an output file with the following columns:

1. tic

2. fyear

3. Jones_PDA

Jones_PDA is the difference between the firm's AEM with the another matched firm in the following criteria:

a) same 2-digit SIC code

b) same fyear

c) with the closest ROA1.

The lowest difference between ROA1 of the firm itself and the matched firm would be defined as closest ROA1. As such, I calculate the difference as = min(abs(a.ROA1, -b.ROA1))

There are many firms can be matched with the firm based on SIC code and fyear. However, I would wish to restrict the ONLY ONE matched firm, that is, the one with the lowest difference in terms of their ROA1. In order to avoid self-matched, so I provide a restriction a.tic ne b.tic.

Finally, I would wish to calculate the differences in AEM between the firm and the matched firm (ie. Jones_PDA).

I hope this make my points clearer.

Thank you in advance.

mspak

mspak
Quartz | Level 8

Hi again,

One correction:

The lowest difference between ROA1 of the firm itself and the matched firm would be defined as closest ROA1. As such, I calculate the difference as = min(abs(sum(a.ROA1, -b.ROA1))).

1. the difference between ROA1

2. as the difference can be positive and negative, so I used ABS (absolute value).

3. In order to further restrict to only firm, so I should choose the lowest value.

Thanks again.

mspak

mspak
Quartz | Level 8

Hi all,

This is the file. I tried to upload the file from my notebook's desktop instead of my hard disk. I am not sure whether it works.

mspak

mspak
Quartz | Level 8

Hi all,

I tried to open this file. This data can be opened.

Thank you.

Regards,

MEI SEN

Ksharp
Super User

OK. MEI SEN

I noticed a problem in your code.

min(abs(a.ROA1-b.ROA1)) as closest_ROA1,

sum(a.aem-b.aem) as Jones_PDA

You don't need a comma between them.

If you remove comma, maybe the result is what you need.

Ksharp

mspak
Quartz | Level 8

Thanks Ksharp again,

As I used sum() function, so I include comma between them. If I remove the comma, the following error message appeared:

ERROR: Summary functions nested in this way are not supported.

I used SUM instead of operator of MINUS (ie. -) as SUM will provide a value even there is missing value either on a.ROA1 or b.ROA1.

Regards,

Mspak

mspak
Quartz | Level 8

Hi Ksharp,

I ran the following program, it seems leading to my desired output:

proc sql;

create table Jones_matchedROA1 as

select unique a.tic,a.fyear,a.sic2,a.aem, a.roa1, b.roa1,

abs(sum(a.ROA1, -b.ROA1))as DIFF_ROA1, min(calculated DIFF_ROA1)as CLOSEST_ROA1,

sum(a.aem,-b.aem)as Jones_PDA

from geog.mydata a, geog.mydata b

where a.tic ne b.tic and a.fyear=b.fyear and a.sic2=b.sic2

group by a.tic,a.fyear,a.sic2

having DIFF_ROA1=CLOSEST_ROA1;

quit;

Only ONE observation per firm-year will be derived by providing "having DIFF_ROA1=CLOSEST_ROA1" clause, that is the one with lowest difference.

Can anyone confirm this is correct? 

Thank you very much.

Regards,

MEI SEN

Ksharp
Super User

En....

You code looks good.

having DIFF_ROA1=CLOSEST_ROA1;

Only pick up the observation which has the minimize value for each group.

Ksharp

mspak
Quartz | Level 8

Thanks Ksharp and other experts' help,

I have learned a lot from the discussion in SAS forum.

Smiley HappySmiley HappySmiley Happy

Regards,

mspak

fafrin420
Fluorite | Level 6

Hi,

Good afternoon. I am working on the same replication paper: SP Kothari 2005. Could you tell me if the you have encountered the problem that a performance matched firm can bot be obtained? I mean I used your code with the modification of Ksharp. I did not find any missing var for the ***PDAs.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 3327 views
  • 0 likes
  • 3 in conversation