turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- match data find means

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-31-2013 09:46 PM

Hi, all

My unit of analysis is firm-year. main interest is variable disclose (binary 0,1).

First, I will partition the sample into two sets : set ONE contains firm-year observations that have disclose=0, and set TWO contains firm-year observations that have disclose=1,

What I want to know is : for set ONE observations that share the same industry code with set ONE observations, what is the mean value of disclose,

Similarly, for set TWO observations that share the same industry code with set ONE observations, what is the mean value of disclose,

that is, I try to show whether industry peers in set ONE are more or less likely to disclose compared with industry peers in set TWO.

What would be the best way of coding this? Attached is hypothetical data. Feel free to expand it.

Thanks!

firm | year | industry | disclose |

a | 2000 | 1 | 0 |

a | 2001 | 1 | 0 |

a | 2002 | 1 | 1 |

a | 2003 | 1 | 1 |

b | 2000 | 1 | 0 |

b | 2001 | 1 | 1 |

c | 2001 | 2 | 0 |

c | 2002 | 2 | 1 |

c | 2003 | 2 | 1 |

c | 2004 | 2 | 1 |

d | 2001 | 2 | 0 |

d | 2002 | 2 | 1 |

d | 2003 | 2 | 0 |

e | 2001 | 2 | 0 |

e | 2002 | 2 | 1 |

e | 2003 | 2 | 1 |

e | 2004 | 2 | 1 |

Lan

Accepted Solutions

Solution

09-02-2013
10:24 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-02-2013 10:24 PM

From the little I understand, I looks like you want to do this:

**data test;****input firm $ year industry disclose;****datalines; ****a 2000 1 0 ****a 2001 1 0 ****b 2000 1 0 ****c 2001 1 0 ****d 2001 2 0 ****d 2002 2 0 ****e 2001 2 0 ****a 2002 1 1 ****a 2003 1 1 ****b 2001 1 1 ****c 2000 1 1 ****c 2002 1 1 ****c 2003 1 1 ****c 2004 1 1 ****d 2002 2 1 ****e 2002 2 1 ****e 2003 2 1 ****e 2004 2 1 ****;**

** **

**proc sql;****create table disc as****select t1.disclose, t1.industry, t1.firm, t1.year, **** mean(t2.disclose) as meanCompetitorDisclosure****from test as t1 left join test as t2**** on t1.year=t2.year and t1.industry=t2.industry and t1.firm ne t2.firm****group by t1.disclose, t1.industry, t1.firm, t1.year;**

**select * from disc;**

**quit;**

** **

**proc ttest data=disc;****class disclose;****var meanCompetitorDisclosure;****run;**

PG

PG

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-01-2013 11:25 AM

It sounds like splitting the data is too drastic a step. For example, the mean DISCLOSE for any subset of ONE observations has to be 0. And the mean DISCLOSE for any subset of TWO observations has to be 1. Maybe a better starting point would be to create two summaries: which industry codes have any DISCLOSE=0 observations, and which industry codes have any DISCLOSE=1 observations. It's a little unclear to me whether that should be at the industry code level or the industry code + year level. At any rate, if you were to take your sample data and show what numbers you would like to end up with, that would help.

Good luck.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-02-2013 01:10 AM

Hi LanMin,

Is this what you are trying to achieve?

PROC MEANS DATA=HAVE NONOBS NOPRINT NWAY MISSING;

CLASS INDUSTRY;

VAR DISCLOSE;

OUTPUT OUT=WANT (DROP =_ MEAN=;

RUN;

Regards,

Scott

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-02-2013 10:56 AM

thanks to both of you.

Scott, your code is not what I want exactly.

i paste my question again here, and give an example (see ********** in the end):

My unit of analysis is firm-year. main interest is variable disclose (binary 0,1).

First, I will partition the sample into two sets : set ONE contains firm-year observations that have disclose=0, and set TWO contains firm-year observations that have disclose=1,

What I want to know is : for set ONE observations that share the same industry code with set ONE observations, what is the mean value of disclose,

Similarly, for set TWO observations that share the same industry code with set ONE observations, what is the mean value of disclose,

that is, I try to show whether industry peers in set ONE are more or less likely to disclose compared with industry peers in set TWO.

***************

example:

My original data would be partition into

disclose =0

firm | year | industry | disclose |

a | 2000 | 1 | 0 |

a | 2001 | 1 | 0 |

b | 2000 | 1 | 0 |

c | 2001 | 1 | 0 |

d | 2001 | 2 | 0 |

d | 2002 | 2 | 0 |

e | 2001 | 2 | 0 |

disclose =1

firm | year | industry | disclose |

a | 2002 | 1 | 1 |

a | 2003 | 1 | 1 |

b | 2001 | 1 | 1 |

c | 2000 | 1 | 1 |

c | 2002 | 1 | 1 |

c | 2003 | 1 | 1 |

c | 2004 | 1 | 1 |

d | 2002 | 2 | 1 |

e | 2002 | 2 | 1 |

e | 2003 | 2 | 1 |

e | 2004 | 2 | 1 |

for firm a in year 2000, industry=1, it has 2 competitors in year 2000, b and c, b's disclose=0 and c's disclose=1, so what I want is to output the average disclose value of these two competitors as 0.5.

what I want as final results is a table with mean values in 2 cells below,

disclose =1 | disclose =0 | |

peers average diclose value |

then do a t-test to see if peers are more or less likely to disclose in each subsample (i.e. disclose =1 vs. disclose =0 subset)..

thanks,

Lan

Solution

09-02-2013
10:24 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-02-2013 10:24 PM

From the little I understand, I looks like you want to do this:

**data test;****input firm $ year industry disclose;****datalines; ****a 2000 1 0 ****a 2001 1 0 ****b 2000 1 0 ****c 2001 1 0 ****d 2001 2 0 ****d 2002 2 0 ****e 2001 2 0 ****a 2002 1 1 ****a 2003 1 1 ****b 2001 1 1 ****c 2000 1 1 ****c 2002 1 1 ****c 2003 1 1 ****c 2004 1 1 ****d 2002 2 1 ****e 2002 2 1 ****e 2003 2 1 ****e 2004 2 1 ****;**

** **

**proc sql;****create table disc as****select t1.disclose, t1.industry, t1.firm, t1.year, **** mean(t2.disclose) as meanCompetitorDisclosure****from test as t1 left join test as t2**** on t1.year=t2.year and t1.industry=t2.industry and t1.firm ne t2.firm****group by t1.disclose, t1.industry, t1.firm, t1.year;**

**select * from disc;**

**quit;**

** **

**proc ttest data=disc;****class disclose;****var meanCompetitorDisclosure;****run;**

PG

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-04-2013 09:12 AM

thanks so much PG !!