BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
shalmali
Calcite | Level 5

Hello All,

I am trying to create four subsets from my current dataset. I am not good with SAS codes and thus am not able to construct them.

My main dataset looks as follows:

FIRM     YEAR      NAME   DUMMY    CHANGE

A             2004       Mr. X     0              0

A             2005       Mr. X     0              0

A             2006       Mr. Y     0              1

A             2007       Mr. Y     0              0

A             2008       Mr. Y     0              0

B             2004       Mr. S     0              0

B             2005       Mr. S     0              0

B             2006       Mr. T     1              1

B             2007       Mr. T     1              0

B             2008       Mr. T     1              0

C             2004       Mr. M   1              0

C             2005       Mr. M   1              0

C             2006       Mr. M   1              0

C             2007       Mr. N    1              1

C             2008       Mr. N    1              0

D             2004       Mr. R     1              0

D             2005       Mr. R     1              0

D             2006       Mr. V     0              1

D             2007       Mr. V     0              0

D             2008       Mr. V     0              0

Name is the name of the CEO in a firm. Dummy is 1 if the CEO meets some condition and 0 otherwise. CHANGE is 1 when there is change in the CEO. As you may see, there are some firms like A with DUMMY to be 0 for all observations, some firms like C with DUMMY to be 1 for all observations, some firms like B with DUMMY to be 0 for years before the CEO change and 1 for new CEO firm years, and some firms like D with DUMMY to be 1 for years before the CEO change and 0 for the new CEO firm year.

I am trying to create four new subsets which will be as follows:

Subset 1 will have only those firms whose DUMMY was 0 for all years. That is

FIRM     YEAR      NAME   DUMMY    CHANGE

A             2004       Mr. X     0              0

A             2005       Mr. X     0              0

A             2006       Mr. Y     0              1

A             2007       Mr. Y     0              0

A             2008       Mr. Y     0              0

,

Subset 2 will have only those firms whose DUMMY was 1 for all the years. That is

FIRM     YEAR      NAME   DUMMY    CHANGE

C             2004       Mr. M   1              0

C             2005       Mr. M   1              0

C             2006       Mr. M   1              0

C             2007       Mr. N    1              1

C             2008       Mr. N    1              0

.

Subset 3 will have firms whose DUMMY was 0 before the CEO change and 1 after the CEO change. That is,

FIRM     YEAR      NAME   DUMMY    CHANGE

B             2004       Mr. S     0              0

B             2005       Mr. S     0              0

B             2006       Mr. T     1              1

B             2007       Mr. T     1              0

B             2008       Mr. T     1              0

Subset 4 will have firms whose DUMMY was 1 before the CEO change and 0 after the CEO change. That is,

FIRM     YEAR      NAME   DUMMY    CHANGE

D             2004       Mr. R     1              0

D             2005       Mr. R     1              0

D             2006       Mr. V     0              1

D             2007       Mr. V     0              0

D             2008       Mr. V     0              0

I would really appreciate if someone share a code with me.

Thank you,

SS

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Have done the first three.  You can work out the fourth from the code, just need to mix up counts on the where - have to leave now.

data have;
attrib FIRM format=$1.  YEAR format=best.   NAME format=$200.   DUMMY    CHANGE format=best.;
infile datalines delimiter=',';
input firm $ year name $ dummy change;
datalines;
A,2004,Mr.X,0,0
A,2005,Mr.X,0,0
A,2006,Mr.Y,0,1
A,2007,Mr.Y,0,0
A,2008,Mr.Y,0,0
B,2004,Mr.S,0,0
B,2005,Mr.S,0,0
B,2006,Mr.T,1,1
B,2007,Mr.T,1,0
B,2008,Mr.T,1,0
C,2004,Mr.M,1,0
C,2005,Mr.M,1,0
C,2006,Mr.M,1,0
C,2007,Mr.N,1,1
C,2008,Mr.N,1,0
D,2004,Mr.R,1,0
D,2005,Mr.R,1,0
D,2006,Mr.V,0,1
D,2007,Mr.V,0,0
D,2008,Mr.V,0,0
;
run;

proc sql;
  create table subset1 as
  select  *
  from    have
  group by firm
  having   sum(dummy)=0;
  create table subset2 as
  select  *
  from    have
  group by firm
  having   sum(dummy)=5;
quit;

data tmp;
  set have;
  length flag $4;
  retain flag;
  by firm;
  if first.firm then flag='pre';
  if change=1 then flag='on';
  if change=0 and flag='on' then flag='post';
run;

proc sql;
  create table subset3 as
  select  *
  from    have
  where   firm in (select distinct firm from (select * from tmp where flag="pre") group by firm having sum(dummy)=0)
     and  firm in (select distinct firm from (select * from tmp where flag="post") group by firm having sum(dummy)>0);
quit;

View solution in original post

2 REPLIES 2
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Have done the first three.  You can work out the fourth from the code, just need to mix up counts on the where - have to leave now.

data have;
attrib FIRM format=$1.  YEAR format=best.   NAME format=$200.   DUMMY    CHANGE format=best.;
infile datalines delimiter=',';
input firm $ year name $ dummy change;
datalines;
A,2004,Mr.X,0,0
A,2005,Mr.X,0,0
A,2006,Mr.Y,0,1
A,2007,Mr.Y,0,0
A,2008,Mr.Y,0,0
B,2004,Mr.S,0,0
B,2005,Mr.S,0,0
B,2006,Mr.T,1,1
B,2007,Mr.T,1,0
B,2008,Mr.T,1,0
C,2004,Mr.M,1,0
C,2005,Mr.M,1,0
C,2006,Mr.M,1,0
C,2007,Mr.N,1,1
C,2008,Mr.N,1,0
D,2004,Mr.R,1,0
D,2005,Mr.R,1,0
D,2006,Mr.V,0,1
D,2007,Mr.V,0,0
D,2008,Mr.V,0,0
;
run;

proc sql;
  create table subset1 as
  select  *
  from    have
  group by firm
  having   sum(dummy)=0;
  create table subset2 as
  select  *
  from    have
  group by firm
  having   sum(dummy)=5;
quit;

data tmp;
  set have;
  length flag $4;
  retain flag;
  by firm;
  if first.firm then flag='pre';
  if change=1 then flag='on';
  if change=0 and flag='on' then flag='post';
run;

proc sql;
  create table subset3 as
  select  *
  from    have
  where   firm in (select distinct firm from (select * from tmp where flag="pre") group by firm having sum(dummy)=0)
     and  firm in (select distinct firm from (select * from tmp where flag="post") group by firm having sum(dummy)>0);
quit;

shalmali
Calcite | Level 5

Thanks a lot.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 714 views
  • 0 likes
  • 2 in conversation