BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
shalmali
Calcite | Level 5

Hello All,

I am trying to create four subsets from my current dataset. I am not good with SAS codes and thus am not able to construct them.

My main dataset looks as follows:

FIRM     YEAR      NAME   DUMMY    CHANGE

A             2004       Mr. X     0              0

A             2005       Mr. X     0              0

A             2006       Mr. Y     0              1

A             2007       Mr. Y     0              0

A             2008       Mr. Y     0              0

B             2004       Mr. S     0              0

B             2005       Mr. S     0              0

B             2006       Mr. T     1              1

B             2007       Mr. T     1              0

B             2008       Mr. T     1              0

C             2004       Mr. M   1              0

C             2005       Mr. M   1              0

C             2006       Mr. M   1              0

C             2007       Mr. N    1              1

C             2008       Mr. N    1              0

D             2004       Mr. R     1              0

D             2005       Mr. R     1              0

D             2006       Mr. V     0              1

D             2007       Mr. V     0              0

D             2008       Mr. V     0              0

Name is the name of the CEO in a firm. Dummy is 1 if the CEO meets some condition and 0 otherwise. CHANGE is 1 when there is change in the CEO. As you may see, there are some firms like A with DUMMY to be 0 for all observations, some firms like C with DUMMY to be 1 for all observations, some firms like B with DUMMY to be 0 for years before the CEO change and 1 for new CEO firm years, and some firms like D with DUMMY to be 1 for years before the CEO change and 0 for the new CEO firm year.

I am trying to create four new subsets which will be as follows:

Subset 1 will have only those firms whose DUMMY was 0 for all years. That is

FIRM     YEAR      NAME   DUMMY    CHANGE

A             2004       Mr. X     0              0

A             2005       Mr. X     0              0

A             2006       Mr. Y     0              1

A             2007       Mr. Y     0              0

A             2008       Mr. Y     0              0

,

Subset 2 will have only those firms whose DUMMY was 1 for all the years. That is

FIRM     YEAR      NAME   DUMMY    CHANGE

C             2004       Mr. M   1              0

C             2005       Mr. M   1              0

C             2006       Mr. M   1              0

C             2007       Mr. N    1              1

C             2008       Mr. N    1              0

.

Subset 3 will have firms whose DUMMY was 0 before the CEO change and 1 after the CEO change. That is,

FIRM     YEAR      NAME   DUMMY    CHANGE

B             2004       Mr. S     0              0

B             2005       Mr. S     0              0

B             2006       Mr. T     1              1

B             2007       Mr. T     1              0

B             2008       Mr. T     1              0

Subset 4 will have firms whose DUMMY was 1 before the CEO change and 0 after the CEO change. That is,

FIRM     YEAR      NAME   DUMMY    CHANGE

D             2004       Mr. R     1              0

D             2005       Mr. R     1              0

D             2006       Mr. V     0              1

D             2007       Mr. V     0              0

D             2008       Mr. V     0              0

I would really appreciate if someone share a code with me.

Thank you,

SS

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Have done the first three.  You can work out the fourth from the code, just need to mix up counts on the where - have to leave now.

data have;
attrib FIRM format=$1.  YEAR format=best.   NAME format=$200.   DUMMY    CHANGE format=best.;
infile datalines delimiter=',';
input firm $ year name $ dummy change;
datalines;
A,2004,Mr.X,0,0
A,2005,Mr.X,0,0
A,2006,Mr.Y,0,1
A,2007,Mr.Y,0,0
A,2008,Mr.Y,0,0
B,2004,Mr.S,0,0
B,2005,Mr.S,0,0
B,2006,Mr.T,1,1
B,2007,Mr.T,1,0
B,2008,Mr.T,1,0
C,2004,Mr.M,1,0
C,2005,Mr.M,1,0
C,2006,Mr.M,1,0
C,2007,Mr.N,1,1
C,2008,Mr.N,1,0
D,2004,Mr.R,1,0
D,2005,Mr.R,1,0
D,2006,Mr.V,0,1
D,2007,Mr.V,0,0
D,2008,Mr.V,0,0
;
run;

proc sql;
  create table subset1 as
  select  *
  from    have
  group by firm
  having   sum(dummy)=0;
  create table subset2 as
  select  *
  from    have
  group by firm
  having   sum(dummy)=5;
quit;

data tmp;
  set have;
  length flag $4;
  retain flag;
  by firm;
  if first.firm then flag='pre';
  if change=1 then flag='on';
  if change=0 and flag='on' then flag='post';
run;

proc sql;
  create table subset3 as
  select  *
  from    have
  where   firm in (select distinct firm from (select * from tmp where flag="pre") group by firm having sum(dummy)=0)
     and  firm in (select distinct firm from (select * from tmp where flag="post") group by firm having sum(dummy)>0);
quit;

View solution in original post

2 REPLIES 2
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Have done the first three.  You can work out the fourth from the code, just need to mix up counts on the where - have to leave now.

data have;
attrib FIRM format=$1.  YEAR format=best.   NAME format=$200.   DUMMY    CHANGE format=best.;
infile datalines delimiter=',';
input firm $ year name $ dummy change;
datalines;
A,2004,Mr.X,0,0
A,2005,Mr.X,0,0
A,2006,Mr.Y,0,1
A,2007,Mr.Y,0,0
A,2008,Mr.Y,0,0
B,2004,Mr.S,0,0
B,2005,Mr.S,0,0
B,2006,Mr.T,1,1
B,2007,Mr.T,1,0
B,2008,Mr.T,1,0
C,2004,Mr.M,1,0
C,2005,Mr.M,1,0
C,2006,Mr.M,1,0
C,2007,Mr.N,1,1
C,2008,Mr.N,1,0
D,2004,Mr.R,1,0
D,2005,Mr.R,1,0
D,2006,Mr.V,0,1
D,2007,Mr.V,0,0
D,2008,Mr.V,0,0
;
run;

proc sql;
  create table subset1 as
  select  *
  from    have
  group by firm
  having   sum(dummy)=0;
  create table subset2 as
  select  *
  from    have
  group by firm
  having   sum(dummy)=5;
quit;

data tmp;
  set have;
  length flag $4;
  retain flag;
  by firm;
  if first.firm then flag='pre';
  if change=1 then flag='on';
  if change=0 and flag='on' then flag='post';
run;

proc sql;
  create table subset3 as
  select  *
  from    have
  where   firm in (select distinct firm from (select * from tmp where flag="pre") group by firm having sum(dummy)=0)
     and  firm in (select distinct firm from (select * from tmp where flag="post") group by firm having sum(dummy)>0);
quit;

shalmali
Calcite | Level 5

Thanks a lot.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1202 views
  • 0 likes
  • 2 in conversation