Help using Base SAS procedures

create subsets from original data

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 96
Accepted Solution

create subsets from original data

Hello All,

I am trying to create four subsets from my current dataset. I am not good with SAS codes and thus am not able to construct them.

My main dataset looks as follows:

FIRM     YEAR      NAME   DUMMY    CHANGE

A             2004       Mr. X     0              0

A             2005       Mr. X     0              0

A             2006       Mr. Y     0              1

A             2007       Mr. Y     0              0

A             2008       Mr. Y     0              0

B             2004       Mr. S     0              0

B             2005       Mr. S     0              0

B             2006       Mr. T     1              1

B             2007       Mr. T     1              0

B             2008       Mr. T     1              0

C             2004       Mr. M   1              0

C             2005       Mr. M   1              0

C             2006       Mr. M   1              0

C             2007       Mr. N    1              1

C             2008       Mr. N    1              0

D             2004       Mr. R     1              0

D             2005       Mr. R     1              0

D             2006       Mr. V     0              1

D             2007       Mr. V     0              0

D             2008       Mr. V     0              0

Name is the name of the CEO in a firm. Dummy is 1 if the CEO meets some condition and 0 otherwise. CHANGE is 1 when there is change in the CEO. As you may see, there are some firms like A with DUMMY to be 0 for all observations, some firms like C with DUMMY to be 1 for all observations, some firms like B with DUMMY to be 0 for years before the CEO change and 1 for new CEO firm years, and some firms like D with DUMMY to be 1 for years before the CEO change and 0 for the new CEO firm year.

I am trying to create four new subsets which will be as follows:

Subset 1 will have only those firms whose DUMMY was 0 for all years. That is

FIRM     YEAR      NAME   DUMMY    CHANGE

A             2004       Mr. X     0              0

A             2005       Mr. X     0              0

A             2006       Mr. Y     0              1

A             2007       Mr. Y     0              0

A             2008       Mr. Y     0              0

,

Subset 2 will have only those firms whose DUMMY was 1 for all the years. That is

FIRM     YEAR      NAME   DUMMY    CHANGE

C             2004       Mr. M   1              0

C             2005       Mr. M   1              0

C             2006       Mr. M   1              0

C             2007       Mr. N    1              1

C             2008       Mr. N    1              0

.

Subset 3 will have firms whose DUMMY was 0 before the CEO change and 1 after the CEO change. That is,

FIRM     YEAR      NAME   DUMMY    CHANGE

B             2004       Mr. S     0              0

B             2005       Mr. S     0              0

B             2006       Mr. T     1              1

B             2007       Mr. T     1              0

B             2008       Mr. T     1              0

Subset 4 will have firms whose DUMMY was 1 before the CEO change and 0 after the CEO change. That is,

FIRM     YEAR      NAME   DUMMY    CHANGE

D             2004       Mr. R     1              0

D             2005       Mr. R     1              0

D             2006       Mr. V     0              1

D             2007       Mr. V     0              0

D             2008       Mr. V     0              0

I would really appreciate if someone share a code with me.

Thank you,

SS


Accepted Solutions
Solution
‎03-21-2014 11:38 AM
Super User
Super User
Posts: 7,407

Re: create subsets from original data

Hi,

Have done the first three.  You can work out the fourth from the code, just need to mix up counts on the where - have to leave now.

data have;
attrib FIRM format=$1.  YEAR format=best.   NAME format=$200.   DUMMY    CHANGE format=best.;
infile datalines delimiter=',';
input firm $ year name $ dummy change;
datalines;
A,2004,Mr.X,0,0
A,2005,Mr.X,0,0
A,2006,Mr.Y,0,1
A,2007,Mr.Y,0,0
A,2008,Mr.Y,0,0
B,2004,Mr.S,0,0
B,2005,Mr.S,0,0
B,2006,Mr.T,1,1
B,2007,Mr.T,1,0
B,2008,Mr.T,1,0
C,2004,Mr.M,1,0
C,2005,Mr.M,1,0
C,2006,Mr.M,1,0
C,2007,Mr.N,1,1
C,2008,Mr.N,1,0
D,2004,Mr.R,1,0
D,2005,Mr.R,1,0
D,2006,Mr.V,0,1
D,2007,Mr.V,0,0
D,2008,Mr.V,0,0
;
run;

proc sql;
  create table subset1 as
  select  *
  from    have
  group by firm
  having   sum(dummy)=0;
  create table subset2 as
  select  *
  from    have
  group by firm
  having   sum(dummy)=5;
quit;

data tmp;
  set have;
  length flag $4;
  retain flag;
  by firm;
  if first.firm then flag='pre';
  if change=1 then flag='on';
  if change=0 and flag='on' then flag='post';
run;

proc sql;
  create table subset3 as
  select  *
  from    have
  where   firm in (select distinct firm from (select * from tmp where flag="pre") group by firm having sum(dummy)=0)
     and  firm in (select distinct firm from (select * from tmp where flag="post") group by firm having sum(dummy)>0);
quit;

View solution in original post


All Replies
Solution
‎03-21-2014 11:38 AM
Super User
Super User
Posts: 7,407

Re: create subsets from original data

Hi,

Have done the first three.  You can work out the fourth from the code, just need to mix up counts on the where - have to leave now.

data have;
attrib FIRM format=$1.  YEAR format=best.   NAME format=$200.   DUMMY    CHANGE format=best.;
infile datalines delimiter=',';
input firm $ year name $ dummy change;
datalines;
A,2004,Mr.X,0,0
A,2005,Mr.X,0,0
A,2006,Mr.Y,0,1
A,2007,Mr.Y,0,0
A,2008,Mr.Y,0,0
B,2004,Mr.S,0,0
B,2005,Mr.S,0,0
B,2006,Mr.T,1,1
B,2007,Mr.T,1,0
B,2008,Mr.T,1,0
C,2004,Mr.M,1,0
C,2005,Mr.M,1,0
C,2006,Mr.M,1,0
C,2007,Mr.N,1,1
C,2008,Mr.N,1,0
D,2004,Mr.R,1,0
D,2005,Mr.R,1,0
D,2006,Mr.V,0,1
D,2007,Mr.V,0,0
D,2008,Mr.V,0,0
;
run;

proc sql;
  create table subset1 as
  select  *
  from    have
  group by firm
  having   sum(dummy)=0;
  create table subset2 as
  select  *
  from    have
  group by firm
  having   sum(dummy)=5;
quit;

data tmp;
  set have;
  length flag $4;
  retain flag;
  by firm;
  if first.firm then flag='pre';
  if change=1 then flag='on';
  if change=0 and flag='on' then flag='post';
run;

proc sql;
  create table subset3 as
  select  *
  from    have
  where   firm in (select distinct firm from (select * from tmp where flag="pre") group by firm having sum(dummy)=0)
     and  firm in (select distinct firm from (select * from tmp where flag="post") group by firm having sum(dummy)>0);
quit;

Frequent Contributor
Posts: 96

Re: create subsets from original data

Thanks a lot.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 190 views
  • 0 likes
  • 2 in conversation