Hello All,
I am trying to create four subsets from my current dataset. I am not good with SAS codes and thus am not able to construct them.
My main dataset looks as follows:
FIRM YEAR NAME DUMMY CHANGE
A 2004 Mr. X 0 0
A 2005 Mr. X 0 0
A 2006 Mr. Y 0 1
A 2007 Mr. Y 0 0
A 2008 Mr. Y 0 0
B 2004 Mr. S 0 0
B 2005 Mr. S 0 0
B 2006 Mr. T 1 1
B 2007 Mr. T 1 0
B 2008 Mr. T 1 0
C 2004 Mr. M 1 0
C 2005 Mr. M 1 0
C 2006 Mr. M 1 0
C 2007 Mr. N 1 1
C 2008 Mr. N 1 0
D 2004 Mr. R 1 0
D 2005 Mr. R 1 0
D 2006 Mr. V 0 1
D 2007 Mr. V 0 0
D 2008 Mr. V 0 0
Name is the name of the CEO in a firm. Dummy is 1 if the CEO meets some condition and 0 otherwise. CHANGE is 1 when there is change in the CEO. As you may see, there are some firms like A with DUMMY to be 0 for all observations, some firms like C with DUMMY to be 1 for all observations, some firms like B with DUMMY to be 0 for years before the CEO change and 1 for new CEO firm years, and some firms like D with DUMMY to be 1 for years before the CEO change and 0 for the new CEO firm year.
I am trying to create four new subsets which will be as follows:
Subset 1 will have only those firms whose DUMMY was 0 for all years. That is
FIRM YEAR NAME DUMMY CHANGE
A 2004 Mr. X 0 0
A 2005 Mr. X 0 0
A 2006 Mr. Y 0 1
A 2007 Mr. Y 0 0
A 2008 Mr. Y 0 0
,
Subset 2 will have only those firms whose DUMMY was 1 for all the years. That is
FIRM YEAR NAME DUMMY CHANGE
C 2004 Mr. M 1 0
C 2005 Mr. M 1 0
C 2006 Mr. M 1 0
C 2007 Mr. N 1 1
C 2008 Mr. N 1 0
.
Subset 3 will have firms whose DUMMY was 0 before the CEO change and 1 after the CEO change. That is,
FIRM YEAR NAME DUMMY CHANGE
B 2004 Mr. S 0 0
B 2005 Mr. S 0 0
B 2006 Mr. T 1 1
B 2007 Mr. T 1 0
B 2008 Mr. T 1 0
Subset 4 will have firms whose DUMMY was 1 before the CEO change and 0 after the CEO change. That is,
FIRM YEAR NAME DUMMY CHANGE
D 2004 Mr. R 1 0
D 2005 Mr. R 1 0
D 2006 Mr. V 0 1
D 2007 Mr. V 0 0
D 2008 Mr. V 0 0
I would really appreciate if someone share a code with me.
Thank you,
SS
Hi,
Have done the first three. You can work out the fourth from the code, just need to mix up counts on the where - have to leave now.
data have;
attrib FIRM format=$1. YEAR format=best. NAME format=$200. DUMMY CHANGE format=best.;
infile datalines delimiter=',';
input firm $ year name $ dummy change;
datalines;
A,2004,Mr.X,0,0
A,2005,Mr.X,0,0
A,2006,Mr.Y,0,1
A,2007,Mr.Y,0,0
A,2008,Mr.Y,0,0
B,2004,Mr.S,0,0
B,2005,Mr.S,0,0
B,2006,Mr.T,1,1
B,2007,Mr.T,1,0
B,2008,Mr.T,1,0
C,2004,Mr.M,1,0
C,2005,Mr.M,1,0
C,2006,Mr.M,1,0
C,2007,Mr.N,1,1
C,2008,Mr.N,1,0
D,2004,Mr.R,1,0
D,2005,Mr.R,1,0
D,2006,Mr.V,0,1
D,2007,Mr.V,0,0
D,2008,Mr.V,0,0
;
run;
proc sql;
create table subset1 as
select *
from have
group by firm
having sum(dummy)=0;
create table subset2 as
select *
from have
group by firm
having sum(dummy)=5;
quit;
data tmp;
set have;
length flag $4;
retain flag;
by firm;
if first.firm then flag='pre';
if change=1 then flag='on';
if change=0 and flag='on' then flag='post';
run;
proc sql;
create table subset3 as
select *
from have
where firm in (select distinct firm from (select * from tmp where flag="pre") group by firm having sum(dummy)=0)
and firm in (select distinct firm from (select * from tmp where flag="post") group by firm having sum(dummy)>0);
quit;
Hi,
Have done the first three. You can work out the fourth from the code, just need to mix up counts on the where - have to leave now.
data have;
attrib FIRM format=$1. YEAR format=best. NAME format=$200. DUMMY CHANGE format=best.;
infile datalines delimiter=',';
input firm $ year name $ dummy change;
datalines;
A,2004,Mr.X,0,0
A,2005,Mr.X,0,0
A,2006,Mr.Y,0,1
A,2007,Mr.Y,0,0
A,2008,Mr.Y,0,0
B,2004,Mr.S,0,0
B,2005,Mr.S,0,0
B,2006,Mr.T,1,1
B,2007,Mr.T,1,0
B,2008,Mr.T,1,0
C,2004,Mr.M,1,0
C,2005,Mr.M,1,0
C,2006,Mr.M,1,0
C,2007,Mr.N,1,1
C,2008,Mr.N,1,0
D,2004,Mr.R,1,0
D,2005,Mr.R,1,0
D,2006,Mr.V,0,1
D,2007,Mr.V,0,0
D,2008,Mr.V,0,0
;
run;
proc sql;
create table subset1 as
select *
from have
group by firm
having sum(dummy)=0;
create table subset2 as
select *
from have
group by firm
having sum(dummy)=5;
quit;
data tmp;
set have;
length flag $4;
retain flag;
by firm;
if first.firm then flag='pre';
if change=1 then flag='on';
if change=0 and flag='on' then flag='post';
run;
proc sql;
create table subset3 as
select *
from have
where firm in (select distinct firm from (select * from tmp where flag="pre") group by firm having sum(dummy)=0)
and firm in (select distinct firm from (select * from tmp where flag="post") group by firm having sum(dummy)>0);
quit;
Thanks a lot.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and save with the early bird rate—just $795!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.