Hello All,
I am trying to create four subsets from my current dataset. I am not good with SAS codes and thus am not able to construct them.
My main dataset looks as follows:
FIRM YEAR NAME DUMMY CHANGE
A 2004 Mr. X 0 0
A 2005 Mr. X 0 0
A 2006 Mr. Y 0 1
A 2007 Mr. Y 0 0
A 2008 Mr. Y 0 0
B 2004 Mr. S 0 0
B 2005 Mr. S 0 0
B 2006 Mr. T 1 1
B 2007 Mr. T 1 0
B 2008 Mr. T 1 0
C 2004 Mr. M 1 0
C 2005 Mr. M 1 0
C 2006 Mr. M 1 0
C 2007 Mr. N 1 1
C 2008 Mr. N 1 0
D 2004 Mr. R 1 0
D 2005 Mr. R 1 0
D 2006 Mr. V 0 1
D 2007 Mr. V 0 0
D 2008 Mr. V 0 0
Name is the name of the CEO in a firm. Dummy is 1 if the CEO meets some condition and 0 otherwise. CHANGE is 1 when there is change in the CEO. As you may see, there are some firms like A with DUMMY to be 0 for all observations, some firms like C with DUMMY to be 1 for all observations, some firms like B with DUMMY to be 0 for years before the CEO change and 1 for new CEO firm years, and some firms like D with DUMMY to be 1 for years before the CEO change and 0 for the new CEO firm year.
I am trying to create four new subsets which will be as follows:
Subset 1 will have only those firms whose DUMMY was 0 for all years. That is
FIRM YEAR NAME DUMMY CHANGE
A 2004 Mr. X 0 0
A 2005 Mr. X 0 0
A 2006 Mr. Y 0 1
A 2007 Mr. Y 0 0
A 2008 Mr. Y 0 0
,
Subset 2 will have only those firms whose DUMMY was 1 for all the years. That is
FIRM YEAR NAME DUMMY CHANGE
C 2004 Mr. M 1 0
C 2005 Mr. M 1 0
C 2006 Mr. M 1 0
C 2007 Mr. N 1 1
C 2008 Mr. N 1 0
.
Subset 3 will have firms whose DUMMY was 0 before the CEO change and 1 after the CEO change. That is,
FIRM YEAR NAME DUMMY CHANGE
B 2004 Mr. S 0 0
B 2005 Mr. S 0 0
B 2006 Mr. T 1 1
B 2007 Mr. T 1 0
B 2008 Mr. T 1 0
Subset 4 will have firms whose DUMMY was 1 before the CEO change and 0 after the CEO change. That is,
FIRM YEAR NAME DUMMY CHANGE
D 2004 Mr. R 1 0
D 2005 Mr. R 1 0
D 2006 Mr. V 0 1
D 2007 Mr. V 0 0
D 2008 Mr. V 0 0
I would really appreciate if someone share a code with me.
Thank you,
SS
Hi,
Have done the first three. You can work out the fourth from the code, just need to mix up counts on the where - have to leave now.
data have;
attrib FIRM format=$1. YEAR format=best. NAME format=$200. DUMMY CHANGE format=best.;
infile datalines delimiter=',';
input firm $ year name $ dummy change;
datalines;
A,2004,Mr.X,0,0
A,2005,Mr.X,0,0
A,2006,Mr.Y,0,1
A,2007,Mr.Y,0,0
A,2008,Mr.Y,0,0
B,2004,Mr.S,0,0
B,2005,Mr.S,0,0
B,2006,Mr.T,1,1
B,2007,Mr.T,1,0
B,2008,Mr.T,1,0
C,2004,Mr.M,1,0
C,2005,Mr.M,1,0
C,2006,Mr.M,1,0
C,2007,Mr.N,1,1
C,2008,Mr.N,1,0
D,2004,Mr.R,1,0
D,2005,Mr.R,1,0
D,2006,Mr.V,0,1
D,2007,Mr.V,0,0
D,2008,Mr.V,0,0
;
run;
proc sql;
create table subset1 as
select *
from have
group by firm
having sum(dummy)=0;
create table subset2 as
select *
from have
group by firm
having sum(dummy)=5;
quit;
data tmp;
set have;
length flag $4;
retain flag;
by firm;
if first.firm then flag='pre';
if change=1 then flag='on';
if change=0 and flag='on' then flag='post';
run;
proc sql;
create table subset3 as
select *
from have
where firm in (select distinct firm from (select * from tmp where flag="pre") group by firm having sum(dummy)=0)
and firm in (select distinct firm from (select * from tmp where flag="post") group by firm having sum(dummy)>0);
quit;
Hi,
Have done the first three. You can work out the fourth from the code, just need to mix up counts on the where - have to leave now.
data have;
attrib FIRM format=$1. YEAR format=best. NAME format=$200. DUMMY CHANGE format=best.;
infile datalines delimiter=',';
input firm $ year name $ dummy change;
datalines;
A,2004,Mr.X,0,0
A,2005,Mr.X,0,0
A,2006,Mr.Y,0,1
A,2007,Mr.Y,0,0
A,2008,Mr.Y,0,0
B,2004,Mr.S,0,0
B,2005,Mr.S,0,0
B,2006,Mr.T,1,1
B,2007,Mr.T,1,0
B,2008,Mr.T,1,0
C,2004,Mr.M,1,0
C,2005,Mr.M,1,0
C,2006,Mr.M,1,0
C,2007,Mr.N,1,1
C,2008,Mr.N,1,0
D,2004,Mr.R,1,0
D,2005,Mr.R,1,0
D,2006,Mr.V,0,1
D,2007,Mr.V,0,0
D,2008,Mr.V,0,0
;
run;
proc sql;
create table subset1 as
select *
from have
group by firm
having sum(dummy)=0;
create table subset2 as
select *
from have
group by firm
having sum(dummy)=5;
quit;
data tmp;
set have;
length flag $4;
retain flag;
by firm;
if first.firm then flag='pre';
if change=1 then flag='on';
if change=0 and flag='on' then flag='post';
run;
proc sql;
create table subset3 as
select *
from have
where firm in (select distinct firm from (select * from tmp where flag="pre") group by firm having sum(dummy)=0)
and firm in (select distinct firm from (select * from tmp where flag="post") group by firm having sum(dummy)>0);
quit;
Thanks a lot.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.