I have a dataset of firm-year observations(sorted by CUSIP).I need to delete firms that have less than 2 year of observations.
The code I used to enumerate the number of years each firm had data for was(i hope this code is right since it is giving me the desired output):
data work.merge2;
set work.merge1;
by cusp;
count+1;
if first.cusp then count=1;
run;
So what I have now is something like this
Output:
cusp count
1 1
1 2
1 3
2 1
2 2
3 1
Now I need to delete all those firms/cusps(id) that have less than 2 counts.For instance,here I need to delete firm with cusp 3.
Any suggestions please?
For a quick and dirty approach,
data have;
input cusp count ;
cards;
1 1
1 2
1 3
2 1
2 2
3 1
;
proc sql;
create table want as
select * from have
group by cusp
having max(count)>1
order by cusp, count
;
quit;
proc print; run;
There can be many other data step approaches if you are interested.
Regards,
Haikuo
Update: Since your criteria is 1, there actually is a quickdirty datastep approach, without involving DOWs or Hash:
data want;
set have;
by cusp;
if first.cusp*last.cusp ne 1;
run;
BTW, your code is correct. However, a minor modification will make it a lot more readable and appearing clear flow of logic:
data work.merge2;
set work.merge1;
by cusp;
if first.cusp then count=1;
eles count+1;
run;
For a quick and dirty approach,
data have;
input cusp count ;
cards;
1 1
1 2
1 3
2 1
2 2
3 1
;
proc sql;
create table want as
select * from have
group by cusp
having max(count)>1
order by cusp, count
;
quit;
proc print; run;
There can be many other data step approaches if you are interested.
Regards,
Haikuo
Update: Since your criteria is 1, there actually is a quickdirty datastep approach, without involving DOWs or Hash:
data want;
set have;
by cusp;
if first.cusp*last.cusp ne 1;
run;
BTW, your code is correct. However, a minor modification will make it a lot more readable and appearing clear flow of logic:
data work.merge2;
set work.merge1;
by cusp;
if first.cusp then count=1;
eles count+1;
run;
Thanks.
I tried the code(update by Haikuo). It worked
Well,I had been trying to do all this(delete firms-year obs with missing values & firms with less than 2 years of time series data) in response to this error message that was popping up for a panel regression:
There is only one cross section or time series observation. Computations will be terminated.
Suggestions again very much welcome!
data have;
input cusp;
cards;
1
1
1
2
2
3
4
4
4
4
;
run;
data want;
set have;
by cusp;
if first.cusp then count=1;else
count+1;
if last.cusp and count>1 then output;
proc print; run;
Obs cusp count
1 1 3
2 2 2
3 4 4
Thanks Linlin.
Well,would you have any suggestion to fix the problem I have mentioned above?I am running panel regression and this error pops up:
There is only one cross section or time series observation. Computations will be terminated.
I tried to remove all missing values from the variables as well as ensure that each firm/group has atleast two time series observations.
Thank you.
Regards
Namrata
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.