I have a dataset of firm-year observations(sorted by CUSIP).I need to delete firms that have less than 2 year of observations.
The code I used to enumerate the number of years each firm had data for was(i hope this code is right since it is giving me the desired output):
data work.merge2;
set work.merge1;
by cusp;
count+1;
if first.cusp then count=1;
run;
So what I have now is something like this
Output:
cusp count
1 1
1 2
1 3
2 1
2 2
3 1
Now I need to delete all those firms/cusps(id) that have less than 2 counts.For instance,here I need to delete firm with cusp 3.
Any suggestions please?
For a quick and dirty approach,
data have;
input cusp count ;
cards;
1 1
1 2
1 3
2 1
2 2
3 1
;
proc sql;
create table want as
select * from have
group by cusp
having max(count)>1
order by cusp, count
;
quit;
proc print; run;
There can be many other data step approaches if you are interested.
Regards,
Haikuo
Update: Since your criteria is 1, there actually is a quickdirty datastep approach, without involving DOWs or Hash:
data want;
set have;
by cusp;
if first.cusp*last.cusp ne 1;
run;
BTW, your code is correct. However, a minor modification will make it a lot more readable and appearing clear flow of logic:
data work.merge2;
set work.merge1;
by cusp;
if first.cusp then count=1;
eles count+1;
run;
For a quick and dirty approach,
data have;
input cusp count ;
cards;
1 1
1 2
1 3
2 1
2 2
3 1
;
proc sql;
create table want as
select * from have
group by cusp
having max(count)>1
order by cusp, count
;
quit;
proc print; run;
There can be many other data step approaches if you are interested.
Regards,
Haikuo
Update: Since your criteria is 1, there actually is a quickdirty datastep approach, without involving DOWs or Hash:
data want;
set have;
by cusp;
if first.cusp*last.cusp ne 1;
run;
BTW, your code is correct. However, a minor modification will make it a lot more readable and appearing clear flow of logic:
data work.merge2;
set work.merge1;
by cusp;
if first.cusp then count=1;
eles count+1;
run;
Thanks.
I tried the code(update by Haikuo). It worked
Well,I had been trying to do all this(delete firms-year obs with missing values & firms with less than 2 years of time series data) in response to this error message that was popping up for a panel regression:
There is only one cross section or time series observation. Computations will be terminated.
Suggestions again very much welcome!
data have;
input cusp;
cards;
1
1
1
2
2
3
4
4
4
4
;
run;
data want;
set have;
by cusp;
if first.cusp then count=1;else
count+1;
if last.cusp and count>1 then output;
proc print; run;
Obs cusp count
1 1 3
2 2 2
3 4 4
Thanks Linlin.
Well,would you have any suggestion to fix the problem I have mentioned above?I am running panel regression and this error pops up:
There is only one cross section or time series observation. Computations will be terminated.
I tried to remove all missing values from the variables as well as ensure that each firm/group has atleast two time series observations.
Thank you.
Regards
Namrata
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.