I have a dataset of firm-year observations(sorted by CUSIP).I need to delete firms that have less than 2 year of observations.
The code I used to enumerate the number of years each firm had data for was(i hope this code is right since it is giving me the desired output):
data work.merge2;
set work.merge1;
by cusp;
count+1;
if first.cusp then count=1;
run;
So what I have now is something like this
Output:
cusp count
1 1
1 2
1 3
2 1
2 2
3 1
Now I need to delete all those firms/cusps(id) that have less than 2 counts.For instance,here I need to delete firm with cusp 3.
Any suggestions please?
For a quick and dirty approach,
data have;
input cusp count ;
cards;
1 1
1 2
1 3
2 1
2 2
3 1
;
proc sql;
create table want as
select * from have
group by cusp
having max(count)>1
order by cusp, count
;
quit;
proc print; run;
There can be many other data step approaches if you are interested.
Regards,
Haikuo
Update: Since your criteria is 1, there actually is a quickdirty datastep approach, without involving DOWs or Hash:
data want;
set have;
by cusp;
if first.cusp*last.cusp ne 1;
run;
BTW, your code is correct. However, a minor modification will make it a lot more readable and appearing clear flow of logic:
data work.merge2;
set work.merge1;
by cusp;
if first.cusp then count=1;
eles count+1;
run;
For a quick and dirty approach,
data have;
input cusp count ;
cards;
1 1
1 2
1 3
2 1
2 2
3 1
;
proc sql;
create table want as
select * from have
group by cusp
having max(count)>1
order by cusp, count
;
quit;
proc print; run;
There can be many other data step approaches if you are interested.
Regards,
Haikuo
Update: Since your criteria is 1, there actually is a quickdirty datastep approach, without involving DOWs or Hash:
data want;
set have;
by cusp;
if first.cusp*last.cusp ne 1;
run;
BTW, your code is correct. However, a minor modification will make it a lot more readable and appearing clear flow of logic:
data work.merge2;
set work.merge1;
by cusp;
if first.cusp then count=1;
eles count+1;
run;
Thanks.
I tried the code(update by Haikuo). It worked
Well,I had been trying to do all this(delete firms-year obs with missing values & firms with less than 2 years of time series data) in response to this error message that was popping up for a panel regression:
There is only one cross section or time series observation. Computations will be terminated.
Suggestions again very much welcome!
data have;
input cusp;
cards;
1
1
1
2
2
3
4
4
4
4
;
run;
data want;
set have;
by cusp;
if first.cusp then count=1;else
count+1;
if last.cusp and count>1 then output;
proc print; run;
Obs cusp count
1 1 3
2 2 2
3 4 4
Thanks Linlin.
Well,would you have any suggestion to fix the problem I have mentioned above?I am running panel regression and this error pops up:
There is only one cross section or time series observation. Computations will be terminated.
I tried to remove all missing values from the variables as well as ensure that each firm/group has atleast two time series observations.
Thank you.
Regards
Namrata
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.