Hello,
I have an unbalanced panel data set where my data is from 1997 to 2012 (denoted by year). However, some firms (denoted by cusip) does not exist for all years. Therefore I want to drop those firms. I want to keep only firms where I have data from 1997 to 2012. Any suggestions on how to do that? I would really appreciate any help.
Thanks!
How many such customers do you expect?
Something like this (untested as no test data was provided)?
proc sql;
create table DROPCUS as
select CUSIP
from TABLE
group by CUSIP
having count(unique YEAR) < 16 ;
How many such customers do you expect?
Something like this (untested as no test data was provided)?
proc sql;
create table DROPCUS as
select CUSIP
from TABLE
group by CUSIP
having count(unique YEAR) < 16 ;
It would be helpful if you could provide some sample data for these questions.
Now, in the following code, 16 rows from 1997 to 2012 are extracted as the key.
/* create test data */
data have;
length cusip $10 year 8;
do cusip='A','C';
do year=1997 to 2012;
output;
end;
end;
cusip='B';
do year=1997 to 2000;
output;
end;
do cusip='D';
do year=1998 to 2013;
output;
end;
end;
run;
/* extract */
proc sort data=have nodupkey;
by cusip year;
where 1997<=year<=2012;
run;
proc freq data=have noprint;
table cusip / out=freq(where=(COUNT=16));
run;
proc sql;
create table want as
select have.* from have, freq
where have.cusip=freq.cusip
;
quit;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.