Hello,
I have an unbalanced panel data set where my data is from 1997 to 2012 (denoted by year). However, some firms (denoted by cusip) does not exist for all years. Therefore I want to drop those firms. I want to keep only firms where I have data from 1997 to 2012. Any suggestions on how to do that? I would really appreciate any help.
Thanks!
How many such customers do you expect?
Something like this (untested as no test data was provided)?
proc sql;
create table DROPCUS as
select CUSIP
from TABLE
group by CUSIP
having count(unique YEAR) < 16 ;
How many such customers do you expect?
Something like this (untested as no test data was provided)?
proc sql;
create table DROPCUS as
select CUSIP
from TABLE
group by CUSIP
having count(unique YEAR) < 16 ;
It would be helpful if you could provide some sample data for these questions.
Now, in the following code, 16 rows from 1997 to 2012 are extracted as the key.
/* create test data */
data have;
length cusip $10 year 8;
do cusip='A','C';
do year=1997 to 2012;
output;
end;
end;
cusip='B';
do year=1997 to 2000;
output;
end;
do cusip='D';
do year=1998 to 2013;
output;
end;
end;
run;
/* extract */
proc sort data=have nodupkey;
by cusip year;
where 1997<=year<=2012;
run;
proc freq data=have noprint;
table cusip / out=freq(where=(COUNT=16));
run;
proc sql;
create table want as
select have.* from have, freq
where have.cusip=freq.cusip
;
quit;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.