Solved
Contributor
Posts: 52

# Counting datapoints by group(panel data)

I have a dataset of firm-year observations(sorted by CUSIP).I need to delete firms that have less than 2 year of observations.

The code I used to enumerate the number of years each firm had data for was(i hope this code is right since it is giving me the desired output):

data work.merge2;

set work.merge1;

by cusp;

count+1;

if first.cusp then count=1;

run;

So what I have now is something like this

Output:
cusp     count
1           1
1           2
1           3
2           1
2           2
3           1

Now I need to delete all those firms/cusps(id) that have less than 2 counts.For instance,here I need to delete firm with cusp 3.

Accepted Solutions
Solution
‎03-24-2012 01:01 AM
Posts: 3,167

## Re: Counting datapoints by group(panel data)

For a quick and dirty approach,

data have;

input cusp     count ;

cards;

1           1

1           2

1           3

2           1

2           2

3           1

;

proc sql;

create table want as

select * from have

group by cusp

having max(count)>1

order by cusp, count

;

quit;

proc print; run;

There can be many other data step approaches if you are interested.

Regards,

Haikuo

Update: Since your criteria is 1, there actually is a quickdirty datastep approach, without involving DOWs or Hash:

data want;

set have;

by cusp;

if first.cusp*last.cusp ne 1;

run;

BTW, your code is correct. However, a minor modification will make it a lot more readable and appearing clear flow of logic:

data work.merge2;

set work.merge1;

by cusp;

if first.cusp then count=1;

eles count+1;

run;

All Replies
Solution
‎03-24-2012 01:01 AM
Posts: 3,167

## Re: Counting datapoints by group(panel data)

For a quick and dirty approach,

data have;

input cusp     count ;

cards;

1           1

1           2

1           3

2           1

2           2

3           1

;

proc sql;

create table want as

select * from have

group by cusp

having max(count)>1

order by cusp, count

;

quit;

proc print; run;

There can be many other data step approaches if you are interested.

Regards,

Haikuo

Update: Since your criteria is 1, there actually is a quickdirty datastep approach, without involving DOWs or Hash:

data want;

set have;

by cusp;

if first.cusp*last.cusp ne 1;

run;

BTW, your code is correct. However, a minor modification will make it a lot more readable and appearing clear flow of logic:

data work.merge2;

set work.merge1;

by cusp;

if first.cusp then count=1;

eles count+1;

run;

Contributor
Posts: 52

## Re: Counting datapoints by group(panel data)

Thanks.

I tried the code(update by Haikuo). It worked

Well,I had been trying to do all this(delete firms-year obs with missing values & firms with less than 2 years of time series data) in response to this error message that was popping up for a panel regression:

There is only one cross section or time series observation. Computations will be terminated.

Suggestions again very much welcome!

Super Contributor
Posts: 1,636

## Re: Counting datapoints by group(panel data)

data have;

input cusp;

cards;

1

1

1

2

2

3

4

4

4

4

;

run;

data want;

set have;

by cusp;

if first.cusp then count=1;else

count+1;

if last.cusp and count>1 then output;

proc print; run;

Obs    cusp    count

1       1       3

2       2       2

3       4       4

Contributor
Posts: 52

## Re: Counting datapoints by group(panel data)

Thanks Linlin.

Well,would you have any suggestion to fix the problem I have mentioned above?I am running panel regression and this error pops up:

There is only one cross section or time series observation. Computations will be terminated.

I tried to remove all missing values from the variables as well as ensure that each firm/group has atleast two time series observations.

Thank you.

Regards

Namrata

🔒 This topic is solved and locked.