Finding first occurrence within a dataset

Reply
Occasional Contributor
Posts: 12

Finding first occurrence within a dataset

I have the following dataset where I need to only have the first year (occurrence) where the count is >=2 (I also want to keep any subsequent years thereafter). For instance for ID1, I need data starting in FY12 and I also want to keep FY 13 - 15 because I know the index year for this ID is FY12. I would keep ID2 and keep ID3 starting at FY12.

 

I can't figure out quite how to keep everything I need. Any suggestions?

 

ID Count FY

1   1         10

1   1         11

1   3         12

1   1         13

1   12       14

1    2        15

2    2        11

3    1        11

3    7        12

3    2        13

3    1        14

 

Super User
Posts: 5,972

Re: Finding first occurrence within a dataset

Assuming your data is in sorted order as shown:

 

data want;

set have;

by id;

if first.id then output_flag=0;

if count >= 2 then output_flag + 1;

if output_flag;

drop output_flag;

run;

PROC Star
Posts: 432

Re: Finding first occurrence within a dataset

[ Edited ]

one way to do this

data have;

input @1 ID @5 Count @15 FY;
datalines;
1 1 10
1 1 11
1 3 12
1 1 13
1 12 14
1 2 15
2 2 11
3 1 11
3 7 12
3 2 13
3 1 14
;

proc sql;
create table want as
select a.* from 
(select * from have)a
inner join
(select id, min(FY) as min_fy from have
where count >= 2
group by id)b
on a.id = b.id
and a.fy >= b.min_fy;

 

Ask a Question
Discussion stats
  • 2 replies
  • 90 views
  • 1 like
  • 3 in conversation