BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
shenflow
Obsidian | Level 7

I have a panel of the following structure:

ID     YEAR    EARN

1       1960      450

1       1961      310

1       1962      529

2       1978      10

2       1979      15

2       1980      8

2       1982      10

3       1972      1000

3       1973      1599    

 

I want to delete every ID, for which the time series is incomplete. In this case, this is ID 2, since 1981 is missing.

 

I am not yet very familiar with SAS and I can not come up with a method. I would appreciate help on this. Thank you in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

Assuming there are not same years within a ID group.

 

data have;
	input ID YEAR EARN;
	datalines;
1       1960      450
1       1961      310
1       1962      529
2       1978      10
2       1979      15
2       1980      8
2       1982      10
3       1972      1000
3       1973      1599
;
run;
proc sql;
create table want as
select * from have
 group by id
  having range(year)+1=count(*)
order by id,year ; quit;

View solution in original post

3 REPLIES 3
ed_sas_member
Meteorite | Level 14

Hi @shenflow 

 

Here is an approach to achieve this:

data have;
	input ID YEAR EARN;
	datalines;
1       1960      450
1       1961      310
1       1962      529
2       1978      10
2       1979      15
2       1980      8
2       1982      10
3       1972      1000
3       1973      1599
;
run;

/* Identify ID to exclude form the input dataset */
data list_exclude (keep=ID);
	do until (last.ID);
		set have;
		by ID;
		_lag = lag(YEAR) + 1;
		if first.ID then call missing(_lag);
		if not missing(_lag) and _lag ne YEAR then output;
	end;
run;

/* Remove ID belonging to the previous list */
proc sql;
	select *
	from have
	where ID not in (select * from list_exclude);
run;

Output:

Capture d’écran 2020-05-08 à 12.03.03.png

All the best,

 

Kurt_Bremser
Super User

In a single data step:

data have;
input ID $ YEAR EARN;
datalines;
1       1960      450
1       1961      310
1       1962      529
2       1978      10
2       1979      15
2       1980      8
2       1982      10
3       1972      1000
3       1973      1599
;

data want;
set have;
del = 0;
old_year = .;
do until (last.ID);
  set have;
  by ID;
  if not first.ID and year - old_year > 1 then del = 1;
  old_year = year;
end;
do until (last.ID);
  set have;
  by ID;
  if not del then output;
end;
drop old_year del;
run;

Note how the input dataset is presented in a data step with datalines; this leaves no doubts about contents, variable types and other attributes. Please do so in the future.

Ksharp
Super User

Assuming there are not same years within a ID group.

 

data have;
	input ID YEAR EARN;
	datalines;
1       1960      450
1       1961      310
1       1962      529
2       1978      10
2       1979      15
2       1980      8
2       1982      10
3       1972      1000
3       1973      1599
;
run;
proc sql;
create table want as
select * from have
 group by id
  having range(year)+1=count(*)
order by id,year ; quit;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 969 views
  • 4 likes
  • 4 in conversation