I have a panel of the following structure:
ID YEAR EARN
1 1960 450
1 1961 310
1 1962 529
2 1978 10
2 1979 15
2 1980 8
2 1982 10
3 1972 1000
3 1973 1599
I want to delete every ID, for which the time series is incomplete. In this case, this is ID 2, since 1981 is missing.
I am not yet very familiar with SAS and I can not come up with a method. I would appreciate help on this. Thank you in advance.
Assuming there are not same years within a ID group.
data have;
input ID YEAR EARN;
datalines;
1 1960 450
1 1961 310
1 1962 529
2 1978 10
2 1979 15
2 1980 8
2 1982 10
3 1972 1000
3 1973 1599
;
run;
proc sql;
create table want as
select * from have
group by id
having range(year)+1=count(*)
order by id,year ;
quit;
Hi @shenflow
Here is an approach to achieve this:
data have;
input ID YEAR EARN;
datalines;
1 1960 450
1 1961 310
1 1962 529
2 1978 10
2 1979 15
2 1980 8
2 1982 10
3 1972 1000
3 1973 1599
;
run;
/* Identify ID to exclude form the input dataset */
data list_exclude (keep=ID);
do until (last.ID);
set have;
by ID;
_lag = lag(YEAR) + 1;
if first.ID then call missing(_lag);
if not missing(_lag) and _lag ne YEAR then output;
end;
run;
/* Remove ID belonging to the previous list */
proc sql;
select *
from have
where ID not in (select * from list_exclude);
run;
Output:
All the best,
In a single data step:
data have;
input ID $ YEAR EARN;
datalines;
1 1960 450
1 1961 310
1 1962 529
2 1978 10
2 1979 15
2 1980 8
2 1982 10
3 1972 1000
3 1973 1599
;
data want;
set have;
del = 0;
old_year = .;
do until (last.ID);
set have;
by ID;
if not first.ID and year - old_year > 1 then del = 1;
old_year = year;
end;
do until (last.ID);
set have;
by ID;
if not del then output;
end;
drop old_year del;
run;
Note how the input dataset is presented in a data step with datalines; this leaves no doubts about contents, variable types and other attributes. Please do so in the future.
Assuming there are not same years within a ID group.
data have;
input ID YEAR EARN;
datalines;
1 1960 450
1 1961 310
1 1962 529
2 1978 10
2 1979 15
2 1980 8
2 1982 10
3 1972 1000
3 1973 1599
;
run;
proc sql;
create table want as
select * from have
group by id
having range(year)+1=count(*)
order by id,year ;
quit;
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.