Hi all,
I want to delete all observations for an id if the value of 2 comes after 1 for var. Here is what I have:
ID year var
11 2000 0
11 2001 0
11 2002 1
11 2003 0
11 2004 2
11 2005 0
11 2006 0
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
I want the id 11 to be eliminated for all years, because the number 2 is coming after 1.
What I want:
ID year var
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
I appreciate your help.
data have;
input ID$ year var;
datalines;
11 2000 0
11 2001 0
11 2002 1
11 2003 0
11 2004 2
11 2005 0
11 2006 0
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
;
proc sort data=have;
by ID;
run;
data help;
set have;
by id;
if first.id then do;
OneObs=0;
deleteflag=0;
end;
if var=1 then OneObs=1;
if var=2 & OneObs=1 then deleteflag=1;
retain OneObs deleteflag 0;
run;
proc sql;
create table want as
select ID, year, var
from have
where ID not in (select distinct ID from help where deleteflag=1);
quit;
data have;
input ID$ year var;
datalines;
11 2000 0
11 2001 0
11 2002 1
11 2003 0
11 2004 2
11 2005 0
11 2006 0
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
;
proc sort data=have;
by ID;
run;
data help;
set have;
by id;
if first.id then do;
OneObs=0;
deleteflag=0;
end;
if var=1 then OneObs=1;
if var=2 & OneObs=1 then deleteflag=1;
retain OneObs deleteflag 0;
run;
proc sql;
create table want as
select ID, year, var
from have
where ID not in (select distinct ID from help where deleteflag=1);
quit;
Anytime, glad to help 🙂
Thoug, @RW9s approach is probably better given that your year variable is increasing with observations.
Something like:
proc sql; delete from HAVE where ID in (select distinct A.ID from (select * from HAVE where VAR=1) A left join (select * from HAVE where VAR=2) B on A.ID=B.ID where A.YEAR > B.YEAR); quit;
Note, not tested - post test data in the form of a datastep using the {i} code window.
data have;
input ID$ year var;
datalines;
11 2000 0
11 2001 0
11 2002 1
11 2003 0
11 2004 2
11 2005 0
11 2006 0
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
;
data want;
call missing(_f1,_f2);
do _n_=1 by 1 until(last.id);
set have;
by id year;
if missing(_f2) and var=2 then _f2=_n_;
if missing(_f1) and var=1 then _f1=_n_;
end;
do until(last.id);
set have;
by id year;
if not(not missing(_f2) and not missing(_f1) and _f1<_f2) then output;
end;
drop _:;
run;
data have;
input ID$ year var;
datalines;
11 2000 0
11 2001 0
11 2002 1
11 2003 0
11 2004 2
11 2005 0
11 2006 0
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
;
data temp;
set have(where=(var in (1 2)));
run;
data temp1;
set temp;
if id=lag(id) and lag(var)=1 and var=2;
run;
proc sql;
create table want as
select * from have where id not in (select id from temp1);
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.