Hi all,
I want to delete all observations for an id if the value of 2 comes after 1 for var. Here is what I have:
ID year var
11 2000 0
11 2001 0
11 2002 1
11 2003 0
11 2004 2
11 2005 0
11 2006 0
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
I want the id 11 to be eliminated for all years, because the number 2 is coming after 1.
What I want:
ID year var
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
I appreciate your help.
data have;
input ID$ year var;
datalines;
11 2000 0
11 2001 0
11 2002 1
11 2003 0
11 2004 2
11 2005 0
11 2006 0
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
;
proc sort data=have;
by ID;
run;
data help;
set have;
by id;
if first.id then do;
OneObs=0;
deleteflag=0;
end;
if var=1 then OneObs=1;
if var=2 & OneObs=1 then deleteflag=1;
retain OneObs deleteflag 0;
run;
proc sql;
create table want as
select ID, year, var
from have
where ID not in (select distinct ID from help where deleteflag=1);
quit;
data have;
input ID$ year var;
datalines;
11 2000 0
11 2001 0
11 2002 1
11 2003 0
11 2004 2
11 2005 0
11 2006 0
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
;
proc sort data=have;
by ID;
run;
data help;
set have;
by id;
if first.id then do;
OneObs=0;
deleteflag=0;
end;
if var=1 then OneObs=1;
if var=2 & OneObs=1 then deleteflag=1;
retain OneObs deleteflag 0;
run;
proc sql;
create table want as
select ID, year, var
from have
where ID not in (select distinct ID from help where deleteflag=1);
quit;
Anytime, glad to help 🙂
Thoug, @RW9s approach is probably better given that your year variable is increasing with observations.
Something like:
proc sql;
delete from HAVE
where ID in (select distinct A.ID
from (select * from HAVE where VAR=1) A
left join (select * from HAVE where VAR=2) B
on A.ID=B.ID
where A.YEAR > B.YEAR);
quit;
Note, not tested - post test data in the form of a datastep using the {i} code window.
data have;
input ID$ year var;
datalines;
11 2000 0
11 2001 0
11 2002 1
11 2003 0
11 2004 2
11 2005 0
11 2006 0
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
;
data want;
call missing(_f1,_f2);
do _n_=1 by 1 until(last.id);
set have;
by id year;
if missing(_f2) and var=2 then _f2=_n_;
if missing(_f1) and var=1 then _f1=_n_;
end;
do until(last.id);
set have;
by id year;
if not(not missing(_f2) and not missing(_f1) and _f1<_f2) then output;
end;
drop _:;
run;
data have;
input ID$ year var;
datalines;
11 2000 0
11 2001 0
11 2002 1
11 2003 0
11 2004 2
11 2005 0
11 2006 0
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
;
data temp;
set have(where=(var in (1 2)));
run;
data temp1;
set temp;
if id=lag(id) and lag(var)=1 and var=2;
run;
proc sql;
create table want as
select * from have where id not in (select id from temp1);
quit;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.