Hi all,
I want to delete all observations for an id if the value of 2 comes after 1 for var. Here is what I have:
ID year var
11 2000 0
11 2001 0
11 2002 1
11 2003 0
11 2004 2
11 2005 0
11 2006 0
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
I want the id 11 to be eliminated for all years, because the number 2 is coming after 1.
What I want:
ID year var
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
I appreciate your help.
data have;
input ID$ year var;
datalines;
11 2000 0
11 2001 0
11 2002 1
11 2003 0
11 2004 2
11 2005 0
11 2006 0
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
;
proc sort data=have;
by ID;
run;
data help;
set have;
by id;
if first.id then do;
OneObs=0;
deleteflag=0;
end;
if var=1 then OneObs=1;
if var=2 & OneObs=1 then deleteflag=1;
retain OneObs deleteflag 0;
run;
proc sql;
create table want as
select ID, year, var
from have
where ID not in (select distinct ID from help where deleteflag=1);
quit;
data have;
input ID$ year var;
datalines;
11 2000 0
11 2001 0
11 2002 1
11 2003 0
11 2004 2
11 2005 0
11 2006 0
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
;
proc sort data=have;
by ID;
run;
data help;
set have;
by id;
if first.id then do;
OneObs=0;
deleteflag=0;
end;
if var=1 then OneObs=1;
if var=2 & OneObs=1 then deleteflag=1;
retain OneObs deleteflag 0;
run;
proc sql;
create table want as
select ID, year, var
from have
where ID not in (select distinct ID from help where deleteflag=1);
quit;
Anytime, glad to help 🙂
Thoug, @RW9s approach is probably better given that your year variable is increasing with observations.
Something like:
proc sql; delete from HAVE where ID in (select distinct A.ID from (select * from HAVE where VAR=1) A left join (select * from HAVE where VAR=2) B on A.ID=B.ID where A.YEAR > B.YEAR); quit;
Note, not tested - post test data in the form of a datastep using the {i} code window.
data have;
input ID$ year var;
datalines;
11 2000 0
11 2001 0
11 2002 1
11 2003 0
11 2004 2
11 2005 0
11 2006 0
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
;
data want;
call missing(_f1,_f2);
do _n_=1 by 1 until(last.id);
set have;
by id year;
if missing(_f2) and var=2 then _f2=_n_;
if missing(_f1) and var=1 then _f1=_n_;
end;
do until(last.id);
set have;
by id year;
if not(not missing(_f2) and not missing(_f1) and _f1<_f2) then output;
end;
drop _:;
run;
data have;
input ID$ year var;
datalines;
11 2000 0
11 2001 0
11 2002 1
11 2003 0
11 2004 2
11 2005 0
11 2006 0
12 2000 0
12 2001 2
12 2002 0
12 2003 0
12 2004 1
12 2005 0
12 2006 0
13 2000 0
13 2001 0
13 2002 0
13 2003 1
13 2004 0
13 2005 0
13 2006 0
;
data temp;
set have(where=(var in (1 2)));
run;
data temp1;
set temp;
if id=lag(id) and lag(var)=1 and var=2;
run;
proc sql;
create table want as
select * from have where id not in (select id from temp1);
quit;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.