Delete records where matching variable value already exists

Accepted Solution Solved
Reply
Contributor
Posts: 50
Accepted Solution

Delete records where matching variable value already exists

Hi,

 

I'm trying to think of the best way to create code that says something like "if variable X has a value of 1 previously output in the data set, then delete"

 


data have;
input x y ;
datalines;
1 7
2 8
3 8
1 10
2 7
4 10
run;

 

 

So using the data above...i would essentially want to drop the "x=1 and y=10" record as well as the "x=2 and y=7" records.I can't use a simple "if first." for this becuase the data coming in is sorted already without sorting by x (the sorting is important and can't be changed).

 

Thanks in advance,

Tom


Accepted Solutions
Solution
‎03-30-2016 02:55 PM
Respected Advisor
Posts: 4,995

Re: Delete records where matching variable value already exists

A simple way:  sort then restore.

 

data temp;

set have;

proper_order = _n_;

run;

proc sort data=temp;

by x;

run;

data temp;

set temp;

by x;

if first.x;

run;

proc sort data=temp out=want (drop=proper_order);

by proper_order;

run;

 

I'm sure there are more elegant approaches than this ... but it's quick and easy and works.  I would imagine somebody will take the results of the first DATA step, load it into a hash table with X as the key, using a loading method that ignores duplicate values for the key.  Then unload and possibly sort.

View solution in original post


All Replies
Solution
‎03-30-2016 02:55 PM
Respected Advisor
Posts: 4,995

Re: Delete records where matching variable value already exists

A simple way:  sort then restore.

 

data temp;

set have;

proper_order = _n_;

run;

proc sort data=temp;

by x;

run;

data temp;

set temp;

by x;

if first.x;

run;

proc sort data=temp out=want (drop=proper_order);

by proper_order;

run;

 

I'm sure there are more elegant approaches than this ... but it's quick and easy and works.  I would imagine somebody will take the results of the first DATA step, load it into a hash table with X as the key, using a loading method that ignores duplicate values for the key.  Then unload and possibly sort.

Contributor
Posts: 50

Re: Delete records where matching variable value already exists

Thank you for the quick response, I really appreciate it!

 

 

Trusted Advisor
Posts: 1,114

Re: Delete records where matching variable value already exists

For further simplification, the second and third step could be replaced by one:

 

data temp;
set have;
proper_order = _n_;
run;

proc sort data=temp nodupkey;
by x;
run;

proc sort data=temp out=want (drop=proper_order);
by proper_order;
run;
Grand Advisor
Posts: 9,593

Re: Delete records where matching variable value already exists

Yeah. It is question for Hash Table.



data have;
input x y ;
datalines;
1 7
2 8
3 8
1 10
2 7
4 10
;
run;


data want;
 if _n_=1 then do;
  declare hash h();
  h.definekey('x');
  h.definedone();
 end;
set have;
if h.check()=0 then delete;
 else h.add();
run;


☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 335 views
  • 3 likes
  • 4 in conversation