turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- General Programming
- /
- Delete records where matching variable value alrea...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-30-2016 02:09 PM

Hi,

I'm trying to think of the best way to create code that says something like "if variable X has a value of 1 previously output in the data set, then delete"

data have;

input x y ;

datalines;

1 7

2 8

3 8

1 10

2 7

4 10

run;

So using the data above...i would essentially want to drop the "x=1 and y=10" record as well as the "x=2 and y=7" records.I can't use a simple "if first." for this becuase the data coming in is sorted already without sorting by x (the sorting is important and can't be changed).

Thanks in advance,

Tom

Accepted Solutions

Solution

03-30-2016
02:55 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-30-2016 02:19 PM

A simple way: sort then restore.

data temp;

set have;

proper_order = _n_;

run;

proc sort data=temp;

by x;

run;

data temp;

set temp;

by x;

if first.x;

run;

proc sort data=temp out=want (drop=proper_order);

by proper_order;

run;

I'm sure there are more elegant approaches than this ... but it's quick and easy and works. I would imagine somebody will take the results of the first DATA step, load it into a hash table with X as the key, using a loading method that ignores duplicate values for the key. Then unload and possibly sort.

All Replies

Solution

03-30-2016
02:55 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-30-2016 02:19 PM

A simple way: sort then restore.

data temp;

set have;

proper_order = _n_;

run;

proc sort data=temp;

by x;

run;

data temp;

set temp;

by x;

if first.x;

run;

proc sort data=temp out=want (drop=proper_order);

by proper_order;

run;

I'm sure there are more elegant approaches than this ... but it's quick and easy and works. I would imagine somebody will take the results of the first DATA step, load it into a hash table with X as the key, using a loading method that ignores duplicate values for the key. Then unload and possibly sort.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-30-2016 02:55 PM

Thank you for the quick response, I really appreciate it!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-30-2016 03:01 PM

For further simplification, the second and third step could be replaced by one:

data temp; set have; proper_order = _n_; run; proc sort data=temp nodupkey; by x; run; proc sort data=temp out=want (drop=proper_order); by proper_order; run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-30-2016 09:44 PM

Yeah. It is question for Hash Table. data have; input x y ; datalines; 1 7 2 8 3 8 1 10 2 7 4 10 ; run; data want; if _n_=1 then do; declare hash h(); h.definekey('x'); h.definedone(); end; set have; if h.check()=0 then delete; else h.add(); run;