Help using Base SAS procedures

Code for deleting repeated numbers

Accepted Solution Solved
Reply
Contributor
Posts: 67
Accepted Solution

Code for deleting repeated numbers

Hi guys 

I'm looking for code that will delete any number repeated three times or less of this column (COW_ID), and my dataset name is (work.mfd). I have over one million observations in this column. 

COW_ID

62103918

62103918

62103918

62103918

62105823

62105823

62105823

62105952

62105952

62105952

62105952

62105952

62109009

62115446

62115446

62132148

62132148

62132148

62132148

62132148

62132148

62132148

 

my regards 

 

Ibrahim 

 

 


Accepted Solutions
Solution
‎10-08-2017 12:57 PM
Super User
Super User
Posts: 8,127

Re: Code for deleting repeated numbers

A simple SQL query should do that.

 

create table want as
  select * 
  from have 
  group by cow_id
  having count(*) > 3 
;

If the data is sorted you could probably get a quicker result using a data step with a double DOW loop. First loop to count and second to control which records are output.

data want ;
  do _N_=1 by 1 until (last.cowid);
    set have ;
    by cowid;
  end;
  do until (last.cowid);
    set have ;
    by cowid;
    if _N_ > 3 then output;
  end;
run;

 

View solution in original post


All Replies
Occasional Learner
Posts: 1

Re: Code for deleting repeated numbers

Use adaptation of this code to mark all those records with Counting Duplicate Rows in a Table, and then delete them with DELETE

I am assuming that by 

delete any number repeated three times or less of this column (COW_ID)

you mean delete all records that repeat 3 times or less.
If you formulate it like that then that means that only records that appear 4x or higher will remain.
Nevertheless, you can change condition for deletion however you want in the DELETE statement Smiley Wink

Super User
Posts: 6,785

Re: Code for deleting repeated numbers

Are you trying to remove all instances of these ID values?

 

62105823

62109009

62115446

Contributor
Posts: 67

Re: Code for deleting repeated numbers

Posted in reply to Astounding

Thanks   for your reply

yes ,that is what I need exactly

Any ideas please?

Solution
‎10-08-2017 12:57 PM
Super User
Super User
Posts: 8,127

Re: Code for deleting repeated numbers

A simple SQL query should do that.

 

create table want as
  select * 
  from have 
  group by cow_id
  having count(*) > 3 
;

If the data is sorted you could probably get a quicker result using a data step with a double DOW loop. First loop to count and second to control which records are output.

data want ;
  do _N_=1 by 1 until (last.cowid);
    set have ;
    by cowid;
  end;
  do until (last.cowid);
    set have ;
    by cowid;
    if _N_ > 3 then output;
  end;
run;

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 218 views
  • 2 likes
  • 4 in conversation