DATA Step, Macro, Functions and more

Keeping one of two rows with duplicate ID

Reply
Frequent Contributor
Posts: 138

Keeping one of two rows with duplicate ID

Hi,

I have a dataset of duplicates--each ID occurs twice (but the values of other variables can differ within the same ID), like this:

ID     Var1     Var2

1        1            0

1        0            1

2        0            0

2        0            1

3        0            0

3        1            1

I want to keep one row for each ID, but I need to do it conditionally. First, if they have a value of 1 for Var1, that is the row that should be kept (each pair of rows will either have one 0 and one 1 for Var1 or two 0s). If a pair of rows has two 0s for Var1, the one with a value of 1 for Var2 should be kept.

Any help is much appreciated!

Valued Guide
Posts: 858

Re: Keeping one of two rows with duplicate ID

data have;

infile cards missover;

input ID     Var1     Var2;

cards;

1        1            0

1        0            1

2        0            0

2        0            1

3        0            0

3        1            1

;

run;

data prep;

set have;

by id;

if var1 = 1 then flag = 1;

if var1 = 0 and var2 = 1 then flag = 1;

run;

proc sort data=prep;by id flag;

data want(drop=flag);

set prep;

by id flag;

if first.flag and not missing(flag) then output want;

run;

Super User
Posts: 5,081

Re: Keeping one of two rows with duplicate ID

One way:

proc sort data=have;

   by id var1 var2;

run;

data want;

   set have;

   by id;

   if last.id;

   if var1=1 or var2=1;  /* optional, depending on intended result */

run;

Good luck.

Super User
Posts: 9,676

Re: Keeping one of two rows with duplicate ID

You didn't post the output yet .

Code: Program

data have;
input ID   Var1   Var2;
cards;
1 1 0
1 0 1
2 0 0
2 0 1
3 0 0
3 1 1
;
run;
proc sql;
create table want as
select *
  from have
   group by id
   having sum(var1=0) = 1 and var1=1
  union
select *
  from have
   group by id
   having sum(var1=0) = 2 and var2=1 ;
quit;

Xia Keshan

Ask a Question
Discussion stats
  • 3 replies
  • 204 views
  • 0 likes
  • 4 in conversation