Re: Deleting the second duplicate row

altijani · Posted 09-12-2018 02:44 PM

Hi,

I have the following data:

customer_id	customer_name	visit
456A	John	1
123B	Smith	3
123B	Smith	4
987D	David	2
654H	Haydar	4

I need to delete the second row for Smith, because its customer_id and customer_name are the same as in the row before it. In other words, keep only the first visit for each customer. The result I want looks like this:

customer_id	customer_name	visit
456A	John	1
123B	Smith	3
987D	David	2
654H	Haydar	4

Thanks

novinosrin · Posted 09-12-2018 02:49 PM

data want;
  set have;
  by customer_id customer_name notsorted;
  if first.customer_id and first.customer_name;
run;

Tom · Posted 09-12-2018 02:54 PM

@novinosrin wrote:

data want;
  set have;
  by customer_id customer_name notsorted;
  if first.customer_id and first.customer_name;
run;

You don't need to reference both of those FIRST. variables. If you want to keep multiple names per CUSTOMER_ID, then just use FIRST.CUSTOMER_NAME. If you want just one name per CUSTOMER_ID, then just use FIRST.CUSTOMER_ID. Note that FIRST.CUSTOMER_NAME will always be true when FIRST.CUSTOMER_ID is true, because a new value of the first BY variable starts a new group for every BY variable after it.
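
A minimal sketch of the simplified step, for illustration only. It assumes the input data set is named have (as in the code above), rebuilds the sample rows from the question with a DATALINES step, and relies on duplicate rows being adjacent, which is what NOTSORTED needs. If duplicates could be scattered through the data, sorting by customer_id and customer_name first would likely be the safer route.

/* Sample rows copied from the question; the data set name HAVE is assumed. */
data have;
  input customer_id $ customer_name $ visit;
  datalines;
456A John 1
123B Smith 3
123B Smith 4
987D David 2
654H Haydar 4
;
run;

/* Keep the first row of each run of identical customer_id / customer_name. */
/* FIRST.CUSTOMER_NAME alone is enough: it is also set whenever             */
/* FIRST.CUSTOMER_ID is set.                                                */
data want;
  set have;
  by customer_id customer_name notsorted;
  if first.customer_name;
run;

With the sample rows this should drop only the second Smith row (visit 4). Swapping the condition to FIRST.CUSTOMER_ID would instead keep one row per customer_id even if the name changed within that id.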

novinosrin · Posted 09-12-2018 02:54 PM

Oh yes of course. Thank you

altijani · Posted 09-12-2018 03:23 PM

I need to keep the first observation for each combination of customer_id and customer_name.
