Re: Removing Duplicates by ID 1

RandyStan · Posted 12-23-2022 02:01 PM

Dear All

My data is as follows

IDA  IDB  Cat  VarA
A    1    1    7
A    1    1    9
B    1    1    7
B    1    1    10
B    2    1    11
A    1    2    12
A    1    2    14
A    2    2    16
C    1    3    12
C1   1    3    14

I want the following table as output: remove the duplicates and keep only the first observation for each IDA/IDB combination within each Cat.

IDA  IDB  Cat  VarA
A    1    1    7
B    1    1    7
B    2    1    11
A    1    2    12
A    2    2    16
C    1    3    12

The code I wrote was

proc sort data = have; by IDA IDB Cat ; run;

data want ; set have
by IDA IDB Cat ;
if first.Cat then output ;
run;

Am I making a mistake somewhere?


tarheel13 · Posted 12-23-2022 02:05 PM

Next time, please post the data as a data step with datalines.

RandyStan · Posted 12-23-2022 02:15 PM

Sincere apologies

Row  IDA  IDB  Cat  VarA
1    A    1    1    7
2    A    1    1    9
3    B    1    1    7
4    B    1    1    10
5    B    2    1    11
6    A    1    2    12
7    A    1    2    14
8    A    2    2    16
9    C    1    3    12
10   C    1    3    14

Row  IDA  IDB  Cat  VarA
1    A    1    1    7
3    B    1    1    7
5    B    2    1    11
6    A    1    2    12
8    A    2    2    16
9    C    1    3    12

tarheel13 · Posted 12-23-2022 02:23 PM

Hey, I meant that you should post the data as a data step. You can see an example in my other reply to you. Also, you can post code by clicking the running man icon.

tarheel13 · Posted 12-23-2022 02:23 PM

I don't see C1 in your desired output. Is there a reason that was excluded?

data have;
   input IDA $ IDB $ Cat $ VarA;
   datalines;
A 1 1 7
A 1 1 9
B 1 1 7
B 1 1 10
B 2 1 11
A 1 2 12
A 1 2 14
A 2 2 16
C 1 3 12
C1 1 3 14
;

proc print;
run;

proc sort data=have out=want nodupkey;
   by IDA IDB Cat;
run;

You can try the nodupkey option in proc sort.
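The desired output in the question is listed by Cat first, then IDA and IDB, so a variant of the nodupkey step that sorts in that order may match it more closely. This is only a sketch under assumptions: it reuses the have dataset from the data step above, and want_by_cat is a hypothetical output name chosen for illustration.

```sas
/* Sketch: same NODUPKEY idea, but with Cat as the leading BY variable */
/* so the rows come out grouped by Cat, as in the requested table.     */
/* want_by_cat is a hypothetical name, not one used in the thread.     */
proc sort data=have out=want_by_cat nodupkey;
   by Cat IDA IDB;
run;

proc print data=want_by_cat;
run;
```

With the default EQUALS behavior of PROC SORT, the observation kept for each BY group is the first one in the input order. If the last row really has IDA = C1 rather than C, it forms its own group and is kept as well.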

Kurt_Bremser · Posted 12-23-2022 03:33 PM

The SET statement does not have a terminating semicolon, so you will have ERRORs there.
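For reference, a minimal sketch of the data step from the question with the semicolon restored, assuming the input dataset is named have as in the original code. Without the semicolon, SAS most likely reads by, IDA, IDB and Cat as additional dataset names on the SET statement, which is where the ERRORs would come from.

```sas
/* Sort so that BY-group processing in the data step is valid */
proc sort data=have;
   by IDA IDB Cat;
run;

data want;
   set have;                     /* semicolon added here; this is the only change */
   by IDA IDB Cat;
   if first.Cat then output;     /* first row of each IDA/IDB/Cat combination */
run;
```

Because Cat is the last BY variable, first.Cat is true on the first row of every distinct IDA/IDB/Cat combination. The resulting rows are ordered by IDA, IDB and Cat, which is not the same row order as the table in the question.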

