proc sort nodupkey left in a duplicate row

Walternate — Thu, 08 Oct 2015 15:22:00 GMT

Hi,

I have a dataset at the person level but with duplicate rows. It has ID and character variables A, B, and C. I wanted unique rows, so I ran this code:

proc sort nodupkey data=have;
  by ID char_A char_B char_C;
run;

It worked without producing an error message, but when looking through the data I noticed that at least one duplicate row remained.

ID Char_A Char_B Char_C

1 abc- d def_g ghi

I'm not sure why this row remained in the data, as it looks like most of the duplicate rows were correctly deleted. Is there a way to troubleshoot and figure out whether there's some minor difference between the character variables or some other reason that the duplicate row wasn't removed?

Thanks!

Re: proc sort nodupkey left in a duplicate row

data_null__ — Thu, 08 Oct 2015 15:31:41 GMT

Display the values of the BY variables for the suspect observations using the $HEX format. I expect you will find they are different: there is probably a character that displays as a space but is not one, or the values have a different number of leading spaces.
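As a rough sketch of that check, something like the following could be used. It assumes ID is numeric (use a quoted value if it is character), that the rows of interest are the ones with ID = 1 as in the example above, and that a width of 40 is enough; the $HEXw. width should be twice the length of the longest variable, so adjust it to your data.

```sas
/* Sketch only: print the BY variables in hex for the suspect ID.   */
/* Assumptions: ID is numeric, and $hex40. is wide enough           */
/* (width = 2 x variable length; change 40 to suit your data).      */
proc print data=have;
  where ID = 1;
  var ID char_A char_B char_C;
  format char_A char_B char_C $hex40.;
run;
```

On an ASCII system an ordinary space prints as 20. A tab prints as 09, and a non-breaking space in a Latin-1 session prints as A0, so either of those showing up where you expect 20 would explain why NODUPKEY treats the two rows as different keys. Rows that differ only in the number of leading 20 pairs are the leading-space case.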
