About cardinull

cardinull · 09-30-2019

Thanks so much Mark! I was able to get something to work by tweaking this a bit :)!

cardinull · 09-28-2019

@novinosrin Thanks a bunch. I'll try and play with this and keep you both updated :)!

cardinull · 09-28-2019

@novinosrin @mkeintz Whoops! Mark is right. I just updated my post/questions to account for that.... Sorry for the confusion... Long day.

To add to Mark's case, I would want to keep one record if it is still a partial duplicate of other existing records. However, if only the two records exist for that combination of ID, date, and time, then I wouldn't keep either of them.

In the case of the example by Mark, if we have data:

data dt;
input ID VAR1 VAR2 VAR3 VAR4 date time;
datalines;
1 1 1 1 2 9 15
1 2 . . 2 9 15
1 2 2 2 1 9 15
1 2 2 2 . 9 15 This is the extra record by Mark
2 1 1 2 2 10 20
2 1 . 2 3 10 20
2 3 2 2 . 10 20
2 1 1 2 3 10 20
3 . 1 3 4 9 15
3 2 1 . 4 9 15
;
run;

Then I would want a result like this, where one of rows 3 or 4 is kept, one of rows 6 or 8 is kept, and neither of rows 9 and 10 is kept:

data dt;
input ID VAR1 VAR2 VAR3 VAR4 date time;
datalines;
1 1 1 1 2 9 15
1 2 . . 2 9 15
1 2 2 2 1 9 15
2 1 1 2 2 10 20
2 3 2 2 . 10 20
2 1 1 2 3 10 20
;
run;

Sorry for that!

Jameson

cardinull · 09-28-2019

Lol! No shame! Being creative is the funnest part of figuring out SAS 🙂 Although with my clarification of mkeintz's question, idk if this would change anything lol!

cardinull · 09-28-2019

Thanks for a reply! I would keep neither.

cardinull · 09-28-2019

Hello! I want to create a list of partial duplicates: observations that have the same values for ID, date, and time, but different values in the other variables. The only condition is that I want to ignore differences due to missing values. For example, let's say I have the dataset:

ID  VAR1_  VAR2_  VAR3_  VAR4_  date  time
1   1      1      1      2      9     15
1   2      .      .      2      9     15
1   2      2      2      1      9     15
2   1      1      2      2      10    20
2   1      .      2      3      10    20
2   3      2      2      .      10    20
2   1      1      2      3      10    20
3   .      1      3      4      9     15
3   2      1      .      4      9     15

Edit: I just realized I framed this wrong. I'd ultimately want to keep only one of row 5 or 7, as they are partial duplicates of the other observations with ID=2, date=10, and time=20. However, I'd want to remove both rows with ID=3, because once one is absorbed by the other, it no longer has partial duplicates. Sorry for the confusion! Desired result:

ID  VAR1_  VAR2_  VAR3_  VAR4_  date  time
1   1      1      1      2      9     15
1   2      .      .      2      9     15
1   2      2      2      1      9     15
2   1      1      2      2      10    20
2   3      2      2      .      10    20
2   1      1      2      3      10    20

Is this possible in SAS? Is this not practical to code? Below I have the SAS code to make the data:

data dt;
input ID VAR1 VAR2 VAR3 VAR4 date time;
datalines;
1 1 1 1 2 9 15
1 2 . . 2 9 15
1 2 2 2 1 9 15
2 1 1 2 2 10 20
2 1 . 2 3 10 20
2 3 2 2 . 10 20
2 1 1 2 3 10 20
3 . 1 3 4 9 15
3 2 1 . 4 9 15
;
run;

Thanks!
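For readers following along, here is one way the rule described above could be sketched in a single DATA step. This is not the approach from the thread's replies (those are not shown here), only a minimal illustration under stated assumptions: two rows "match" when every variable that is non-missing in both rows is equal; rows are processed in their original order and a row is dropped if it matches a row already kept, so the first of a matching pair survives (row 5 rather than row 7 in the sample); and a group is written out only if more than one row survives. The output name `want`, the `_`-prefixed helper variables, and the 100-row limit per group are all made up for the example.

```sas
/* Sketch only. Assumes no ID/date/time group has more than 100 rows. */
proc sort data=dt;
  by ID date time;   /* default sort keeps original order within a group */
run;

data want (drop=_:);
  array v[4] VAR1-VAR4;
  array kept[100, 4] _temporary_;   /* rows retained so far in this group */

  _nkept = 0;
  do until (last.time);             /* read one whole BY group per step iteration */
    set dt;
    by ID date time;

    /* Does the current row match any row already kept? */
    _absorbed = 0;
    do _k = 1 to _nkept while (not _absorbed);
      _match = 1;
      do _j = 1 to 4;
        /* a difference counts only when both values are non-missing */
        if not missing(v[_j]) and not missing(kept[_k, _j])
           and v[_j] ne kept[_k, _j] then _match = 0;
      end;
      if _match then _absorbed = 1;
    end;

    /* Keep the row if nothing kept so far absorbs it */
    if not _absorbed then do;
      _nkept = _nkept + 1;
      do _j = 1 to 4;
        kept[_nkept, _j] = v[_j];
      end;
    end;
  end;

  /* Write the group only if it still holds more than one distinct row */
  if _nkept > 1 then do _k = 1 to _nkept;
    do _j = 1 to 4;
      v[_j] = kept[_k, _j];
    end;
    output;
  end;
run;
```

Because matching this way is not transitive (a sparse row can match two rows that differ from each other), the result depends on row order, and `want` lists VAR1-VAR4 before the other columns; both would need adjusting if the first-kept-wins rule or the column order matters.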

Online Status	Offline
Date Last Visited	11-15-2019 06:48 PM

Re: Keeping Partial Duplicates but Ignoring Missings

Keeping Partial Duplicates but Ignoring Missings
