BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
eawh100
Obsidian | Level 7

I'm trying to flag duplicate scan dates per location per sequence_no. If a scan date and location are entered in a later visit date, that row needs to be flagged "duplicate" see rows 5 and 6. The original data is row 4, the first visit date.  I'm really having trouble with this because of all the variables, it just seems I can't get the Proc sort exactly right, that's why I'd like a flag.  The flag column is what I want to add. 

Any advice? 

Data I want (Column flag is to be added): 

rowSEQUENCE_NO_VISIT_DATEscan_datelocationflag
121/16/20181/16/2018a 
224/12/20184/12/2018a 
328/23/20188/23/2018a 
422/26/20192/26/2019a 
529/5/20192/26/2019aduplicate
623/2/20202/26/2019aduplicate
729/5/20199/3/2019a 
823/2/20209/3/2019aduplicate
923/2/20202/27/2020a 
1037/2/20206/16/2020c 
11310/1/20206/16/2020cduplicate
1234/1/20216/16/2020cduplicate
1339/27/20216/16/2020cduplicate
1437/2/20206/16/2020e 
15310/1/20206/16/2020eduplicate
1634/1/20216/16/2020eduplicate
1739/27/20216/16/2020eduplicate
1837/2/20206/16/2020d 
19310/1/20206/16/2020dduplicate
2034/1/20216/16/2020dduplicate
2139/27/20216/16/2020dduplicate
2233/28/20223/24/2022c 
2333/28/20223/24/2022e 
2433/28/20223/24/2022d 
2539/29/20229/27/2022c 
2639/29/20229/27/2022e 
1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

How about

 

data have;
input SEQUENCE_NO_ (VISIT_DATE scan_date)(:mmddyy10.) location :$1.;
format VISIT_DATE scan_date mmddyy10.;
datalines;
2 1/16/2018 1/16/2018 a 
2 4/12/2018 4/12/2018 a 
2 8/23/2018 8/23/2018 a 
2 2/26/2019 2/26/2019 a 
2 9/5/2019  2/26/2019 a 
2 3/2/2020  2/26/2019 a 
2 9/5/2019  9/3/2019  a 
2 3/2/2020  9/3/2019  a 
2 3/2/2020  2/27/2020 a 
3 7/2/2020  6/16/2020 c 
3 10/1/2020 6/16/2020 c 
3 4/1/2021  6/16/2020 c 
3 9/27/2021 6/16/2020 c 
3 7/2/2020  6/16/2020 e 
3 10/1/2020 6/16/2020 e 
3 4/1/2021  6/16/2020 e 
3 9/27/2021 6/16/2020 e 
3 7/2/2020  6/16/2020 d 
3 10/1/2020 6/16/2020 d 
3 4/1/2021  6/16/2020 d 
3 9/27/2021 6/16/2020 d 
3 3/28/2022 3/24/2022 c 
3 3/28/2022 3/24/2022 e 
3 3/28/2022 3/24/2022 d 
3 9/29/2022 9/27/2022 c 
3 9/29/2022 9/27/2022 e 
;

data want;
   set have;
   by SEQUENCE_NO_ location notsorted scan_date notsorted;
   if not first.scan_date then flag = 'duplicate';
run;

View solution in original post

2 REPLIES 2
PeterClemmensen
Tourmaline | Level 20

How about

 

data have;
input SEQUENCE_NO_ (VISIT_DATE scan_date)(:mmddyy10.) location :$1.;
format VISIT_DATE scan_date mmddyy10.;
datalines;
2 1/16/2018 1/16/2018 a 
2 4/12/2018 4/12/2018 a 
2 8/23/2018 8/23/2018 a 
2 2/26/2019 2/26/2019 a 
2 9/5/2019  2/26/2019 a 
2 3/2/2020  2/26/2019 a 
2 9/5/2019  9/3/2019  a 
2 3/2/2020  9/3/2019  a 
2 3/2/2020  2/27/2020 a 
3 7/2/2020  6/16/2020 c 
3 10/1/2020 6/16/2020 c 
3 4/1/2021  6/16/2020 c 
3 9/27/2021 6/16/2020 c 
3 7/2/2020  6/16/2020 e 
3 10/1/2020 6/16/2020 e 
3 4/1/2021  6/16/2020 e 
3 9/27/2021 6/16/2020 e 
3 7/2/2020  6/16/2020 d 
3 10/1/2020 6/16/2020 d 
3 4/1/2021  6/16/2020 d 
3 9/27/2021 6/16/2020 d 
3 3/28/2022 3/24/2022 c 
3 3/28/2022 3/24/2022 e 
3 3/28/2022 3/24/2022 d 
3 9/29/2022 9/27/2022 c 
3 9/29/2022 9/27/2022 e 
;

data want;
   set have;
   by SEQUENCE_NO_ location notsorted scan_date notsorted;
   if not first.scan_date then flag = 'duplicate';
run;
eawh100
Obsidian | Level 7

Thanks! I think this is exactly what I wanted, never thought to use notsorted

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 656 views
  • 2 likes
  • 2 in conversation