I have a specific requirement: keep all duplicate records if the group is different, with one exception. If a record is unique with Group='Other', keep it; otherwise, delete all duplicate records where Group='Other'. Always take the most recent subDate.
data to_clean;
infile cards dlm='|' truncover ;
/* group needs at least 5 characters to hold 'Other'; $4. would truncate it to 'Othe' */
input subDate :mmddyy10. unitName :$100. ADDR1 :$100. group :$5.;
format subDate yymmdd10.;
cards;
11/21/2020|BAIRD HUDSON ENTERPRISES|106 E MAIN|Other
10/30/2020|BAIRD HUDSON ENTERPRISES|106 E MAIN STREET|TNF
10/30/2020|BIG HORN ENTERPRISE|146 S. BENT STREET|SNF
10/30/2020|BIG HORN ENTERPRISE|641 WARREN STREET|SNF
11/5/2020|BROOKDALE |tt|ALF
10/29/2020|BROOKDALE|2401 COUGAR AVENUE|ALF
10/30/2020|ELMCROFT|1551 SUGARLAND DRIVE|ALF
11/2/2020|ELMCROFT|1551 SUGARLAND DRIVE DRIVE|SNF
11/21/2020|GREEN HOUSE LIVING|2311 SHIRLEY|SNF
10/29/2020|GREEN HOUSE LIVING|2311 SHIRLEY COVE|ALF
11/21/2020|MISSION AT THE VILLA|1445 UINTA|ALF
11/2/2020|MISSION AT THE VILLA|1445 UINTA DRIVE|Other
;
run;
Maybe this is a late-Monday effect (I had a day off yesterday), but I don't see a clear definition of "duplicate" in your description.
@Stalk wrote:
Duplicates are defined across the fields unitName, ADDR1, and group.
If a record is unique and GROUP='Other', then keep it.
If a record is a duplicate and GROUP='Other', then delete that one.
And what should happen to those observations with Group NE 'Other'?
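Until that is clarified, here is one possible reading of the rules as a rough sketch, not a confirmed solution. It assumes (my assumption, not stated in the thread) that records sharing the same unitName count as duplicates, since the ADDR1 values in the sample data differ slightly between rows. It also assumes group is long enough to hold the full value 'Other'. The dataset names sorted, latest, and want and the variable has_real are made up for illustration.

```sas
/* Assumption: rows with the same unitName are treated as duplicates. */
proc sort data=to_clean out=sorted;
  by unitName group descending subDate;
run;

/* Keep only the most recent row for each unitName/group combination. */
data latest;
  set sorted;
  by unitName group;
  if first.group;
run;

/* Double DoW loop: the first pass flags units that have any non-'Other' row,
   the second pass outputs rows according to that flag. */
data want;
  do until (last.unitName);
    set latest;
    by unitName;
    if group ne 'Other' then has_real = 1;
  end;
  do until (last.unitName);
    set latest;
    by unitName;
    /* Keep every non-'Other' row; keep 'Other' only when it stands alone. */
    if group ne 'Other' or missing(has_real) then output;
  end;
  drop has_real;
run;
```

Under these assumptions, the 'Other' rows for BAIRD HUDSON ENTERPRISES and MISSION AT THE VILLA would be dropped, because each of those units also has a row with a different group. If duplicates really must match on ADDR1 as well, the BY variables would need to change accordingly.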