Hello everyone,
I recently received data from the CDC's national death index. They provide death data for subjects and if duplicate records are found they provide back all of the data found and let you feed through it. There is one variable they return called exact match; however if there is an exact match and there are multiple records pulled up, they still send all the matching files. I would like to keep the exact match if found and delete any subsequent records for the same patient, without deleting duplicates for remaining records without exact matches. Below is an example of some data (not the real data of course). See patient ID #1 has an exact match, so I would want to keep that match and delete the following two. Patient ID #2 does not have an exact match, so I have to look at more information to see what row I want to keep, therefore I do not want to delete any records for patient ID #2.
ID DOB State DeathDate ExactMatch
1 9/15/2010 CO 10/26/2016 *
1 9/15/2010 IL 10/28/2016
1 9/15/2010 MO 1/6/2016
2 10/2/2013 KY 3/16/2016
2 10/2/2013 NY 2/26/2016
2 10/2/2013 CO 11/46/2016
Thanks for any help you can provide!
Stephanie
Assuming i understand your req
data have;
infile cards truncover;
input ID DOB :mmddyy10. State $ DeathDate :mmddyy10. ExactMatch $;
format dob DeathDate mmddyy10.;
cards;
1 9/15/2010 CO 10/26/2016 *
1 9/15/2010 IL 10/28/2016
1 9/15/2010 MO 1/6/2016
2 10/2/2013 KY 3/16/2016
2 10/2/2013 NY 2/26/2016
2 10/2/2013 CO 11/26/2016
;
/*assuming your datasets are sorted by id as shown*/
data want;
merge have have(in=b where=(ExactMatch='*'));
by id;
if b then do;
if ExactMatch='*' then output;
end;
else output;
run;
@novinosrin. The merge of a sorted dataset with a specific subset of same is a very useful approach. But don't overlook the additional power of a well-constructed subetting if statement.
Consider replacing
if b then do;
if ExactMatch='*' then output;
end;
else output;
with:
if b=0 or exactmatch='*';
Similarly for the hash offering for non-sorted data. You can replace
rc= h.check();
if rc=0 then do;
if ExactMatch='*' then output;
end;
else output;
drop rc;
with
if h.check()^=0 or exactmatch='*';
I knew study of set unions, intersections, and complements would be useful one day.
@mkeintz Thank you mr elegance & diligence combined as always.
Assuming i understand your req
data have;
infile cards truncover;
input ID DOB :mmddyy10. State $ DeathDate :mmddyy10. ExactMatch $;
format dob DeathDate mmddyy10.;
cards;
1 9/15/2010 CO 10/26/2016 *
1 9/15/2010 IL 10/28/2016
1 9/15/2010 MO 1/6/2016
2 10/2/2013 KY 3/16/2016
2 10/2/2013 NY 2/26/2016
2 10/2/2013 CO 11/26/2016
;
/*sorting doesn't matter*/
data want;
if _n_=1 then do;
declare hash H (dataset:'have(where=(ExactMatch="*"))') ;
h.definekey ('id') ;
h.definedata ('id') ;
h.definedone () ;
end;
set have;
rc= h.check();
if rc=0 then do;
if ExactMatch='*' then output;
end;
else output;
drop rc;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.