BookmarkSubscribeRSS Feed
sjarvis847
Fluorite | Level 6

Hello everyone, 

 

I recently received data from the CDC's national death index. They provide death data for subjects and if duplicate records are found they provide back all of the data found and let you feed through it. There is one variable they return called exact match; however if there is an exact match and there are multiple records pulled up, they still send all the matching files. I would like to keep the exact match if found and delete any subsequent records for the same patient, without deleting duplicates for remaining records without exact matches. Below is an example of some data (not the real data of course). See patient ID #1 has an exact match, so I would want to keep that match and delete the following two. Patient ID #2 does not have an exact match, so I have to look at more information to see what row I want to keep, therefore I do not want to delete any records for patient ID #2. 

 

ID   DOB              State   DeathDate      ExactMatch

1     9/15/2010      CO     10/26/2016         *

1     9/15/2010      IL        10/28/2016

1     9/15/2010      MO     1/6/2016

2     10/2/2013      KY       3/16/2016           

2     10/2/2013      NY       2/26/2016          

2     10/2/2013      CO      11/46/2016

 

Thanks for any help you can provide! 

 

Stephanie

 

4 REPLIES 4
novinosrin
Tourmaline | Level 20

Assuming i understand your req

 

data have;
infile cards truncover;
input ID   DOB  :mmddyy10.            State  $ DeathDate   :mmddyy10.       ExactMatch $; 
format dob DeathDate mmddyy10.;
cards;
1     9/15/2010      CO     10/26/2016         *
1     9/15/2010      IL        10/28/2016
1     9/15/2010      MO     1/6/2016
2     10/2/2013      KY       3/16/2016           
2     10/2/2013      NY       2/26/2016          
2     10/2/2013      CO      11/26/2016
;

/*assuming your datasets are sorted by id as shown*/

data want;
merge have have(in=b where=(ExactMatch='*'));
by id;
if b then do;
if ExactMatch='*' then output;
end;
else output;
run; 
mkeintz
PROC Star

@novinosrin.   The merge of a sorted dataset with a specific subset of same is a very useful approach.  But don't overlook the additional power of a well-constructed subetting if statement.

 

Consider replacing

if b then do;
if ExactMatch='*' then output;
end;
else output;

with:

if b=0 or exactmatch='*';

 

Similarly for the hash offering for non-sorted data.  You can replace

rc= h.check();
if rc=0 then do;
if ExactMatch='*' then output;
end;
else output;
drop rc;

with

if h.check()^=0 or exactmatch='*';

 

I knew study of set unions, intersections, and complements would be useful one day.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
novinosrin
Tourmaline | Level 20

@mkeintz   Thank you mr elegance & diligence combined as always. 

novinosrin
Tourmaline | Level 20

Assuming i understand your req

 



data have;
infile cards truncover;
input ID   DOB  :mmddyy10.            State  $ DeathDate   :mmddyy10.       ExactMatch $; 
format dob DeathDate mmddyy10.;
cards;
1     9/15/2010      CO     10/26/2016         *
1     9/15/2010      IL        10/28/2016
1     9/15/2010      MO     1/6/2016
2     10/2/2013      KY       3/16/2016           
2     10/2/2013      NY       2/26/2016          
2     10/2/2013      CO      11/26/2016
;
/*sorting doesn't matter*/
data want;
if _n_=1 then do;
  declare hash H (dataset:'have(where=(ExactMatch="*"))') ;
   h.definekey  ('id') ;
   h.definedata ('id') ;
    h.definedone () ;
end;
set have;
rc= h.check();
if rc=0 then do;
if ExactMatch='*' then output;
end;
else output;
drop rc;
run;

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 838 views
  • 2 likes
  • 3 in conversation