Hi again,

Apologies for resurfacing this, but after some QA I realized there is an issue with the code and I'm not sure what is causing it. Here is the code that I ran, based on what you provided:

    data match ;
      if _n_ = 1 then do ;
        dcl hash x () ;
        x.definekey ("MOMSSN") ;
        x.definedata ("p") ;
        x.definedone () ;
        dcl hash y () ;
        y.definekey ("BMOMFIRSTNAME", "BMOMLASTNAME", "momyeardob") ;
        y.definedata ("p") ;
        y.definedone () ;
        if 0 then set deaths2 ;
        do p = 1 by 1 until (y2) ;
          set births2 end = y2 ;
          x.ref() ;
          y.ref() ;
        end ;
      end ;
      call missing (of _all_) ;
      set deaths2 ;
      /* first trying to match on SSN alone; do not want matches on missing SSNs */
      if (SSN) = . then Match = 0 ;
      else Match = x.find(key:SSN) eq 0 ;
      /* if there was not a match on SSN, trying to match on first name, last name, and year of birth */
      if not Match then Match = y.find(key:firstname,key:lastname,key:dbirthyear) eq 0 ;
      if Match then set births2 point = p ;
    run ;

Now, in the output dataset MATCH, a key variable that I need for further analysis (datedeath) is missing for a lot of observations. The source dataset for this variable is deaths2, and in deaths2 none of the observations have a missing value for it. It appears that most of the observations with a missing datedeath in MATCH are the ones that have Match = 1. Given that the source data has no observations missing this information, this is odd.

Is there something in the match step that would have caused this, even though I do not reference that variable anywhere in the step?

Also, one thing of note (not sure if it's relevant): there should only be one death record for each individual, BUT an individual can potentially appear in the births dataset (births2) multiple times if they've had multiple births.

Thank you for your help! I thought I was grasping this, but this error is leaving me a bit confused.