Hi, I am processing blood lead tests. Below is a sample of the data: Record_id Collected_date Pb_result Days LEAD1000 9/14/2007 9 . LEAD1005 4/30/2007 5 . LEAD1005 1/22/2008 7 267 LEAD1007 3/9/2007 6 . LEAD1009 7/11/2007 17 . LEAD1009 8/31/2007 22 51 LEAD1009 9/21/2007 10 21 LEAD1009 2/22/2008 15 154 LEAD1011 1/22/2007 6 . LEAD1011 7/9/2007 7 168 LEAD1011 9/17/2007 10 70 LEAD1012 10/22/2007 8 . The record_id is the unique id for each child in the dataset, collected_date is the date of collection/test, Pb_result is the result of the test, and Days is the number of days between tests for the same child. I have sorted the data by record_id and collected_date. I need to look for confirmatory pairs - tests where the the same child (record_id) received a blood lead test within 1 to 84 days of a previous test, and then select the test with the highest result. Record_id LEAD1009 had 2 tests within 51 days of each other and the second test, where the record for the 8/31/2007 test with Pb_result=22, should be output. The next pair would be the test performed on 8/31/2007 and 9/21/2007, where the 8/31/2007 test should again be output because it was the higher test result of the two.The third and fourth tests for LEAD1009 should not result in any output because the time between tests was greater than 84 days. If the child had only one test then I'd like to output those records to a different dataset. I have attached an additional schematic of what I am trying to accomplish for further clarification. I just can't seem to wrap my head around how to successfully code to achieve the results I am looking for. Thanks in advance for any help.
... View more