I'm working with a health insurance claims database and am trying to identify instances in which there was a drug claim for an antibiotic that occurred either on the same day as a claim with a diagnosis code of bronchitis or within the 3 days after of a claim with a diagnsois code of bronchitis. I am working with two tables in PROC SQL, the table antibiotics (which has all drug claims for antibiotics for people in my sample) and the table bronchitis, which has all claims with a diagnosis of bronchitis for people in my sample. If there was more than one claim with a diagnosis of bronchitis on the same day for a particular person, I consider that to be one instance of a bronchitis diagnosis, and I only want to allow a maximum of one antibiotic per instance of a bronchitis diagnosis.
Here is my code:
proc sql; create table antibiotics_bronchitis as select L.claimid, L.servicedate, L.drugname, R.claimid as claimid_bronchitis, R.servicedate as svcdate_bronchitis, from work.antibiotics as L inner join work.bronchitis as R on L.enrolid=R.enrolid where L.servicedate-3 <= servicedate_bronchitis <= L.servicedate order by L.claimid; quit;
Using this code, I get many observations with duplicate claimid when a particular drug claim (say one that occurred January 4, 2015) is matched with more than one claim with a diagnosis of bronchitis with a service date of January 1, January 2, January 3, or January 4 (e.g.,this would happen if there were two claims with a diagnosis of bronchitis of January 1, one claim with a diagnosis of bronchitis on January 1 and one claim with a diagnosis of bronchitis on January 2, etc.). Using the "distinct" term in the select statement doesn't do anything because observations with duplicate claimid will have different claimid_bronchitis. I could just use proc sort NODUPKEY by claimid after this code, but I was wondering if there is an alternative method within SQL. Thanks.
... View more