Hello,

I am stuck on categorizing patients based on their first diagnosis. Here is some background on my project. I am interested in the long-term risk of cardiovascular diseases (CVDs) among women. I will use four data sources to identify women with a CVD diagnosis: emergency department, hospital discharge, death certificate, and medical claims. I pulled ICD codes and service dates from each data source and flagged them. For example, flag_isch_dc indicates an ischemic heart disease diagnosis from the death certificate, while flag_isch_hds indicates an ischemic heart disease diagnosis from the hospital discharge data.

The data are at the claims level, and I eventually want to bring them to the patient level. Before doing so, I need to create a single CVD category. Because I am using four data sources, each with its own CVD diagnosis flags and dates, I am having trouble ordering the diagnoses correctly: some women have a diagnosis captured in one data source but not in another. The patient-level ID is Row_id2. However, if I simply remove duplicates, I lose a lot of women along with their diagnoses. So I want to categorize them correctly without losing any participants.

My initial plan was to categorize each woman by whichever diagnosis came earliest, and to keep the date of that first diagnosis, because I will use Cox proportional hazards regression. To simplify further: for ischemic heart disease alone, I have four flags with four dates, one per data source.
Example:

| Data source | Date of diagnosis | Ischemic | Cerebrovascular | Hypertension | Other heart | Other CVD | Final CVD grouping | Date of diagnosis from final CVD grouping |
|---|---|---|---|---|---|---|---|---|
| Medical Claims | MC059_MC | flag_isch_mc | flag_cero_mc | flag_hypt_mc | flag_ohrt_mc | flag_ocvd_mc | | |
| Emergency Department | Service_from_Dt_ER | flag_isch_er | flag_cero_er | flag_hypt_er | flag_ohrt_er | flag_ocvd_er | | |
| HDS | Service_from_Dt_HDS | flag_isch_hds | flag_cero_hds | flag_hypt_hds | flag_ohrt_hds | flag_ocvd_hds | | |
| Death Certificate | death_date_DC | flag_isch__dc | flag_cero__dc | flag_hypt_dc | flag_ohrt_dc | flag_ocvd_dc | | |

I want to group them (Final CVD grouping) so that each woman is classified by her earliest diagnosis across these four files. Then I should be able to bring the data to the patient level. I attached a SAS file with 60 observations, including duplicates. I would greatly appreciate any suggestions on how to achieve the correct classification and then bring it back to patient-level data.

Thank you for your time and help.
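To make the intended logic concrete, here is a minimal sketch of the "earliest diagnosis wins" rule, written in Python only to illustrate the steps, not as a definitive solution. It assumes each claim row carries Row_id2, the date column of whichever source it came from, and 0/1 flags named flag_<category>_<source> with a single underscore throughout (the table shows a double underscore in two death certificate flags, so the real names may differ). The tie-break order for diagnoses on the same date and the toy rows are hypothetical. The idea is to reshape the flags into one event per diagnosis, then keep the earliest event per patient while retaining women who have no flagged diagnosis.

```python
from datetime import date

# Date column for each source, taken from the table in the post.
DATE_COLS = {
    "mc": "MC059_MC",
    "er": "Service_from_Dt_ER",
    "hds": "Service_from_Dt_HDS",
    "dc": "death_date_DC",
}
# Category codes as used in the flag names (flag_<category>_<source>).
# The list order doubles as a tie-break priority, which is an assumption.
CATEGORIES = ["isch", "cero", "hypt", "ohrt", "ocvd"]


def to_long(claims):
    """Turn wide claim rows into one (id, date, rank, category, source) event per set flag."""
    events = []
    for row in claims:
        for src, date_col in DATE_COLS.items():
            dx_date = row.get(date_col)
            if dx_date is None:
                continue  # this row has no record from this source
            for rank, cat in enumerate(CATEGORIES):
                if row.get(f"flag_{cat}_{src}") == 1:
                    events.append((row["Row_id2"], dx_date, rank, cat, src))
    return events


def first_diagnosis(claims):
    """Return one record per Row_id2 holding the earliest flagged diagnosis."""
    # Start with every patient so that women without any flag are not dropped.
    result = {row["Row_id2"]: None for row in claims}
    for pid, dx_date, rank, cat, src in to_long(claims):
        best = result[pid]
        # Earlier date wins; on the same date, the lower rank wins.
        if best is None or (dx_date, rank) < (best["date"], best["rank"]):
            result[pid] = {"date": dx_date, "rank": rank,
                           "final_cvd": cat, "source": src}
    return result


# Hypothetical toy data, not taken from the attached file.
claims = [
    {"Row_id2": 1, "MC059_MC": date(2015, 3, 1), "flag_hypt_mc": 1},
    {"Row_id2": 1, "Service_from_Dt_HDS": date(2014, 7, 9), "flag_isch_hds": 1},
    {"Row_id2": 2, "Service_from_Dt_ER": date(2016, 1, 5),
     "flag_cero_er": 1, "flag_hypt_er": 1},
    {"Row_id2": 3, "MC059_MC": date(2017, 2, 2)},  # claim with no CVD flag
]
patients = first_diagnosis(claims)
for pid, rec in sorted(patients.items()):
    print(pid, rec)
```

In SAS the same steps would likely correspond to stacking the four sources into a long dataset with one row per diagnosis, sorting by Row_id2 and diagnosis date, and keeping the first row of each BY group (for example with first.Row_id2), then merging back onto the full list of IDs so that nobody is lost.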