About dominiquec

dominiquec · 12-19-2023

Thank you, this gave me exactly what I needed! I definitely overcomplicated this and need to practice more with proc summary.

dominiquec · 12-19-2023

Certainly! An example of what the data currently look like is below:

    patient_id  exposure_type  hosp_dt     diag1  diag2  diag3  diag4
    100100      1              01.01.1995  0      1      0      1
    100100      1              06.15.1995  1      0      0      1
    100100      2              10.01.1995  1      0      0      0
    100100      2              12.15.1995  1      0      1      0

Right now, my results look like a copy of the original file:

    patient_id  exposure_type  everdiag1  everdiag2  everdiag3  everdiag4
    100100      1              0          1          0          1
    100100      1              1          0          0          1
    100100      2              1          0          0          0
    100100      2              1          0          1          0

For patient 100100, the everdiag flag for diag1 should be 1 for both exposure types 1 and 2, as at least one hospital visit had that diagnosis associated with it during each exposure type. However, the everdiag flag for diag3 should be 1 only for exposure type 2, as this diagnosis appeared only during a hospitalization when the patient was in exposure type 2.

See below for how I would like the final dataset to look. All the everdiag flags should look the same within each unique patient ID and exposure type:

    patient_id  exposure_type  everdiag1  everdiag2  everdiag3  everdiag4
    100100      1              1          1          0          1
    100100      1              1          1          0          1
    100100      2              1          0          1          0
    100100      2              1          0          1          0

dominiquec · 12-19-2023

I am working with patient hospitalization data, which includes a series of diagnosis code flags for each hospitalization. In our study design, patients have different exposures during different periods of time (exposure_type). I want to summarize the hospitalizations within each exposure type for each patient by creating an 'ever diagnosed flag' for each of the diagnosis codes in the study. This 'ever diagnosed flag' will indicate whether a patient ever had that diagnosis code associated with any of their visits during each exposure period.

This is my code so far:

    data want;
        set have;
        by patient_id exposure_type;
        array oldflags {20} diag1-diag20;
        array overallflg {20} everdiag1-everdiag20;
        if first.patient_id and first.exposure_type then do i = 1;
            overallflg[i] = 0;
        end;
        do i = 1 to 20;
            overallflg[i] = max(oldflags[i]);
        end;
        keep patient_id exposure_type everdiag1-everdiag20;
    run;

When I run the code above, the max value for each diagnosis flag isn't carrying over. everdiag1-everdiag20 are basically copies of the original flags, and the max value within each unique ID and exposure type is not retained. What am I missing? Thank you for your time.
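For readers following this thread, here is a minimal sketch of the PROC SUMMARY route mentioned in the later thank-you post. It is not the reply that was actually given, just one way to get the layout requested in the follow-up. As far as I can tell, the posted step copies the flags because the everdiag variables are not named in a RETAIN statement (so they are reset to missing on every iteration) and the loop only ever looks at the current row. The sketch assumes only four flags, as in the made-up example rows, and the dataset names `ever` and `want` are illustrative.

```sas
/* Example rows from the follow-up post, reduced to four flags */
data have;
    input patient_id exposure_type hosp_dt :mmddyy10. diag1-diag4;
    format hosp_dt mmddyy10.;
    datalines;
100100 1 01.01.1995 0 1 0 1
100100 1 06.15.1995 1 0 0 1
100100 2 10.01.1995 1 0 0 0
100100 2 12.15.1995 1 0 1 0
;
run;

/* One row per patient and exposure type, holding the max of each flag */
proc summary data=have nway;
    class patient_id exposure_type;
    var diag1-diag4;
    output out=ever(drop=_type_ _freq_) max=everdiag1-everdiag4;
run;

/* Merge the group-level flags back so every visit row carries them */
proc sort data=have;
    by patient_id exposure_type;
run;

data want;
    merge have(keep=patient_id exposure_type) ever;
    by patient_id exposure_type;
run;
```

If one row per patient and exposure type is enough, the `ever` dataset can be used directly and the final merge skipped. For the real data, the variable lists would become diag1-diag20 and everdiag1-everdiag20.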

dominiquec · 09-25-2023

Thank you for your help! When I try your code, the number of observations increases (not something I expected). For each patid that appears in 'allfills', the number of observations for that individual in 'nif2' is the number of obs in 'nif1' multiplied by the number of obs in 'allfills'. Here's a made-up example of what I'm trying to get:

nifgroup.nif1

    patid   fill_dt
    100100  08/05/2019
    100100  09/01/2019
    100100  09/30/2019
    100100  10/28/2019
    100100  11/29/2019

amlgroup.allfills

    patid   fill_dt
    100100  01/02/2019
    100100  03/30/2019
    100100  06/25/2019
    100100  07/18/2019

nifgroup.nif2

    patid   fill_dt     amlinwash
    100100  08/05/2019  1
    100100  09/01/2019  1
    100100  09/30/2019  1
    100100  10/28/2019  1
    100100  11/29/2019  0

For patid 100100, this person currently has 20 observations in nif2, with each fill_dt repeating 4 times. I need the individual to retain the original number of observations from 'nif1' when creating 'nif2', with only amlinwash added to the new dataset (as shown in the table). In my data, each fill_dt has a unique 'fill ID'; if I take the first unique value of the fill ID within each unique patid, I should get what I need.
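One way to avoid the row multiplication, sketched here under assumptions rather than taken from the thread: left join nif1 to allfills on patid and the 120-day window, then collapse back to one row per nif1 fill with GROUP BY and MAX(). The sketch uses WORK copies of the made-up tables and assumes that patid plus fill_dt identifies a fill; with the real data, the unique fill ID would be added to the SELECT and GROUP BY lists.

```sas
/* WORK copies of the made-up example tables */
data nif1;
    input patid fill_dt :mmddyy10.;
    format fill_dt mmddyy10.;
    datalines;
100100 08/05/2019
100100 09/01/2019
100100 09/30/2019
100100 10/28/2019
100100 11/29/2019
;
run;

data allfills;
    input patid fill_dt :mmddyy10.;
    format fill_dt mmddyy10.;
    datalines;
100100 01/02/2019
100100 03/30/2019
100100 06/25/2019
100100 07/18/2019
;
run;

proc sql;
    create table nif2 as
    select a.patid,
           a.fill_dt,
           /* 1 if any allfills row matched inside the window, else 0 */
           max(case when b.patid is not null then 1 else 0 end) as amlinwash
    from nif1 as a
    left join allfills as b
        on  a.patid = b.patid
        and b.fill_dt between a.fill_dt - 120 and a.fill_dt
    /* collapsing on the nif1 keys restores one row per fill */
    group by a.patid, a.fill_dt;
quit;
```

Because the join happens once instead of a subquery being evaluated per updated row, this form may also sidestep the long run time reported in the original question, though that depends on the size of the real tables.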

dominiquec · 09-22-2023

I'm having a hard time getting the code below to run at all. I am working with medication fill data, and I have two files which separately list all fills for two different medications. In the dataset 'nif2', I am trying to create a flag to indicate a washout period: within a unique ID, for each fill of medication #1 in 'nif2', I want to see if there is also a fill for medication #2 in 'allfills' in the 120 days preceding the fill in 'nif2'.

    data nifgroup.nif2;
        set nifgroup.nif1;
        amlinwash = 0;
    run;

    proc sql;
        update nifgroup.nif2 as a
        set amlinwash = 1
        where exists (select *
                      from amlgroup.allfills as b
                      where a.patid = b.patid
                        and b.fill_dt <= a.fill_dt
                        and b.fill_dt >= (a.fill_dt -120));
    quit;

When I run this code, it does not stop (even after an unreasonable amount of time), and I need to stop it myself. When I check the log, it shows that it gets to the 'proc sql' step and the first row in 'allfills' is read, but nothing happens after that. Any suggestions for why the code above is getting stuck are much appreciated.

dominiquec · 09-14-2023

Thank you! I was able to apply the code you suggested successfully.

dominiquec · 09-12-2023

SAS learner here! I have a dataset with start and stop dates of eligibility; unique IDs in the data can repeat, with multiple start (eligeff) and stop (eligend) dates of eligibility. Within a unique ID, I am trying to 1) collapse any overlapping date ranges and 2) stitch together any date ranges that are a maximum of 1 day apart (e.g. one row has a stop date of 12.31.2019 and the next row has a start date of 01.01.2020). Any eligibility periods 2 days or more apart should remain as they are (e.g. one row has a stop date of 12.31.2019 and the next start date is 01.02.2020).

Here is the code I've written so far:

    proc sort data=amlgroup.mbr_0;
        by patid eligeff eligend;
    run;

    data amlgroup.mbr_1;
        set amlgroup.mbr_0;
        by patid eligeff eligend;
        retain maxenr_dt minenr_dt;
        if first.patid then do;
            minenr_dt = eligeff;
            maxenr_dt = eligend;
        end;
        else do;
            if eligend <=maxenr_dt then delete;
            if . < eligeff - maxenr_dt <= 1 and maxenr_dt < eligend then maxenr_dt = eligend;
            if eligeff - maxenr_dt > 1 then do;
                minenr_dt = eligeff;
                maxenr_dt = eligend;
            end;
        end;
        format maxenr_dt minenr_dt yymmdd10.;
    run;

The problem I have is when I grab minenr_dt and maxenr_dt for the next phase of my project, as I only want the final, collapsed date ranges (see the example of the resulting dataset below). I need to drop rows where the date range is nested within a larger date range for the same ID (in the example, I need to drop the first row for ID 123456), but still be able to keep all non-overlapping date ranges (in the example, I don't want to drop either of the rows for ID 789100). Any suggestions for how to change my code? Thank you.

    ID      eligeff     eligend     minenr_dt   maxenr_dt
    123456  01.01.2015  09.30.2018  01.01.2015  09.30.2018
    123456  10.01.2018  03.25.2022  01.01.2015  03.25.2020
    789100  02.01.2017  12.31.2018  02.01.2017  12.31.2018
    789100  01.01.2020  06.30.2022  01.01.2020  06.30.2022
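A minimal sketch of one possible change, not the answer that was actually suggested in the thread: keep the same retained running range, but write a row only when a range is finished, meaning either a gap of 2 or more days starts a new range or the last row for the ID has been read. The sketch uses a WORK copy of the example rows and assumes there are no missing dates; the dataset names are illustrative.

```sas
/* WORK copy of the example eligibility rows */
data mbr_0;
    input patid eligeff :mmddyy10. eligend :mmddyy10.;
    format eligeff eligend yymmdd10.;
    datalines;
123456 01.01.2015 09.30.2018
123456 10.01.2018 03.25.2022
789100 02.01.2017 12.31.2018
789100 01.01.2020 06.30.2022
;
run;

proc sort data=mbr_0;
    by patid eligeff eligend;
run;

data mbr_1(keep=patid minenr_dt maxenr_dt);
    set mbr_0;
    by patid;
    retain minenr_dt maxenr_dt;
    format minenr_dt maxenr_dt yymmdd10.;
    if first.patid then do;
        /* start the first running range for this ID */
        minenr_dt = eligeff;
        maxenr_dt = eligend;
    end;
    else if eligeff - maxenr_dt > 1 then do;
        /* gap of 2+ days: write the finished range, then start a new one */
        output;
        minenr_dt = eligeff;
        maxenr_dt = eligend;
    end;
    else do;
        /* overlapping, nested, or adjacent: extend the running range */
        maxenr_dt = max(maxenr_dt, eligend);
    end;
    /* the last row for the ID closes whatever range is still open */
    if last.patid then output;
run;
```

With the four example rows this should leave a single range for ID 123456 (2015-01-01 to 2022-03-25) and two separate ranges for ID 789100. Because an explicit OUTPUT statement is used, nothing is written for intermediate rows, so no later de-duplication step is needed.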


Re: Using an array to retain max value by ID

Re: Using an array to retain max value by ID

Using an array to retain max value by ID

Re: Proc sql update not working: updating a dataset based on dates in ...

Proc sql update not working: updating a dataset based on dates in a se...

Re: How can I retain unique date ranges only after collapsing any over...

How can I retain unique date ranges only after collapsing any overlapp...