Proc sql update not working: updating a dataset based on dates in a se...

dominiquec · Posted 09-22-2023 01:12 PM

I'm having a hard time getting the code below to run at all. I am working with medication fill data, and I have two files which separately list all fills for two different medications. In the dataset 'nif2', I am trying to create a flag to indicate a washout period: within a unique ID, for each fill of medication #1 in 'nif2', I want to see if there is also a fill for medication #2 in 'allfills' in the 120 days preceding the fill in 'nif2'.

data nifgroup.nif2;
set nifgroup.nif1;
amlinwash = 0;
run;

proc sql;
update nifgroup.nif2 as a
set amlinwash = 1
where exists 
(select * from amlgroup.allfills as b
where a.patid = b.patid and b.fill_dt <= a.fill_dt and b.fill_dt >= (a.fill_dt -120));
quit;

When I run this code, it does not stop (even after an unreasonable amount of time). I need to stop the code myself. When I check the log, it shows that it gets to the 'proc sql' step and the first row in 'allfills' is read, but nothing has happened after that. Any suggestions for why the code above is getting stuck is much appreciated.

Tom · Posted 09-22-2023 02:43 PM

Don't use UPDATE.

Seems like a very strange strategy to go to the trouble to make a dataset that you then immediately turn around and try to modify. Just make the dataset you want directly.

proc sql;
create table nifgroup.nif2 as 
  select a.*
       , max(b.fill_dt between a.fill_dt - 120 and a.fill_dt) as amlinwash
  from nifgroup.nif1 a
  left join amlgroup.allfills b
    on a.patid = b.patid
  group by a.patid, a.fill_dt
;
quit;

dominiquec · Posted 09-25-2023 01:11 PM

Thank you for your help! When I try your code, the number of observations increased (not something I expected). What happened was that for each patid that appears in 'allfills', the number of observations for that individual in 'nif2' is the number of obs in 'nif1' multiplied by the number of obs in 'allfills'.

Here's a made-up example of what I'm trying to get:

nifgroup.nif1

patid	fill_dt
100100	08/05/2019
100100	09/01/2019
100100	09/30/2019
100100	10/28/2019
100100	11/29/2019

amlgroup.allfills

patid	fill_dt
100100	01/02/2019
100100	03/30/2019
100100	06/25/2019
100100	07/18/2019

nifgroup.nif2

patid	fill_dt	amlinwash
100100	08/05/2019	1
100100	09/01/2019	1
100100	09/30/2019	1
100100	10/28/2019	1
100100	11/29/2019	0

For patid 100100, this person currently has 20 observations in nif2, with each fill_dt repeating 4 times. I need the individual to retain the original number of observations from 'nif1' when creating 'nif2', with only the creation of amlinwash added to the new dataset (as shown in table).

In my data, each fill_dt has a unique 'fill ID'; if I take the first unique value of the fill ID within each unique patid, I should get what I need.

Proc sql update not working: updating a dataset based on dates in a second dataset

Re: Proc sql update not working: updating a dataset based on dates in a second dataset

Re: Proc sql update not working: updating a dataset based on dates in a second dataset

Proc sql update not working: updating a dataset based on dates in a second dataset

Re: Proc sql update not working: updating a dataset based on dates in a second dataset

Re: Proc sql update not working: updating a dataset based on dates in a second dataset

Registration is open

SAS Training: Just a Click Away