Solved: Re: finding values happened 30 days before a reference date and 30 day...

lillymaginta · Posted 02-13-2016 08:57 PM

id     d_date        rx_name       rx_date                   h_date
1      3/2/2005          a                    3/4/2005                 2/29/2005
1                        b                    4/4/2005
1                        c                    2/25/2005
1                        d 
1                        e
1
1
1
1
2
2
2
2
3
3
3

I have the following data where h_date is hospitalization date and d_date is discharge date. Each rx_name happen in the corresponding rx_Date. I am trying to find rx_name that happened within 30 day before h_date. Also, I want seperately to find rx_name that happaned within 30 days after d_date.

Thanks.

Reeza · Posted 02-13-2016 10:59 PM

Your requirements aren't consistent between your first post, last post and your output.

That being said, this is how I would do it. First I'd create separate tables for my discharge and hospitalization data. Then I'd use SQL to do the join conditons. This method is more effectie when you have multiple hospitalizations and discharge dates per ID. You haven't shown that, but that's the usual structure for this type of data. I'll leave the data separation as an exercise for you.

The sample code below finds a drugs with a date 30 days before hospitalization, in a table called PRE30

and any drugs 30 days after discharge, called POST30. You may want to verify the conditons are what you want, I didn't understand your requirements. You can modify the WHERE condition to match your needs, and I would consider changing the LT and GT to LE and GE, but again, that depends on your requirements.

data hosp;
informat h_date mmddyy10.;
format h_date date9.;
input id $ h_date;
cards;
1 2/28/2005
;
run;

data disch;
informat d_date mmddyy10.;
format d_date date9.;
input id $ d_date;
cards;
1 3/2/2005
;
run;

data drugs;
informat rx_date mmddyy10.;
format rx_date date9.;
input id $ rx $ rx_date;
cards;
1 A 3/4/2005
1 B 4/4/2005
1 D 2/25/2005
;
run;

proc sql;
create table pre30 as
select a.Id, b.rx, b.rx_date, a.h_date, a.h_date-b.rx_date as days_before_hosp
from hosp as a
join drugs as b
on a.id=b.id
where 0< (a.h_date-b.rx_date) < 30;
quit;

proc sql;
create table post30 as
select a.Id, b.rx, b.rx_date, a.d_date, b.rx_date-a.d_date as days_after_disch
from disch as a
join drugs as b
on a.id=b.id
where 0< (b.rx_date-a.d_date) < 30;
quit;

View solution in original post

Reeza · Posted 02-13-2016 10:15 PM

Please post a sample of your desired output.

lillymaginta · Posted 02-13-2016 10:38 PM

I want two seperate datasets;

dataset A, rx_name within 30 days before d_date

id rx_name

1 c

dataset b, rx_name within 30 days post :

id rx_name

1 a

Reeza · Posted 02-13-2016 10:59 PM

Your requirements aren't consistent between your first post, last post and your output.

That being said, this is how I would do it. First I'd create separate tables for my discharge and hospitalization data. Then I'd use SQL to do the join conditons. This method is more effectie when you have multiple hospitalizations and discharge dates per ID. You haven't shown that, but that's the usual structure for this type of data. I'll leave the data separation as an exercise for you.

The sample code below finds a drugs with a date 30 days before hospitalization, in a table called PRE30

and any drugs 30 days after discharge, called POST30. You may want to verify the conditons are what you want, I didn't understand your requirements. You can modify the WHERE condition to match your needs, and I would consider changing the LT and GT to LE and GE, but again, that depends on your requirements.

data hosp;
informat h_date mmddyy10.;
format h_date date9.;
input id $ h_date;
cards;
1 2/28/2005
;
run;

data disch;
informat d_date mmddyy10.;
format d_date date9.;
input id $ d_date;
cards;
1 3/2/2005
;
run;

data drugs;
informat rx_date mmddyy10.;
format rx_date date9.;
input id $ rx $ rx_date;
cards;
1 A 3/4/2005
1 B 4/4/2005
1 D 2/25/2005
;
run;

proc sql;
create table pre30 as
select a.Id, b.rx, b.rx_date, a.h_date, a.h_date-b.rx_date as days_before_hosp
from hosp as a
join drugs as b
on a.id=b.id
where 0< (a.h_date-b.rx_date) < 30;
quit;

proc sql;
create table post30 as
select a.Id, b.rx, b.rx_date, a.d_date, b.rx_date-a.d_date as days_after_disch
from disch as a
join drugs as b
on a.id=b.id
where 0< (b.rx_date-a.d_date) < 30;
quit;

lillymaginta · Posted 02-14-2016 10:14 AM

Thanks Reeza, this is exactly what I wanted which is rx during the 30 days pre and rx within the 30 days post. I tried it and it works perfect! Lilly

lillymaginta · Posted 02-14-2016 10:47 AM

One issue with the code, it produces duplicates of the same id-rx combination for the same rx_date;

id rx_name rx_Date

1 a 05/01/2004

I tried to proc sort nodupkey by id, rx_name after the code you provided but is there a way to avoid this from the first place? thanks!

lillymaginta · Posted 02-14-2016 11:35 AM

The first part of the code works but the second part does not work, return 0 observations:

proc sql;
create table post30 as
select a.Id, b.rx, b.rx_date, a.d_date, b.rx_date-a.d_date as days_after_disch
from disch as a
join drugs as b
on a.id=b.id
where 0< (b.rx_date-a.d_date) < 30;
quit;

FreelanceReinh · Posted 02-16-2016 06:16 PM

@lillymaginta wrote:

One issue with the code, it produces duplicates of the same id-rx combination for the same rx_date;

Hi @lillymaginta,

I think, such duplicates can occur if the drug intake happened within the (overlapping) 30-days time windows of two or more hospitalizations. Are the three corresponding H_DATE (or D_DATE) values in your example all the same when you run Reeza's code? Probably not.

If you don't select H_DATE, D_DATE etc. and you don't need the duplicates, you could use the keyword DISTINCT:

select distinct a.Id, b.rx, b.rx_date

lillymaginta · Posted 02-16-2016 06:46 PM

Thank you for the suggesion Freelance. When I run the code the first time, I realized there was an error in when I processed my subset (millions of records). I have one h_date per patient which mean in the 30 days period, there should be only one rx per rx_date. After I corrected the error, I did not get the duplicates in the second time but thank you for the followup.

Lilly

Reeza · Posted 02-14-2016 12:12 PM

Post better sample data - make sure it replicates the issue and I can take a look. Given the sample data the code works.

lillymaginta · Posted 02-14-2016 03:34 PM

I just relaized the data was truncated post h_date, the second code worked. Thank you

finding values happened 30 days before a reference date and 30 days after another

Re: finding values happened 30 days before a reference date and 30 days after another

Re: finding values happened 30 days before a reference date and 30 days after another

Re: finding values happened 30 days before a reference date and 30 days after another

Re: finding values happened 30 days before a reference date and 30 days after another

Re: finding values happened 30 days before a reference date and 30 days after another

Re: finding values happened 30 days before a reference date and 30 days after another

Re: finding values happened 30 days before a reference date and 30 days after another

Re: finding values happened 30 days before a reference date and 30 days after another

Re: finding values happened 30 days before a reference date and 30 days after another

Re: finding values happened 30 days before a reference date and 30 days after another

Re: finding values happened 30 days before a reference date and 30 days after another

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away