There are two groups of medications. Group 1 medications are Medication A, B and C and Group 2 consists of Medication XX, XY, YY and YZ.
I want to flag those group of observations (within one unique ID i.e. for each patient) for which Group 1 medications’ Start_date falls between Group 2 medications’ Start_date and End_date. If these criteria meet, then flag 1(in the Group 1 medication observation).
In other words: I want to flag those group of observations (within one unique ID i.e. each patient) for which Group 2 medications’ Start_date occurred before Group 1 medications’ Start_date and Group 2 medications End_date occurred after the Group 1 medications’ Start_date. If these criteria meet, then flag 1(in the Group 1 medication observation).
There may be some other medications as well but we need not to consider those drugs.
Data I have
ID |
Med |
Start_date |
End_date |
1 |
A |
04/25/2015 |
11/20/2015 |
1 |
B |
03/03/2017 |
05/10/2017 |
1 |
A |
03/02/2019 |
04/02/2019 |
1 |
A |
10/02/2019 |
11/20/2019 |
1 |
XX |
03/10/2015 |
10/20/2015 |
1 |
XY |
10/20/2018 |
11/21/2018 |
1 |
FF |
12/02/2017 |
12/15/2017 |
2 |
B |
03/03/2016 |
05/03/2016 |
2 |
C |
06/05/2016 |
08/04/2016 |
2 |
C |
09/10/2016 |
10/11/2016 |
2 |
YY |
04/10/2016 |
05/19/2016 |
2 |
GG |
07/11/2016 |
07/11/2016 |
2 |
YZ |
03/01/2016 |
05/13/2016 |
2 |
YZ |
07/10/2016 |
10/10/2016 |
3 |
YY |
12/11/2017 |
12/17/2017 |
3 |
YZ |
01/22/2018 |
07/12/2018 |
3 |
A |
02/02/2019 |
05/20/2020 |
Data I want:
ID |
Med |
Start_date |
End_date |
Flag |
1 |
A |
04/25/2015 |
11/20/2015 |
1 |
1 |
B |
03/03/2017 |
05/10/2017 |
|
1 |
A |
03/02/2019 |
04/02/2019 |
|
1 |
A |
10/02/2019 |
11/20/2019 |
|
1 |
XX |
03/10/2015 |
10/20/2015 |
|
1 |
XY |
10/20/2018 |
11/21/2018 |
|
1 |
FF |
12/02/2017 |
12/15/2017 |
|
2 |
B |
03/03/2016 |
05/03/2016 |
1 |
2 |
C |
06/05/2016 |
08/04/2016 |
|
2 |
C |
09/10/2016 |
10/11/2016 |
1 |
2 |
YY |
04/10/2016 |
05/19/2016 |
|
2 |
GG |
07/11/2016 |
07/11/2016 |
|
2 |
YZ |
03/01/2016 |
05/13/2016 |
|
2 |
YZ |
07/10/2016 |
10/10/2016 |
|
3 |
YY |
12/11/2017 |
12/17/2017 |
|
3 |
YZ |
01/22/2018 |
07/12/2018 |
|
3 |
A |
02/02/2019 |
05/20/2020 |
|
Thank you very much for your support.
So if a group1 start date is after some group2 start date and the same group1 end date is before some group2 end date, we should flag that obs, correct? Does it have to apply to the same medication or can it be a start date for say XX and end date for say YY?
@SR11 wrote:
There are two groups of medications. Group 1 medications are Medication A, B and C and Group 2 consists of Medication XX, XY, YY and YZ.
I want to flag those group of observations (within one unique ID i.e. for each patient) for which Group 1 medications’ Start_date falls between Group 2 medications’ Start_date and End_date. If these criteria meet, then flag 1(in the Group 1 medication observation).
In other words: I want to flag those group of observations (within one unique ID i.e. each patient) for which Group 2 medications’ Start_date occurred before Group 1 medications’ Start_date and Group 2 medications End_date occurred after the Group 1 medications’ Start_date. If these criteria meet, then flag 1(in the Group 1 medication observation).
There may be some other medications as well but we need not to consider those drugs.
Data I have
ID
Med
Start_date
End_date
1
A
04/25/2015
11/20/2015
1
B
03/03/2017
05/10/2017
1
A
03/02/2019
04/02/2019
1
A
10/02/2019
11/20/2019
1
XX
03/10/2015
10/20/2015
1
XY
10/20/2018
11/21/2018
1
FF
12/02/2017
15/02/2017
2
B
03/03/2016
05/03/2016
2
C
06/05/2016
08/04/2016
2
C
09/10/2016
10/11/2016
2
YY
04/10/2016
05/19/2016
2
GG
07/11/2016
07/11/2016
2
YZ
03/01/2016
05/13/2016
2
YZ
07/10/2016
10/10/2016
3
YY
12/11/2017
12/17/2017
3
YZ
01/22/2018
07/12/2018
3
A
02/02/2019
05/20/2020
Data I want:
ID
Med
Start_date
End_date
Flag
1
A
04/25/2015
11/20/2015
1
1
B
03/03/2017
05/10/2017
1
A
03/02/2019
04/02/2019
1
A
10/02/2019
11/20/2019
1
XX
03/10/2015
10/20/2015
1
XY
10/20/2018
11/21/2018
1
FF
12/02/2017
15/02/2017
2
B
03/03/2016
05/03/2016
1
2
C
06/05/2016
08/04/2016
2
C
09/10/2016
10/11/2016
1
2
YY
04/10/2016
05/19/2016
2
GG
07/11/2016
07/11/2016
2
YZ
03/01/2016
05/13/2016
2
YZ
07/10/2016
10/10/2016
3
YY
12/11/2017
12/17/2017
3
YZ
01/22/2018
07/12/2018
3
A
02/02/2019
05/20/2020
Thank you very much for your support.
What's going on with the bolded rows above? The end date is a different format when compared to the rest.
The basic questions:
Programming is not that difficult but it's important to be solving the actual problem without making assumptions.
@SR11 wrote (bold italics mine)
@SR11 wrote:
Hi Astounding,
1. In my real dataset, yes the dates were formatted to dates(MMDDYY10.). The dataset in the post was just made in word doc.
2. The question is very logical. Thank you for the question. Yes 'falls between' includes Group 1 or 2 start and end dates. i.e. if the Group 1 start date is equal to Group 2 start or end date then Flag the observation.
"Group 1 or 2 start and end dates" suggests to me that you are look for an overlap or any group 1 date with any group 2 date. But your original post and the following phrase suggests you care about whether a group 1 start_date only overlaps with a group 2 date range.
So the program below assumes your earliest description - i.e. look only for the intersection of group 1 start_date with any group 2 range.
The program passes through each ID twice - the first pass reads only group2 records and updates an array of dates (group2_dates) with a dummy set to one for each group 2 date. The second pass re-reads the ID again (this time all records), and compares the start_date with the group2_dates array.
data want (drop=_:);
set have (where=(not (med in ('A','B','C'))) in=firstpass)
have (in=secondpass);
by id ;
array group2_dates {%sysevalf("01jan2015"d):%sysevalf("31dec2020"d)} _temporary_;
if first.id then call missing(of group2_dates{*});
if firstpass then do _d=start_date to end_date;
group2_dates{_d}=1;
end;
if secondpass;
if med in ('A','B','C') and group2_dates{start_date}=1 then flag=1;
run;
I offer no program test, as the data is not provide in the form of a working data step, which presumably would make a note about the apparently invalid date value for end_date (=15/02/2017) in the 7th data record.
This program assumes the data are sorted by ID (but can be any order within a given ID).
The primary feature here is the array. Instead of the array being indexed from (say) 1 to 1000, it is indexed on the underlying values of "01jan2015"d (20089) through "31dec2020"d (22280). Of course, you can change the lower bound and upper bound dates to fit your data.
The other point is that the array is _temporary_, meaning its values are retained from observation to over all records for a given id. So the firstpass populates the array, which is still available for the secondpass.
@mkeintz . this is at least as good as what I would have done ... creating an informat based on the Group 2 observations that concatenates ID + date as the input to the informat. Nice!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.