Hi All,
I have two datasets, DB1 and DB2. DB1 has 3 columns Patid, Mindate and Maxdate with about 1000 rows while DB2 has 150 variables including date variables ADMDATE and Dscdate with about 232000 rows.
I would like to have a dataset DB3 which would have the data from DB2 with admdate and dscdate within the range of Mindate and Maxdate of DB1.
Sample data would be something like this
DB1
patid mindate maxdate
1 3/3/2014 12/3/2014
2 3/1/2017 9/3/2017
DB2
Patid ADMDATE dscdate
1 1/1/2014 1/3/2014
1 2/1/2014 2/3/2014
1 3/1/2014 3/3/2014
1 4/1/2014 4/3/2014
1 5/1/2014 5/3/2014
1 6/1/2014 6/3/2014
1 7/1/2014 7/3/2014
1 8/1/2014 8/3/2014
1 9/1/2014 9/3/2014
1 10/1/2014 10/3/2014
1 11/1/2014 11/3/2014
1 12/1/2014 12/3/2014
1 1/1/2015 1/3/2015
1 2/1/2015 2/3/2015
1 3/1/2016 3/3/2016
1 4/1/2016 4/3/2016
2 3/1/2017 3/3/2017
2 4/1/2017 4/3/2017
2 5/1/2017 5/3/2017
2 6/1/2017 6/3/2017
2 7/1/2017 7/3/2017
2 8/1/2017 8/3/2017
2 9/1/2017 9/3/2017
2 10/1/2017 10/3/2017
data DB3 (Want)
1 4/1/2014 4/3/2014
1 5/1/2014 5/3/2014
1 6/1/2014 6/3/2014
1 7/1/2014 7/3/2014
1 8/1/2014 8/3/2014
1 9/1/2014 9/3/2014
1 10/1/2014 10/3/2014
1 11/1/2014 11/3/2014
1 12/1/2014 12/3/2014
2 3/1/2017 3/3/2017
2 4/1/2017 4/3/2017
2 5/1/2017 5/3/2017
2 6/1/2017 6/3/2017
2 7/1/2017 7/3/2017
2 8/1/2017 8/3/2017
2 9/1/2017 9/3/2017
Any help would be highly appreciated
You should be a bit more explicit about what the meaning of "admdate and dscdate within the range of Mindate and Maxdate" is.
There are several considerations that phrase leaves as ambiguous.
What if Admdate is between Mindate and Maxdate but dscdate is not?
Or if Dscdate is between Mindate and Maxdate but Admdate is not?
Do either of these conditions get included or are you only looking for
Mindate <= admdate <= dscdate <= Maxdate?
What I want is something like this
Dataset1 has Mindate and Maxdate ( The earliest and the last date that a patient had an encounter- Out patient consultation)
Dataset2- has admission and discharge record of patients. I would like to extract all records of patients that lie between Minimum and Maximum dates of Dataset1.
For instance, If a patient has a minimum date of 1st Jan 2016 and maximum date of 05th May 2018 in dataset 1, then in dataset2 if the first admission was on 11th May 2014 and last admission 24 December 2019, I want to pull all records of admission, discharge(from dataset2) that are within the range of 1st Jan 2016 to 05th May 2018
So I believe it is
Mindate <= ADMDATE- <=Maxdate
Not sure what you are asking for but to check for overlapping intervals use this test:
/* Interval (A,B) overlaps Interval (C,D) when A<=D and B>=C Examples of Overlap: A<-->B A<-->B A<---->B A<-->B C<-->D C<-->D C<-->D C<---->D Examples of No Overlap: A<-->B A<-->B C<-->D C<-->D To account for missing values use: .Z<A<=D and B>=C>.Z */
Are these DMY or MDY dates? Supplying dates in ISO-standardized format (YYYY-MM-DD) will prevent such ambiguities.
See this:
data db1;
input patid $ mindate :mmddyy10. maxdate :mmddyy10.;
format mindate maxdate yymmdd10.;
datalines;
1 3/3/2014 12/3/2014
2 3/1/2017 9/3/2017
;
data db2;
input patid $ admdate :mmddyy10. dscdate :mmddyy10.;
format admdate dscdate yymmdd10.;
datalines;
1 1/1/2014 1/3/2014
1 2/1/2014 2/3/2014
1 3/1/2014 3/3/2014
1 4/1/2014 4/3/2014
1 5/1/2014 5/3/2014
1 6/1/2014 6/3/2014
1 7/1/2014 7/3/2014
1 8/1/2014 8/3/2014
1 9/1/2014 9/3/2014
1 10/1/2014 10/3/2014
1 11/1/2014 11/3/2014
1 12/1/2014 12/3/2014
1 1/1/2015 1/3/2015
1 2/1/2015 2/3/2015
1 3/1/2016 3/3/2016
1 4/1/2016 4/3/2016
2 3/1/2017 3/3/2017
2 4/1/2017 4/3/2017
2 5/1/2017 5/3/2017
2 6/1/2017 6/3/2017
2 7/1/2017 7/3/2017
2 8/1/2017 8/3/2017
2 9/1/2017 9/3/2017
2 10/1/2017 10/3/2017
;
data db3;
set db2;
if _N_ = 1
then do;
length mindate maxdate 8;
declare hash db1 (dataset:"db1");
db1.definekey("patid");
db1.definedata("mindate","maxdate");
db1.definedone();
call missing(mindate,maxdate);
end;
if
db1.find() = 0 and
mindate le admdate le maxdate and
mindate le dscdate le maxdate
;
drop mindate maxdate;
run;
Note how I presented the data as data steps with datalines; this removes any doubts about attributes and contents.
Please post your source data like this in the future; it's not rocket science, and it greatly helps in finding a valid answer.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.