Hi,
I have the following dataset,
ID | Admission | Start_Year | End_Year |
|
|
|
|
1 | 1 | 10/11 | 11/12 |
1 | 2 | 10/11 | 12/13 |
1 | 3 | 12/13 | 15/16 |
|
|
|
|
2 | 1 | 10/11 | 11/12 |
2 | 2 | 10/11 | 12/13 |
2 | 3 | 10/11 | 15/16 |
|
|
|
|
3 | 1 | 10/11 | 12/13 |
3 | 2 | 14/15 | 15/16 |
All admissions have a start date and end date in fiscal years (i.e. from April 01st to March 31st). All admissions start in one fiscal year and end in another fiscal year.
I want to include the most encompassing admissions for each individual i.e. the admissions with the most N fiscal years. I also want to include admissions to ensure there is representation from each fiscal year for each individual, even though they may not be the most encompassing admissions.
For example, in the case of ID 3, the most encompassing admission is admission 1. As such, it should be included. However, I also want to include admission 2 because it includes fiscal years 14/15 and 15/16, which are not represented by admission 1.
The desired output should be as follows,
ID | Admission | Start_Year | End_Year | Include |
|
|
|
|
|
1 | 1 | 10/11 | 11/12 | 0 |
1 | 2 | 10/11 | 12/13 | 1 |
1 | 3 | 12/13 | 15/16 | 1 |
|
|
|
|
|
2 | 1 | 10/11 | 11/12 | 0 |
2 | 2 | 10/11 | 12/13 | 0 |
2 | 3 | 10/11 | 15/16 | 1 |
|
|
|
|
|
3 | 1 | 10/11 | 12/13 | 1 |
3 | 2 | 14/15 | 15/16 | 1 |
Any help would be much appreciated.
Thank you!
Try this:
data have;
input ID Adm Start End;
datalines;
1 1 10 11
1 2 10 12
1 3 12 15
2 1 10 11
2 2 10 12
2 3 10 15
3 1 10 12
3 2 14 15
;
proc sql;
create table want as
select
ID,
Adm,
Start,
end,
not exists (
select ID
from have
where ID = a.ID and
Adm ne a.Adm and
Start <= a.start and
End >= a.end ) as include
from have as a;
select * from want;
quit;
Hi PGStats,
Thank you for your response.
The code is implementing the desired logic. Can you please explain the steps in the code, so I can get an idea about the implementation for the full dataset?
Thank you once again.
The query sets include = 1 (TRUE) for periods (start-end) that do not have an enclosing period for the same ID. An enclosing period is one that starts before or in the same year and that ends after or on the same year as the period considered.
The query sets include = 0 (FALSE) for periods that do have an enclosing period for the same ID.
Hi PGStats,
Thank you for your response.
I have a larger dataset with other variables as well. I want to execute this logic for only a subset of the data based on other conditions (e.g. males, adults etc.). I tried including an additional “where” statement at the end to specify the additional conditions, but the code does not execute properly. Can you please suggest how to incorporate additional conditions?
Thank you.
****
Example:
proc sql;
create table want as
select
ID,
Adm,
Start,
end,
not exists (
select ID
from have
where ID = a.ID and
Adm ne a.Adm and
Start <= a.start and
End >= a.end ) as include
from have as a;
where (variableA=x) and (variableB=y) and (variable=z)
quit;
Always be specific what's not working i.e. by posting the relevant SAS Log message.
Given the code you've posted: Remove the semicolon after from have as a;
...
from have as a
where (variableA=x) and (variableB=y) and (variable=z)
;
quit;
Hi Patrick,
I just caught a mistake in the earlier post. I had actually executed the code without the semi colon.
There is no error in the log, but the code is not executing properly. It is erroneously marking observations as 1, when they should 0. In the updated code, observations that do not meet the new “where” clause should be denoted as 0.
Thank you.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.