I have the following dataset,
Patient ID | Year | Admission Number | Problem |
1 | 2010 | 1 | 1 |
1 | 2010 | 2 | 1 |
1 | 2010 | 3 | 1 |
1 | 2011 | 1 | 1 |
2 | 2010 | 1 | 1 |
2 | 2010 | 2 | 2 |
2 | 2010 | 3 | 2 |
2 | 2011 | 1 | 2 |
I want to create a variable called “Keep” based on these data. Based on groupings by Patient ID and Year, this variable should do as follows,
The resulting database should look as follows,
Patient ID | Year | Admission Number | Problem | Keep |
1 | 2010 | 1 | 1 | 1 |
1 | 2010 | 2 | 1 | 0 |
1 | 2010 | 3 | 1 | 0 |
1 | 2011 | 1 | 1 | 1 |
2 | 2010 | 1 | 1 | 0 |
2 | 2010 | 2 | 2 | 1 |
2 | 2010 | 3 | 2 | 0 |
2 | 2011 | 1 | 2 | 1 |
Any help would be much appreciated.
Thank you.
Yes, with a minor addition,but not if they're interspersed (e.g., 2 2 1 2 2). So, yes, the following would come out as desired:
data have;
input Patient_ID Year Admission_Number Problem;
cards;
1 2010 1 1
1 2010 2 1
1 2010 3 1
1 2011 1 1
2 2010 1 1
2 2010 2 2
2 2010 3 2
2 2011 1 2
3 2010 1 2
3 2010 2 1
3 2010 3 1
3 2011 1 2
;
data need;
set have;
recnum=_n_;
run;
proc sql noprint;
create table want as
select *, min(problem) as min, max(problem) as max
from need
group by Patient_ID,Year
order by recnum
;
quit;
data want (drop=min max recnum);
set want;
by patient_ID year problem notsorted;
if min eq max and first.problem then keep=1;
else if min ne max and problem eq 2 and first.problem then keep=1;
else keep=0;
run;
Art, CEO, AnalystFinder.com
If a patient's problem 2 always comes by itself, or after a 1, and never reverts back and forth between 1 and 2 within a given year, you could use something like:
data need;
set have;
recnum=_n_;
run;
proc sql noprint;
create table want as
select *, min(problem) as min, max(problem) as max
from need
group by Patient_ID,Year
order by recnum
;
quit;
data want (drop=min max recnum);
set want;
by patient_ID year problem;
if min eq max and first.problem then keep=1;
else if min ne max and problem eq 2 and first.problem then keep=1;
else keep=0;
run;
Art, CEO, AnalystFinder.com
Hi ART297,
Thank you for your response.
There is no set order for the patient's problems, meaning 2 could come before 1 in a given year. Would the code you have provided not work in that situation?
Yes, with a minor addition,but not if they're interspersed (e.g., 2 2 1 2 2). So, yes, the following would come out as desired:
data have;
input Patient_ID Year Admission_Number Problem;
cards;
1 2010 1 1
1 2010 2 1
1 2010 3 1
1 2011 1 1
2 2010 1 1
2 2010 2 2
2 2010 3 2
2 2011 1 2
3 2010 1 2
3 2010 2 1
3 2010 3 1
3 2011 1 2
;
data need;
set have;
recnum=_n_;
run;
proc sql noprint;
create table want as
select *, min(problem) as min, max(problem) as max
from need
group by Patient_ID,Year
order by recnum
;
quit;
data want (drop=min max recnum);
set want;
by patient_ID year problem notsorted;
if min eq max and first.problem then keep=1;
else if min ne max and problem eq 2 and first.problem then keep=1;
else keep=0;
run;
Art, CEO, AnalystFinder.com
I think this is what you are asking for:
data want;
do until (last.year);
set have;
by patient year;
max_problem = max(max_problem, problem);
end;
do until (last.year);
set have;
by patient year;
if max_problem = problem and found=. then do;
keep=1;
found=1;
end;
else keep=0;
output;
end;
drop max_problem;
run;
Thank you art297 and Astounding for helping me find a solution to this problem!
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.