I have the following dataset,
Patient ID | Year | Admission Number | Problem |
1 | 2010 | 1 | 1 |
1 | 2010 | 2 | 1 |
1 | 2010 | 3 | 1 |
1 | 2011 | 1 | 1 |
2 | 2010 | 1 | 1 |
2 | 2010 | 2 | 2 |
2 | 2010 | 3 | 2 |
2 | 2011 | 1 | 2 |
I want to create a variable called “Keep” based on these data. Based on groupings by Patient ID and Year, this variable should do as follows,
The resulting database should look as follows,
Patient ID | Year | Admission Number | Problem | Keep |
1 | 2010 | 1 | 1 | 1 |
1 | 2010 | 2 | 1 | 0 |
1 | 2010 | 3 | 1 | 0 |
1 | 2011 | 1 | 1 | 1 |
2 | 2010 | 1 | 1 | 0 |
2 | 2010 | 2 | 2 | 1 |
2 | 2010 | 3 | 2 | 0 |
2 | 2011 | 1 | 2 | 1 |
Any help would be much appreciated.
Thank you.
Yes, with a minor addition,but not if they're interspersed (e.g., 2 2 1 2 2). So, yes, the following would come out as desired:
data have; input Patient_ID Year Admission_Number Problem; cards; 1 2010 1 1 1 2010 2 1 1 2010 3 1 1 2011 1 1 2 2010 1 1 2 2010 2 2 2 2010 3 2 2 2011 1 2 3 2010 1 2 3 2010 2 1 3 2010 3 1 3 2011 1 2 ; data need; set have; recnum=_n_; run; proc sql noprint; create table want as select *, min(problem) as min, max(problem) as max from need group by Patient_ID,Year order by recnum ; quit; data want (drop=min max recnum); set want; by patient_ID year problem notsorted; if min eq max and first.problem then keep=1; else if min ne max and problem eq 2 and first.problem then keep=1; else keep=0; run;
Art, CEO, AnalystFinder.com
If a patient's problem 2 always comes by itself, or after a 1, and never reverts back and forth between 1 and 2 within a given year, you could use something like:
data need; set have; recnum=_n_; run; proc sql noprint; create table want as select *, min(problem) as min, max(problem) as max from need group by Patient_ID,Year order by recnum ; quit; data want (drop=min max recnum); set want; by patient_ID year problem; if min eq max and first.problem then keep=1; else if min ne max and problem eq 2 and first.problem then keep=1; else keep=0; run;
Art, CEO, AnalystFinder.com
Hi ART297,
Thank you for your response.
There is no set order for the patient's problems, meaning 2 could come before 1 in a given year. Would the code you have provided not work in that situation?
Yes, with a minor addition,but not if they're interspersed (e.g., 2 2 1 2 2). So, yes, the following would come out as desired:
data have; input Patient_ID Year Admission_Number Problem; cards; 1 2010 1 1 1 2010 2 1 1 2010 3 1 1 2011 1 1 2 2010 1 1 2 2010 2 2 2 2010 3 2 2 2011 1 2 3 2010 1 2 3 2010 2 1 3 2010 3 1 3 2011 1 2 ; data need; set have; recnum=_n_; run; proc sql noprint; create table want as select *, min(problem) as min, max(problem) as max from need group by Patient_ID,Year order by recnum ; quit; data want (drop=min max recnum); set want; by patient_ID year problem notsorted; if min eq max and first.problem then keep=1; else if min ne max and problem eq 2 and first.problem then keep=1; else keep=0; run;
Art, CEO, AnalystFinder.com
I think this is what you are asking for:
data want;
do until (last.year);
set have;
by patient year;
max_problem = max(max_problem, problem);
end;
do until (last.year);
set have;
by patient year;
if max_problem = problem and found=. then do;
keep=1;
found=1;
end;
else keep=0;
output;
end;
drop max_problem;
run;
Thank you art297 and Astounding for helping me find a solution to this problem!
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.