I have the following dataset,
Patient ID | Year | Admission Number | Problem |
1 | 2010 | 1 | 1 |
1 | 2010 | 2 | 1 |
1 | 2010 | 3 | 1 |
1 | 2011 | 1 | 1 |
2 | 2010 | 1 | 1 |
2 | 2010 | 2 | 2 |
2 | 2010 | 3 | 2 |
2 | 2011 | 1 | 2 |
I want to create a variable called “Keep” based on these data. Based on groupings by Patient ID and Year, this variable should do as follows,
The resulting database should look as follows,
Patient ID | Year | Admission Number | Problem | Keep |
1 | 2010 | 1 | 1 | 1 |
1 | 2010 | 2 | 1 | 0 |
1 | 2010 | 3 | 1 | 0 |
1 | 2011 | 1 | 1 | 1 |
2 | 2010 | 1 | 1 | 0 |
2 | 2010 | 2 | 2 | 1 |
2 | 2010 | 3 | 2 | 0 |
2 | 2011 | 1 | 2 | 1 |
Any help would be much appreciated.
Thank you.
Yes, with a minor addition,but not if they're interspersed (e.g., 2 2 1 2 2). So, yes, the following would come out as desired:
data have; input Patient_ID Year Admission_Number Problem; cards; 1 2010 1 1 1 2010 2 1 1 2010 3 1 1 2011 1 1 2 2010 1 1 2 2010 2 2 2 2010 3 2 2 2011 1 2 3 2010 1 2 3 2010 2 1 3 2010 3 1 3 2011 1 2 ; data need; set have; recnum=_n_; run; proc sql noprint; create table want as select *, min(problem) as min, max(problem) as max from need group by Patient_ID,Year order by recnum ; quit; data want (drop=min max recnum); set want; by patient_ID year problem notsorted; if min eq max and first.problem then keep=1; else if min ne max and problem eq 2 and first.problem then keep=1; else keep=0; run;
Art, CEO, AnalystFinder.com
If a patient's problem 2 always comes by itself, or after a 1, and never reverts back and forth between 1 and 2 within a given year, you could use something like:
data need; set have; recnum=_n_; run; proc sql noprint; create table want as select *, min(problem) as min, max(problem) as max from need group by Patient_ID,Year order by recnum ; quit; data want (drop=min max recnum); set want; by patient_ID year problem; if min eq max and first.problem then keep=1; else if min ne max and problem eq 2 and first.problem then keep=1; else keep=0; run;
Art, CEO, AnalystFinder.com
Hi ART297,
Thank you for your response.
There is no set order for the patient's problems, meaning 2 could come before 1 in a given year. Would the code you have provided not work in that situation?
Yes, with a minor addition,but not if they're interspersed (e.g., 2 2 1 2 2). So, yes, the following would come out as desired:
data have; input Patient_ID Year Admission_Number Problem; cards; 1 2010 1 1 1 2010 2 1 1 2010 3 1 1 2011 1 1 2 2010 1 1 2 2010 2 2 2 2010 3 2 2 2011 1 2 3 2010 1 2 3 2010 2 1 3 2010 3 1 3 2011 1 2 ; data need; set have; recnum=_n_; run; proc sql noprint; create table want as select *, min(problem) as min, max(problem) as max from need group by Patient_ID,Year order by recnum ; quit; data want (drop=min max recnum); set want; by patient_ID year problem notsorted; if min eq max and first.problem then keep=1; else if min ne max and problem eq 2 and first.problem then keep=1; else keep=0; run;
Art, CEO, AnalystFinder.com
I think this is what you are asking for:
data want;
do until (last.year);
set have;
by patient year;
max_problem = max(max_problem, problem);
end;
do until (last.year);
set have;
by patient year;
if max_problem = problem and found=. then do;
keep=1;
found=1;
end;
else keep=0;
output;
end;
drop max_problem;
run;
Thank you art297 and Astounding for helping me find a solution to this problem!
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: