BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ImSame
Calcite | Level 5

I have the following dataset,

 

Patient ID

Year

Admission  Number

Problem

1

2010

1

1

1

2010

2

1

1

2010

3

1

1

2011

1

1

2

2010

1

1

2

2010

2

2

2

2010

3

2

2

2011

1

2

 

 

I want to create a variable called “Keep” based on these data. Based on groupings by Patient ID and Year, this variable should do as follows,

 

  • Assign the first admission in a year as Keep = 1, if all admissions in that year for a Patient ID have a problem value of 1 (see Patient ID 1 in the table below as an example).
  • Assign the first instance of an admission in a year with Problem value of 2 as Keep = 1, if the admissions in a year for a Patient ID contain both Problem values of 1 and 2 (see Patient ID 2 in the table below as an example).

 

The resulting database should look as follows,

 

Patient ID

Year

Admission  Number

Problem

Keep

1

2010

1

1

1

1

2010

2

1

0

1

2010

3

1

0

1

2011

1

1

1

2

2010

1

1

0

2

2010

2

2

1

2

2010

3

2

0

2

2011

1

2

1

 

Any help would be much appreciated.

 

Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Yes, with a minor addition,but not if they're interspersed (e.g., 2 2 1 2 2). So, yes, the following would come out as desired:

data have;
  input Patient_ID Year Admission_Number Problem;
  cards;
1 2010 1 1
1 2010 2 1
1 2010 3 1
1 2011 1 1
2 2010 1 1
2 2010 2 2
2 2010 3 2
2 2011 1 2
3 2010 1 2
3 2010 2 1
3 2010 3 1
3 2011 1 2
;

data need;
  set have;
  recnum=_n_;
run;

proc sql noprint;
  create table want as
    select *, min(problem) as min, max(problem) as max
      from need
        group by Patient_ID,Year
           order by recnum
  ;
quit;

data want (drop=min max recnum);
  set want;
  by patient_ID year problem notsorted;
  if min eq max and first.problem then keep=1;
  else if min ne max and problem eq 2 and first.problem then keep=1;
  else keep=0;
run;

Art, CEO, AnalystFinder.com

 

 

View solution in original post

5 REPLIES 5
art297
Opal | Level 21

If a patient's problem 2 always comes by itself, or after a 1, and never reverts back and forth between 1 and 2 within a given year, you could use something like:

data need;
  set have;
  recnum=_n_;
run;

proc sql noprint;
  create table want as
    select *, min(problem) as min, max(problem) as max
      from need
        group by Patient_ID,Year
           order by recnum
  ;
quit;

data want (drop=min max recnum);
  set want;
  by patient_ID year problem;
  if min eq max and first.problem then keep=1;
  else if min ne max and problem eq 2 and first.problem then keep=1;
  else keep=0;
run;

Art, CEO, AnalystFinder.com

 

ImSame
Calcite | Level 5

Hi ART297,

 

Thank you for your response.

 

There is no set order for the patient's problems, meaning 2 could come before 1 in a given year. Would the code you have provided not work in that situation?

 

 

art297
Opal | Level 21

Yes, with a minor addition,but not if they're interspersed (e.g., 2 2 1 2 2). So, yes, the following would come out as desired:

data have;
  input Patient_ID Year Admission_Number Problem;
  cards;
1 2010 1 1
1 2010 2 1
1 2010 3 1
1 2011 1 1
2 2010 1 1
2 2010 2 2
2 2010 3 2
2 2011 1 2
3 2010 1 2
3 2010 2 1
3 2010 3 1
3 2011 1 2
;

data need;
  set have;
  recnum=_n_;
run;

proc sql noprint;
  create table want as
    select *, min(problem) as min, max(problem) as max
      from need
        group by Patient_ID,Year
           order by recnum
  ;
quit;

data want (drop=min max recnum);
  set want;
  by patient_ID year problem notsorted;
  if min eq max and first.problem then keep=1;
  else if min ne max and problem eq 2 and first.problem then keep=1;
  else keep=0;
run;

Art, CEO, AnalystFinder.com

 

 

Astounding
PROC Star

I think this is what you are asking for:

 

data want;

do until (last.year);

   set have;

   by patient year;

   max_problem = max(max_problem, problem);

end;

do until (last.year);

   set have;

   by patient year;

   if max_problem = problem and found=. then do;

      keep=1;

      found=1;

   end;

   else keep=0;

   output;

end;

drop max_problem;

run;

ImSame
Calcite | Level 5

Thank you art297 and Astounding for helping me find a solution to this problem!

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1513 views
  • 0 likes
  • 3 in conversation