## Variable derivation by looking between observations


I have the following dataset,

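| Patient ID | Year | Admission Number | Problem |
|---|---|---|---|
| 1 | 2010 | 1 | 1 |
| 1 | 2010 | 2 | 1 |
| 1 | 2010 | 3 | 1 |
| 1 | 2011 | 1 | 1 |
| 2 | 2010 | 1 | 1 |
| 2 | 2010 | 2 | 2 |
| 2 | 2010 | 3 | 2 |
| 2 | 2011 | 1 | 2 |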

I want to create a variable called “Keep” from these data. Within each group defined by Patient ID and Year, it should be assigned as follows (every other row gets Keep = 0):

• Assign the first admission in a year as Keep = 1, if all admissions in that year for a Patient ID have a problem value of 1 (see Patient ID 1 in the table below as an example).
• Assign the first instance of an admission in a year with Problem value of 2 as Keep = 1, if the admissions in a year for a Patient ID contain both Problem values of 1 and 2 (see Patient ID 2 in the table below as an example).

The resulting dataset should look as follows:

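| Patient ID | Year | Admission Number | Problem | Keep |
|---|---|---|---|---|
| 1 | 2010 | 1 | 1 | 1 |
| 1 | 2010 | 2 | 1 | 0 |
| 1 | 2010 | 3 | 1 | 0 |
| 1 | 2011 | 1 | 1 | 1 |
| 2 | 2010 | 1 | 1 | 0 |
| 2 | 2010 | 2 | 2 | 1 |
| 2 | 2010 | 3 | 2 | 0 |
| 2 | 2011 | 1 | 2 | 1 |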

Any help would be much appreciated.

Thank you.


## Re: Variable derivation by looking between observations

If a patient's problem 2 always comes by itself, or after a 1, and never reverts back and forth between 1 and 2 within a given year, you could use something like:

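```
/* Record the original row order so it can be restored after PROC SQL */
data need;
set have;
recnum=_n_;
run;

/* Attach each patient-year's smallest and largest problem value to every row */
proc sql noprint;
create table want as
select *, min(problem) as min, max(problem) as max
from need
group by Patient_ID,Year
order by recnum
;
quit;

data want (drop=min max recnum);
set want;
by patient_ID year problem;
/* all problem values equal: keep the first admission of the year */
if min eq max and first.problem then keep=1;
/* both 1 and 2 present: keep the first admission with problem 2 */
else if min ne max and problem eq 2 and first.problem then keep=1;
else keep=0;
run;
```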

Art, CEO, AnalystFinder.com


## Re: Variable derivation by looking between observations

Hi ART297,

There is no set order for the patient's problems, meaning 2 could come before 1 in a given year. Would the code you have provided not work in that situation?

Solution
07-17-2017 12:33 PM

## Re: Variable derivation by looking between observations

Yes, it would work with one minor addition (the NOTSORTED option on the BY statement), but not if the values are interspersed within a year (e.g., 2 2 1 2 2). With that change, the following comes out as desired:

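```
data have;
/* Admission_Number is an assumed name; the other names match the code below */
input Patient_ID Year Admission_Number Problem;
cards;
1 2010 1 1
1 2010 2 1
1 2010 3 1
1 2011 1 1
2 2010 1 1
2 2010 2 2
2 2010 3 2
2 2011 1 2
3 2010 1 2
3 2010 2 1
3 2010 3 1
3 2011 1 2
;

/* Record the original row order so it can be restored after PROC SQL */
data need;
set have;
recnum=_n_;
run;

/* Attach each patient-year's smallest and largest problem value to every row */
proc sql noprint;
create table want as
select *, min(problem) as min, max(problem) as max
from need
group by Patient_ID,Year
order by recnum
;
quit;

data want (drop=min max recnum);
set want;
/* NOTSORTED lets problem 2 precede problem 1 within a year */
by patient_ID year problem notsorted;
if min eq max and first.problem then keep=1;
else if min ne max and problem eq 2 and first.problem then keep=1;
else keep=0;
run;
```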
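
To see why the interspersed case mentioned above breaks, here is a minimal sketch. Patient 4, the dataset name `have_mixed`, and the variables `Admission_Number` and `run_start` are made up for illustration and are not part of the original data. With NOTSORTED, `first.problem` is set at the start of every run of equal values, so a year with 2 2 1 2 2 has two rows where problem is 2 and `first.problem` is 1.

```sas
/* Hypothetical patient 4: problem values 2 2 1 2 2 within one year */
data have_mixed;
input Patient_ID Year Admission_Number Problem;
cards;
4 2010 1 2
4 2010 2 2
4 2010 3 1
4 2010 4 2
4 2010 5 2
;

data check;
set have_mixed;
by Patient_ID Year Problem notsorted;
/* copy the automatic first.problem flag into a variable that is kept */
run_start = first.problem;
run;

proc print data=check noobs;
run;
```

Here `run_start` is 1 on admissions 1, 3 and 4, so the rule above would set keep=1 on both admission 1 and admission 4.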

Art, CEO, AnalystFinder.com


## Re: Variable derivation by looking between observations

I think this is what you are asking for:

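```
data want;
/* Pass 1: find the largest problem value in this patient-year */
do until (last.year);
  set have;
  by patient_id year;
  max_problem = max(max_problem, problem);
end;
/* Pass 2: re-read the same rows and flag the first one that matches it */
do until (last.year);
  set have;
  by patient_id year;
  if max_problem = problem and found=. then do;
    keep=1;
    found=1;
  end;
  else keep=0;
  output;
end;
drop max_problem found;
run;
```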
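
As a rough check, the same two-pass idea can be run on a year where the problem values are interspersed. This is only a sketch: patient 4 and the names `have_mixed`, `want_mixed` and `Admission_Number` are hypothetical. Because the second loop flags only the first row that matches the group maximum, the order of the 1s and 2s within the year should not matter, as long as the data are sorted by patient and year.

```sas
/* Hypothetical patient 4: problem values 2 2 1 2 2 within one year */
data have_mixed;
input Patient_ID Year Admission_Number Problem;
cards;
4 2010 1 2
4 2010 2 2
4 2010 3 1
4 2010 4 2
4 2010 5 2
;

data want_mixed;
/* Pass 1: largest problem value in the patient-year */
do until (last.year);
  set have_mixed;
  by patient_id year;
  max_problem = max(max_problem, problem);
end;
/* Pass 2: flag only the first row that matches that maximum */
do until (last.year);
  set have_mixed;
  by patient_id year;
  if problem = max_problem and found = . then do;
    keep = 1;
    found = 1;
  end;
  else keep = 0;
  output;
end;
drop max_problem found;
run;

proc print data=want_mixed noobs;
run;
```

Only admission 1 ends up with keep=1; the later run of 2s is left at 0.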
