## How do I extract a row from several observations under multiple conditions?

Solved
Occasional Contributor
Posts: 6

# How do I extract a row from several observations under multiple conditions?

[ Edited ]

Hello all,

I am managing health insurance data so I got many records by individuals.

I'd like only one observation to remain per person with some conditions.

Here are the conditions below:

1) has a (two kinds of) specific classification number for the first time,

2) otherwise, has the latest record

For example, here's the sample of the data:

Data play;

input id \$class1 \$class2 yyyymm

datalines;

1 A12 C33 200105

1 A12 C32 200109

1 A12 C33 200203

1 A13 C34 200302

2 F12 D33 200105

2 F12 D32 200109

2 F12 D33 200203

3 S23 F32 200103

3 S23 F33 200105

3 A12 F32 200204

3 A12 C33 200205

3 A13 C34 200208

4 Q12 C33 200102

4 Q12 A12 200112

4 G12 A12 200207

4 G13 C32 200309

5 D12 G33 200105

5 D12 A32 200108

5 D12 A33 200111

;

run;

With this data, I want a row remained by an individual in which \$class1 or \$class2 has either "A12" or "A" for the first time. It otherwise has the latest record.

And here's the expected result below:

1 A12 C33 200105 (<- "A12" contained in \$class1)

2 F12 D33 200203 (<- nothing contained, so "the latest record")

3 A12 F32 200204 (<- "A12" contained in \$class1)

4 Q12 A12 200112 (<- "A12" contained in \$class2)

5 D12 A32 200108 (<- "A" contained in \$class2)

How can I make the data like this?

Accepted Solutions
Solution
‎03-06-2018 03:19 AM
Posts: 5,475

## Re: How do I extract a row from several observations under multiple conditions?

[ Edited ]

Maybe this will do:

``````proc sql;
create view temp as
select
*,
case
when class1 = "A12" or class2 = "A12" then 1
when first(class1) = "A" or first(class2) = "A" then 2
else 3
end as order
from play
order by id, order, yyyymm;
quit;

data want;
set temp; by id;
if first.id;
drop order;
run;``````

Edit -- Sorry, this code doesn't always give the correct answer. See below.

PG

All Replies
Posts: 5,475

## Re: How do I extract a row from several observations under multiple conditions?

Would the result be the same if the last obs was

5 D12 A12 200111

And either way, why?

PG
Occasional Contributor
Posts: 6

## Re: How do I extract a row from several observations under multiple conditions?

Oh I just wanted to differentiate A12 from all values which contain "A" because I set "A12" as a case and other "A"s as censored.

PROC Star
Posts: 503

## Re: How do I extract a row from several observations under multiple conditions?

this may work

``````proc sql;
/* geting latest records other than A12 or A*/
select id,  class1,  class2,  yyyymm from play
where id not in
(select id from play
where class1 = "A12"
or   class2  = "A"
group by id
having yyyymm = min(yyyymm))
group by id
having yyyymm = min(yyyymm)
union
/* geting latest records for A12 or A*/
select id,  class1,  class2,  yyyymm from play
where class1 = "A12"
or   class2  = "A"
group by id
having yyyymm = min(yyyymm);``````
Occasional Contributor
Posts: 6

## Re: How do I extract a row from several observations under multiple conditions?

Thank your for your reply! I tried it as you wrote and it looks like doing well but I cannot confirm that because of a resource problem. Besides, how can I save the new data set which was made throughout the process?

Occasional Contributor
Posts: 9

## Re: How do I extract a row from several observations under multiple conditions?

Hi

Here are the steps:

1. Sort the data by id and yyyymm

proc sort data = play;

by id yyyymm;

run;

2. Extract the rows containing 'A' in either class1 or class2

data playClassA;

set play;

if substr(class1,1,1)='A' | substr(class2,1,1)='A';

run;

3. Get the first record

data dsOut1;

set playClassA;

by id;

if first.id then output;

run;

4. Extract the ids containing no 'A' in either class1 or class2

data playClassABar;

merge play

dsOut1 (in=a keep=id);

by id;

run;

5. Get the last record

data dsOut2;

set playClassA;

by id;

if last.id then output;

run;

6. Combine the results

data dsOut;

set dsOut1 dsOut2;

run;

Hope this will help.

Occasional Contributor
Posts: 6

## Re: How do I extract a row from several observations under multiple conditions?

This solution is almost similar to what I was trying to. It's kind of a classical approach so I did it, but it was complicated to use. I will also try your solution. Thank you!

Posts: 4,668

## Re: How do I extract a row from several observations under multiple conditions?

[ Edited ]

Something along the line of below should work:

``````
data want(drop=_:);
set have;
by id;
retain _done;
if first.id then _done=0;
if _done=0 then
do;
if findc(class1,'A','i') or findc(class2,'A','i') then
do;
output;
_done=1;
end;
else if last.id then output;
end;

run;``````
Solution
‎03-06-2018 03:19 AM
Posts: 5,475

## Re: How do I extract a row from several observations under multiple conditions?

[ Edited ]

Maybe this will do:

``````proc sql;
create view temp as
select
*,
case
when class1 = "A12" or class2 = "A12" then 1
when first(class1) = "A" or first(class2) = "A" then 2
else 3
end as order
from play
order by id, order, yyyymm;
quit;

data want;
set temp; by id;
if first.id;
drop order;
run;``````

Edit -- Sorry, this code doesn't always give the correct answer. See below.

PG
Occasional Contributor
Posts: 6

## Re: How do I extract a row from several observations under multiple conditions?

I perfectly resolved the problem thanks to your solution! Thank you very much! I really love such a laconic style of your coding.

Posts: 5,475

## Re: How do I extract a row from several observations under multiple conditions?

[ Edited ]

Sorry. My suggestion above does NOT always give the expected answer in the case where A12 or A is not found..This, however will:

``````data want;
lookfor = "---";
do until(last.id);
set play; by id;
if class1 = "A12" or class2 = "A12" then lookfor = "A12";
if lookfor ne "A12" and (first(class1) = "A" or first(class2) = "A") then lookfor = "A";
end;
do until(last.id);
set play; by id;
if class1 =: trim(lookfor) or class2 =: trim(lookfor) then do;
output;
lookfor = "+++";
end;
end;
if lookfor = "---" then output;
drop lookfor;
run;``````
PG
☑ This topic is solved.