SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

How do I extract a row from several observations under multiple conditions?

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 6
Accepted Solution

How do I extract a row from several observations under multiple conditions?

[ Edited ]

Hello all,

 

I am managing health insurance data so I got many records by individuals.

 

I'd like only one observation to remain per person with some conditions.

 

Here are the conditions below:

1) has a (two kinds of) specific classification number for the first time,

2) otherwise, has the latest record

 

For example, here's the sample of the data:

 

Data play;

input id $class1 $class2 yyyymm

datalines;

1 A12 C33 200105

1 A12 C32 200109

1 A12 C33 200203

1 A13 C34 200302

2 F12 D33 200105

2 F12 D32 200109

2 F12 D33 200203

3 S23 F32 200103

3 S23 F33 200105

3 A12 F32 200204

3 A12 C33 200205

3 A13 C34 200208

4 Q12 C33 200102

4 Q12 A12 200112

4 G12 A12 200207

4 G13 C32 200309

5 D12 G33 200105

5 D12 A32 200108

5 D12 A33 200111

;

run;

 

With this data, I want a row remained by an individual in which $class1 or $class2 has either "A12" or "A" for the first time. It otherwise has the latest record.

And here's the expected result below:

 

1 A12 C33 200105 (<- "A12" contained in $class1)

2 F12 D33 200203 (<- nothing contained, so "the latest record")

3 A12 F32 200204 (<- "A12" contained in $class1)

4 Q12 A12 200112 (<- "A12" contained in $class2)

5 D12 A32 200108 (<- "A" contained in $class2)

 

How can I make the data like this?

 

I would much appreciate all your possible answers. Thank you!

 

 

 


Accepted Solutions
Solution
‎03-06-2018 03:19 AM
Esteemed Advisor
Posts: 5,523

Re: How do I extract a row from several observations under multiple conditions?

[ Edited ]

Maybe this will do:

 

proc sql;
create view temp as
select 
    *,
    case 
        when class1 = "A12" or class2 = "A12" then 1
        when first(class1) = "A" or first(class2) = "A" then 2
        else 3
        end as order
from play
order by id, order, yyyymm;
quit;

data want;
set temp; by id;
if first.id;
drop order;
run;

 

Edit -- Sorry, this code doesn't always give the correct answer. See below. 

PG

View solution in original post


All Replies
Esteemed Advisor
Posts: 5,523

Re: How do I extract a row from several observations under multiple conditions?

Would the result be the same if the last obs was

 

5 D12 A12 200111

 

And either way, why?

PG
Occasional Contributor
Posts: 6

Re: How do I extract a row from several observations under multiple conditions?

Oh I just wanted to differentiate A12 from all values which contain "A" because I set "A12" as a case and other "A"s as censored.

PROC Star
Posts: 509

Re: How do I extract a row from several observations under multiple conditions?

this may work

proc sql;
/* geting latest records other than A12 or A*/
select id,  class1,  class2,  yyyymm from play
where id not in 
(select id from play
 where class1 = "A12" 
 or   class2  = "A" 
 group by id 
having yyyymm = min(yyyymm))
 group by id 
having yyyymm = min(yyyymm)
union
/* geting latest records for A12 or A*/
select id,  class1,  class2,  yyyymm from play
 where class1 = "A12" 
 or   class2  = "A" 
 group by id 
having yyyymm = min(yyyymm);
Occasional Contributor
Posts: 6

Re: How do I extract a row from several observations under multiple conditions?

Thank your for your reply! I tried it as you wrote and it looks like doing well but I cannot confirm that because of a resource problem. Besides, how can I save the new data set which was made throughout the process?

Occasional Contributor
Posts: 9

Re: How do I extract a row from several observations under multiple conditions?

Hi

 

Here are the steps:

 

1. Sort the data by id and yyyymm

proc sort data = play;

    by id yyyymm;

run;

 

2. Extract the rows containing 'A' in either class1 or class2

data playClassA;

    set play;

    if substr(class1,1,1)='A' | substr(class2,1,1)='A';

run;

 

3. Get the first record

data dsOut1;

    set playClassA;

    by id;

    if first.id then output;

run;

 

4. Extract the ids containing no 'A' in either class1 or class2

data playClassABar;

    merge play

        dsOut1 (in=a keep=id);

    by id;

run;

 

5. Get the last record

data dsOut2;

    set playClassA;

    by id;

    if last.id then output;

run;

 

6. Combine the results

data dsOut;

    set dsOut1 dsOut2;

run;

 

Hope this will help.

Occasional Contributor
Posts: 6

Re: How do I extract a row from several observations under multiple conditions?

Posted in reply to liangchh0

This solution is almost similar to what I was trying to. It's kind of a classical approach so I did it, but it was complicated to use. I will also try your solution. Thank you! 

Respected Advisor
Posts: 4,736

Re: How do I extract a row from several observations under multiple conditions?

[ Edited ]

Something along the line of below should work:


data want(drop=_:);
  set have;
  by id;
  retain _done;
  if first.id then _done=0;
  if _done=0 then 
    do;
      if findc(class1,'A','i') or findc(class2,'A','i') then
        do;
          output;
          _done=1;
        end;
      else if last.id then output;
    end;

run;
Solution
‎03-06-2018 03:19 AM
Esteemed Advisor
Posts: 5,523

Re: How do I extract a row from several observations under multiple conditions?

[ Edited ]

Maybe this will do:

 

proc sql;
create view temp as
select 
    *,
    case 
        when class1 = "A12" or class2 = "A12" then 1
        when first(class1) = "A" or first(class2) = "A" then 2
        else 3
        end as order
from play
order by id, order, yyyymm;
quit;

data want;
set temp; by id;
if first.id;
drop order;
run;

 

Edit -- Sorry, this code doesn't always give the correct answer. See below. 

PG
Occasional Contributor
Posts: 6

Re: How do I extract a row from several observations under multiple conditions?

I perfectly resolved the problem thanks to your solution! Thank you very much! I really love such a laconic style of your coding.

Highlighted
Esteemed Advisor
Posts: 5,523

Re: How do I extract a row from several observations under multiple conditions?

[ Edited ]

Sorry. My suggestion above does NOT always give the expected answer in the case where A12 or A is not found..This, however will:

 

data want;
lookfor = "---";
do until(last.id);
    set play; by id;
    if class1 = "A12" or class2 = "A12" then lookfor = "A12";
    if lookfor ne "A12" and (first(class1) = "A" or first(class2) = "A") then lookfor = "A";
    end;
do until(last.id);
    set play; by id;
    if class1 =: trim(lookfor) or class2 =: trim(lookfor) then do;
        output;
        lookfor = "+++";
        end;
    end;
if lookfor = "---" then output;
drop lookfor;
run;
PG
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 313 views
  • 3 likes
  • 5 in conversation