BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
nexterd
Fluorite | Level 6

Hello all,

 

I am managing health insurance data so I got many records by individuals.

 

I'd like only one observation to remain per person with some conditions.

 

Here are the conditions below:

1) has a (two kinds of) specific classification number for the first time,

2) otherwise, has the latest record

 

For example, here's the sample of the data:

 

Data play;

input id $class1 $class2 yyyymm

datalines;

1 A12 C33 200105

1 A12 C32 200109

1 A12 C33 200203

1 A13 C34 200302

2 F12 D33 200105

2 F12 D32 200109

2 F12 D33 200203

3 S23 F32 200103

3 S23 F33 200105

3 A12 F32 200204

3 A12 C33 200205

3 A13 C34 200208

4 Q12 C33 200102

4 Q12 A12 200112

4 G12 A12 200207

4 G13 C32 200309

5 D12 G33 200105

5 D12 A32 200108

5 D12 A33 200111

;

run;

 

With this data, I want a row remained by an individual in which $class1 or $class2 has either "A12" or "A" for the first time. It otherwise has the latest record.

And here's the expected result below:

 

1 A12 C33 200105 (<- "A12" contained in $class1)

2 F12 D33 200203 (<- nothing contained, so "the latest record")

3 A12 F32 200204 (<- "A12" contained in $class1)

4 Q12 A12 200112 (<- "A12" contained in $class2)

5 D12 A32 200108 (<- "A" contained in $class2)

 

How can I make the data like this?

 

I would much appreciate all your possible answers. Thank you!

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Maybe this will do:

 

proc sql;
create view temp as
select 
    *,
    case 
        when class1 = "A12" or class2 = "A12" then 1
        when first(class1) = "A" or first(class2) = "A" then 2
        else 3
        end as order
from play
order by id, order, yyyymm;
quit;

data want;
set temp; by id;
if first.id;
drop order;
run;

 

Edit -- Sorry, this code doesn't always give the correct answer. See below. 

PG

View solution in original post

10 REPLIES 10
PGStats
Opal | Level 21

Would the result be the same if the last obs was

 

5 D12 A12 200111

 

And either way, why?

PG
nexterd
Fluorite | Level 6

Oh I just wanted to differentiate A12 from all values which contain "A" because I set "A12" as a case and other "A"s as censored.

kiranv_
Rhodochrosite | Level 12

this may work

proc sql;
/* geting latest records other than A12 or A*/
select id,  class1,  class2,  yyyymm from play
where id not in 
(select id from play
 where class1 = "A12" 
 or   class2  = "A" 
 group by id 
having yyyymm = min(yyyymm))
 group by id 
having yyyymm = min(yyyymm)
union
/* geting latest records for A12 or A*/
select id,  class1,  class2,  yyyymm from play
 where class1 = "A12" 
 or   class2  = "A" 
 group by id 
having yyyymm = min(yyyymm);
nexterd
Fluorite | Level 6

Thank your for your reply! I tried it as you wrote and it looks like doing well but I cannot confirm that because of a resource problem. Besides, how can I save the new data set which was made throughout the process?

liangchh0
Fluorite | Level 6

Hi

 

Here are the steps:

 

1. Sort the data by id and yyyymm

proc sort data = play;

    by id yyyymm;

run;

 

2. Extract the rows containing 'A' in either class1 or class2

data playClassA;

    set play;

    if substr(class1,1,1)='A' | substr(class2,1,1)='A';

run;

 

3. Get the first record

data dsOut1;

    set playClassA;

    by id;

    if first.id then output;

run;

 

4. Extract the ids containing no 'A' in either class1 or class2

data playClassABar;

    merge play

        dsOut1 (in=a keep=id);

    by id;

run;

 

5. Get the last record

data dsOut2;

    set playClassA;

    by id;

    if last.id then output;

run;

 

6. Combine the results

data dsOut;

    set dsOut1 dsOut2;

run;

 

Hope this will help.

nexterd
Fluorite | Level 6

This solution is almost similar to what I was trying to. It's kind of a classical approach so I did it, but it was complicated to use. I will also try your solution. Thank you! 

Patrick
Opal | Level 21

Something along the line of below should work:


data want(drop=_:);
  set have;
  by id;
  retain _done;
  if first.id then _done=0;
  if _done=0 then 
    do;
      if findc(class1,'A','i') or findc(class2,'A','i') then
        do;
          output;
          _done=1;
        end;
      else if last.id then output;
    end;

run;
PGStats
Opal | Level 21

Maybe this will do:

 

proc sql;
create view temp as
select 
    *,
    case 
        when class1 = "A12" or class2 = "A12" then 1
        when first(class1) = "A" or first(class2) = "A" then 2
        else 3
        end as order
from play
order by id, order, yyyymm;
quit;

data want;
set temp; by id;
if first.id;
drop order;
run;

 

Edit -- Sorry, this code doesn't always give the correct answer. See below. 

PG
nexterd
Fluorite | Level 6

I perfectly resolved the problem thanks to your solution! Thank you very much! I really love such a laconic style of your coding.

PGStats
Opal | Level 21

Sorry. My suggestion above does NOT always give the expected answer in the case where A12 or A is not found..This, however will:

 

data want;
lookfor = "---";
do until(last.id);
    set play; by id;
    if class1 = "A12" or class2 = "A12" then lookfor = "A12";
    if lookfor ne "A12" and (first(class1) = "A" or first(class2) = "A") then lookfor = "A";
    end;
do until(last.id);
    set play; by id;
    if class1 =: trim(lookfor) or class2 =: trim(lookfor) then do;
        output;
        lookfor = "+++";
        end;
    end;
if lookfor = "---" then output;
drop lookfor;
run;
PG

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 2454 views
  • 3 likes
  • 5 in conversation