BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
nexterd
Fluorite | Level 6

Hello all,

 

I am managing health insurance data so I got many records by individuals.

 

I'd like only one observation to remain per person with some conditions.

 

Here are the conditions below:

1) has a (two kinds of) specific classification number for the first time,

2) otherwise, has the latest record

 

For example, here's the sample of the data:

 

Data play;

input id $class1 $class2 yyyymm

datalines;

1 A12 C33 200105

1 A12 C32 200109

1 A12 C33 200203

1 A13 C34 200302

2 F12 D33 200105

2 F12 D32 200109

2 F12 D33 200203

3 S23 F32 200103

3 S23 F33 200105

3 A12 F32 200204

3 A12 C33 200205

3 A13 C34 200208

4 Q12 C33 200102

4 Q12 A12 200112

4 G12 A12 200207

4 G13 C32 200309

5 D12 G33 200105

5 D12 A32 200108

5 D12 A33 200111

;

run;

 

With this data, I want a row remained by an individual in which $class1 or $class2 has either "A12" or "A" for the first time. It otherwise has the latest record.

And here's the expected result below:

 

1 A12 C33 200105 (<- "A12" contained in $class1)

2 F12 D33 200203 (<- nothing contained, so "the latest record")

3 A12 F32 200204 (<- "A12" contained in $class1)

4 Q12 A12 200112 (<- "A12" contained in $class2)

5 D12 A32 200108 (<- "A" contained in $class2)

 

How can I make the data like this?

 

I would much appreciate all your possible answers. Thank you!

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Maybe this will do:

 

proc sql;
create view temp as
select 
    *,
    case 
        when class1 = "A12" or class2 = "A12" then 1
        when first(class1) = "A" or first(class2) = "A" then 2
        else 3
        end as order
from play
order by id, order, yyyymm;
quit;

data want;
set temp; by id;
if first.id;
drop order;
run;

 

Edit -- Sorry, this code doesn't always give the correct answer. See below. 

PG

View solution in original post

10 REPLIES 10
PGStats
Opal | Level 21

Would the result be the same if the last obs was

 

5 D12 A12 200111

 

And either way, why?

PG
nexterd
Fluorite | Level 6

Oh I just wanted to differentiate A12 from all values which contain "A" because I set "A12" as a case and other "A"s as censored.

kiranv_
Rhodochrosite | Level 12

this may work

proc sql;
/* geting latest records other than A12 or A*/
select id,  class1,  class2,  yyyymm from play
where id not in 
(select id from play
 where class1 = "A12" 
 or   class2  = "A" 
 group by id 
having yyyymm = min(yyyymm))
 group by id 
having yyyymm = min(yyyymm)
union
/* geting latest records for A12 or A*/
select id,  class1,  class2,  yyyymm from play
 where class1 = "A12" 
 or   class2  = "A" 
 group by id 
having yyyymm = min(yyyymm);
nexterd
Fluorite | Level 6

Thank your for your reply! I tried it as you wrote and it looks like doing well but I cannot confirm that because of a resource problem. Besides, how can I save the new data set which was made throughout the process?

liangchh0
Fluorite | Level 6

Hi

 

Here are the steps:

 

1. Sort the data by id and yyyymm

proc sort data = play;

    by id yyyymm;

run;

 

2. Extract the rows containing 'A' in either class1 or class2

data playClassA;

    set play;

    if substr(class1,1,1)='A' | substr(class2,1,1)='A';

run;

 

3. Get the first record

data dsOut1;

    set playClassA;

    by id;

    if first.id then output;

run;

 

4. Extract the ids containing no 'A' in either class1 or class2

data playClassABar;

    merge play

        dsOut1 (in=a keep=id);

    by id;

run;

 

5. Get the last record

data dsOut2;

    set playClassA;

    by id;

    if last.id then output;

run;

 

6. Combine the results

data dsOut;

    set dsOut1 dsOut2;

run;

 

Hope this will help.

nexterd
Fluorite | Level 6

This solution is almost similar to what I was trying to. It's kind of a classical approach so I did it, but it was complicated to use. I will also try your solution. Thank you! 

Patrick
Opal | Level 21

Something along the line of below should work:


data want(drop=_:);
  set have;
  by id;
  retain _done;
  if first.id then _done=0;
  if _done=0 then 
    do;
      if findc(class1,'A','i') or findc(class2,'A','i') then
        do;
          output;
          _done=1;
        end;
      else if last.id then output;
    end;

run;
PGStats
Opal | Level 21

Maybe this will do:

 

proc sql;
create view temp as
select 
    *,
    case 
        when class1 = "A12" or class2 = "A12" then 1
        when first(class1) = "A" or first(class2) = "A" then 2
        else 3
        end as order
from play
order by id, order, yyyymm;
quit;

data want;
set temp; by id;
if first.id;
drop order;
run;

 

Edit -- Sorry, this code doesn't always give the correct answer. See below. 

PG
nexterd
Fluorite | Level 6

I perfectly resolved the problem thanks to your solution! Thank you very much! I really love such a laconic style of your coding.

PGStats
Opal | Level 21

Sorry. My suggestion above does NOT always give the expected answer in the case where A12 or A is not found..This, however will:

 

data want;
lookfor = "---";
do until(last.id);
    set play; by id;
    if class1 = "A12" or class2 = "A12" then lookfor = "A12";
    if lookfor ne "A12" and (first(class1) = "A" or first(class2) = "A") then lookfor = "A";
    end;
do until(last.id);
    set play; by id;
    if class1 =: trim(lookfor) or class2 =: trim(lookfor) then do;
        output;
        lookfor = "+++";
        end;
    end;
if lookfor = "---" then output;
drop lookfor;
run;
PG

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 1483 views
  • 3 likes
  • 5 in conversation