DATA Step, Macro, Functions and more

Keep only most recent records

Accepted Solution Solved
Reply
Contributor
Posts: 46
Accepted Solution

Keep only most recent records

Hi everyone,

 

I have a data with Diagnoses at each year of Medical exam. I am trying to select ALL Diagnosis data (all lines) from the MOST RECENT EXAM ONLY, for each ID. 

 

Please, can you help me with the coding, see below the database structure.

 

Thanks in advance,

 

 IDCalendar YearDiagnosis
12016 TB
12016No health condition
12016No health condition
22014TB
22016Inactive/Latent Tuberculosis
32014TB
32016Inactive/Latent Tuberculosis
32016HIV
32016Hypertension
42015Cancer
42014No health condition
42016TB

 

 


Accepted Solutions
Solution
‎03-02-2018 09:25 AM
Respected Advisor
Posts: 4,779

Re: Keep only most recent records

[ Edited ]

What most forum members would appreciate is if you could post sample data in the form of a working SAS data step so we don't have to do the work for you.

It's not clear to me what constitutes the most recent exam. Below code sample assumes you're just after all the rows per ID with the latest year.

data have;
  infile datalines dlm=' ' truncover;
  input ID $ Calendar_Year Diagnosis $60.;
  datalines;
1 2016  TB 
1 2016 No health condition 
1 2016 No health condition 
2 2014 TB 
2 2016 Inactive/Latent Tuberculosis 
3 2014 TB 
3 2016 Inactive/Latent Tuberculosis 
3 2016 HIV 
3 2016 Hypertension 
4 2015 Cancer 
4 2014 No health condition 
4 2016 TB 
;
run;

proc sql;
  create table want as
    select *
    from have
    group by id
    having Calendar_Year=max(Calendar_Year)
  ;
quit;

 

View solution in original post


All Replies
PROC Star
Posts: 549

Re: Keep only most recent records

[ Edited ]

how the output  should look like.

Contributor
Posts: 46

Re: Keep only most recent records

It doesn't matter much. If you have more than one way in your mind feel free to share. Well appreciated..
Thanks.
Solution
‎03-02-2018 09:25 AM
Respected Advisor
Posts: 4,779

Re: Keep only most recent records

[ Edited ]

What most forum members would appreciate is if you could post sample data in the form of a working SAS data step so we don't have to do the work for you.

It's not clear to me what constitutes the most recent exam. Below code sample assumes you're just after all the rows per ID with the latest year.

data have;
  infile datalines dlm=' ' truncover;
  input ID $ Calendar_Year Diagnosis $60.;
  datalines;
1 2016  TB 
1 2016 No health condition 
1 2016 No health condition 
2 2014 TB 
2 2016 Inactive/Latent Tuberculosis 
3 2014 TB 
3 2016 Inactive/Latent Tuberculosis 
3 2016 HIV 
3 2016 Hypertension 
4 2015 Cancer 
4 2014 No health condition 
4 2016 TB 
;
run;

proc sql;
  create table want as
    select *
    from have
    group by id
    having Calendar_Year=max(Calendar_Year)
  ;
quit;

 

Frequent Contributor
Posts: 112

Re: Keep only most recent records

/*The Code will look like this*/
proc sort data=have;
by ID Calendar_Year;
run;

data want;
set have;
by ID Calendar_Year;
if last.ID then output;
run;

/*But if you have a Diagnoses_Date then It would be Perfect*/
proc sort data=have;
by ID Diagnoses_Date;
run;

data want;
set have;
by ID Diagnoses_Date;
if last.ID then output;
run;
Contributor
Posts: 46

Re: Keep only most recent records

Posted in reply to Satish_Parida
Thanks for the effort,

I wanted to have not only the last line but all lines that have the latest exam date

for example, you go to the doctor and he writes each diagnosis on a separate line with the same date for all exam
thanks
Super User
Posts: 10,532

Re: Keep only most recent records

[ Edited ]

How 'bout

proc sql;
create table want as
select *
from have
group by id
having diag_date = max(diag_date);
quit;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Frequent Contributor
Posts: 112

Re: Keep only most recent records

In That Case:

 

Using Solution provide by @KurtBremser

 

proc sql;
create table want as
select a.* from have a ,
(select id,diag_date
from have
group by id
having diag_date = max(diag_date))
where a.id=b.id and a.diag_date=b.diag_date
;
quit;
PROC Star
Posts: 549

Re: Keep only most recent records

answer is important but asking question is even more important. it makes you successful. if you want to be successful it is very very important to ask question correctly and in effective manner, where in no one have to guess. Most of work in field depends on how to ask proper questions in business or in any other field

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 430 views
  • 0 likes
  • 5 in conversation