BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
SP01
Obsidian | Level 7

The current data structure for the problem I have is:

id L_date M_date Note
1 9/1/2020 9/29/2020 keep, as it is closest
1 8/30/2020 9/29/2020 change to null
2 8/13/2022 10/21/2022 change to null
2 10/1/2022 10/21/2022 keep, as it is closest
2 10/2/2023 10/21/2022 change to null
3 12/11/2021 .  
3 5/19/2022 .  
4 2/5/2022 12/11/2021 keep, as it is closest
4 3/19/2022 12/11/2021 change to null
4 6/21/2022 12/11/2021 change to null

 

I am trying to retain the closest M_date column value that is to the L_date value and turn the other same M_date values for the same ID to null.

 

The data structure that I am trying to achieve is below:

id L_date M_date
1 9/1/2020 9/29/2020
1 8/30/2020  
2 8/13/2022  
2 10/1/2022 10/21/2022
2 10/2/2023  
3 12/11/2021 .
3 5/19/2022 .
4 2/5/2022 12/11/2021
4 3/19/2022  
4 6/21/2022  
1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Here's a solution, I expanded your sample data to include two different scenarios - one where there is a tie and one where there are missing M_Date and non missing M_date for the same ID.

 

data have;
infile cards truncover;
input id 1. L_date : mmddyy10.  M_date : mmddyy10.;
format l_date m_date yymmdd10.;;
cards;
1   9/1/2020    9/29/2020   
1   8/30/2020   9/27/2020
2   8/13/2022   10/21/2022
2   10/1/2022   10/21/2022
2   10/2/2023   10/21/2022
3   12/11/2021  .    
3   5/19/2022   .    
4   2/5/2022    12/11/2021
4   3/19/2022   12/11/2021  
4   6/21/2022   12/11/2021
5   2/5/2022    12/11/2021
5   2/5/2022    12/14/2021
5   3/19/2022   .
5   6/21/2022   .
;;;
run;

data step1;
set have;
abs_diff = abs(m_date-l_date);
run;

proc sort data=step1;
by id abs_diff;
run;

data want;
set step1;
by id;
retain flag;
if first.id then flag=0;

if first.id and not missing(abs_diff) then flag=1;
else if flag=0 and not missing(abs_diff) then flag=1;
else if flag=1 then call missing(m_date);

run;

View solution in original post

12 REPLIES 12
Reeza
Super User
Should L be sorted in any manner? It doesn't appear to be consistently sorted.
SP01
Obsidian | Level 7

It can be sorted from oldest dates first to newest dates last.

PaigeMiller
Diamond | Level 26
  1. Subtract L_date from M_date; then take the absolute value.
  2. Run PROC RANK on the absolute value of the difference, computed in step 1, use a BY ID; statement
  3. If the ranked value is 1 then keep M_date; other wise set M_date to null
--
Paige Miller
Reeza
Super User
And how to deal with ties?
PaigeMiller
Diamond | Level 26

@Reeza wrote:
And how to deal with ties?

Don't ask me, that's up to the OP @SP01 

--
Paige Miller
SP01
Obsidian | Level 7

Ties can be left  as same.

Reeza
Super User

Here's a solution, I expanded your sample data to include two different scenarios - one where there is a tie and one where there are missing M_Date and non missing M_date for the same ID.

 

data have;
infile cards truncover;
input id 1. L_date : mmddyy10.  M_date : mmddyy10.;
format l_date m_date yymmdd10.;;
cards;
1   9/1/2020    9/29/2020   
1   8/30/2020   9/27/2020
2   8/13/2022   10/21/2022
2   10/1/2022   10/21/2022
2   10/2/2023   10/21/2022
3   12/11/2021  .    
3   5/19/2022   .    
4   2/5/2022    12/11/2021
4   3/19/2022   12/11/2021  
4   6/21/2022   12/11/2021
5   2/5/2022    12/11/2021
5   2/5/2022    12/14/2021
5   3/19/2022   .
5   6/21/2022   .
;;;
run;

data step1;
set have;
abs_diff = abs(m_date-l_date);
run;

proc sort data=step1;
by id abs_diff;
run;

data want;
set step1;
by id;
retain flag;
if first.id then flag=0;

if first.id and not missing(abs_diff) then flag=1;
else if flag=0 and not missing(abs_diff) then flag=1;
else if flag=1 then call missing(m_date);

run;
Astounding
PROC Star
Anybody want to try their hand at a one-step double DOW solution?
SP01
Obsidian | Level 7

Yes, what is DOW?

Astounding
PROC Star
DOW is a DO loop that runs through groups of observations. It was popularized by Ian Whitlock (although he didn't name it so it is not clear whether the W stands for Whitlock). The basic form for this problem would be:

data want;
do until (last.id);
set have;
*code to find min diff for this id;
end;
do until (last.id);
set have;
by id;
*code to compare same observations to min and output;
end;
run;

You could also take a similar approach without the explicit looping:

data want;
set have (in=a) have;
by id;
if a then do;
end;
else do;
end;
run;

The bottom program would require retaining some calculated results, and checking for first.id.
Ksharp
Super User
data have;
infile cards truncover;
input id 1. L_date : mmddyy10.  M_date : mmddyy10.;
format l_date m_date yymmdd10.;;
cards;
1   9/1/2020    9/29/2020   
1   8/30/2020   9/29/2020
2   8/13/2022   10/21/2022
2   10/1/2022   10/21/2022
2   10/2/2023   10/21/2022
3   12/11/2021  .    
3   5/19/2022   .    
4   2/5/2022    12/11/2021
4   3/19/2022   12/11/2021  
4   6/21/2022   12/11/2021
5   2/5/2022    12/11/2021
5   3/19/2022   .
5   6/21/2022   .
;;;
run;

proc sql;
create table want(drop=diff M_date) as
select *,ifn(diff=min(diff),M_date,.) as new_Mdate format=mmddyy10. from
(select *, abs(M_date-L_date) as diff from have)
group by id 
order by 1,2;
quit;
lavu_90
Calcite | Level 5

Here first ID both records dates l_date & m_Date have same days difference (28 days) , it means m_date should populate for both records ?

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1278 views
  • 7 likes
  • 6 in conversation