Hi. I am trying to replace missing date values to expected date values if it falls before the max dates for EACH subject ID.
Here is the dataset
subject_id | visit_day | expected_visit_date | actual_visit_date |
1 | 1 | 1/1/2017 | 1/1/2017 |
1 | 2 | 1/2/2017 | 1/2/2017 |
1 | 3 | 1/3/2017 | |
1 | 4 | 1/4/2017 | 1/4/2017 |
2 | 1 | 2/1/2017 | 2/1/2017 |
2 | 2 | 2/2/2017 | |
2 | 3 | 2/3/2017 | |
2 | 4 | 2/4/2017 | 2/4/2017 |
2 | 5 | 2/5/2017 | |
2 | 6 | 2/6/2017 | 2/6/2017 |
3 | 1 | 3/1/2017 | 3/1/2017 |
3 | 2 | 3/2/2017 | 3/2/2017 |
3 | 3 | 3/3/2017 | 3/3/2017 |
3 | 4 | 3/4/2017 | 3/4/2017 |
3 | 5 | 3/5/2017 | 3/5/2017 |
3 | 6 | 3/6/2017 |
Here is the dataset that I would like:
subject_id | visit_day | expected_visit_date | actual_visit_date | visit_date_new |
1 | 1 | 1/1/2017 | 1/1/2017 | 1/1/2017 |
1 | 2 | 1/2/2017 | 1/2/2017 | 1/2/2017 |
1 | 3 | 1/3/2017 | 1/3/2017 | |
1 | 4 | 1/4/2017 | 1/4/2017 | 1/4/2017 |
2 | 1 | 2/1/2017 | 2/1/2017 | 2/1/2017 |
2 | 2 | 2/2/2017 | 2/2/2017 | |
2 | 3 | 2/3/2017 | 2/3/2017 | |
2 | 4 | 2/4/2017 | 2/4/2017 | 2/4/2017 |
2 | 5 | 2/5/2017 | 2/5/2017 | |
2 | 6 | 2/6/2017 | 2/6/2017 | 2/6/2017 |
3 | 1 | 3/1/2017 | 3/1/2017 | 3/1/2017 |
3 | 2 | 3/2/2017 | 3/2/2017 | 3/2/2017 |
3 | 3 | 3/3/2017 | 3/3/2017 | 3/3/2017 |
3 | 4 | 3/4/2017 | 3/4/2017 | 3/4/2017 |
3 | 5 | 3/5/2017 | 3/5/2017 | 3/5/2017 |
3 | 6 | 3/6/2017 |
If the Subject ID 3 does not have any max date value so it is left black since the VISIT was not completed.
I have tried the following code
proc sql;
select coalesce(actual_visit_date,expected_visit_date) as new_visit_date
from have;
quit;
Thanks!
Radhika
what if
subject 3 happens to be
3 | 1 | 3/1/2017 | 3/1/2017 |
3 | 2 | 3/2/2017 | 3/2/2017 |
3 | 3 | 3/3/2017 | |
3 | 4 | 3/4/2017 | |
3 | 5 | 3/5/2017 | 3/5/2017 |
3 | 6 | 3/6/2017 |
would you fill 3,4 ? or leave it blank>?
Great question! If the actual visit dates are blank, but observations before and after are filled then they do get replaced.
3 | 1 | 3/1/2017 | 3/1/2017 |
3 | 2 | 3/2/2017 | 3/2/2017 |
3 | 3 | 3/3/2017 | 3/3/2017 |
3 | 4 | 3/4/2017 | 3/4/2017 |
3 | 5 | 3/5/2017 | 3/5/2017 |
3 | 6 | 3/6/2017 |
Thank you @radhikaa4
data have;
infile cards truncover;
input subject_id visit_day expected_visit_date :mmddyy10. actual_visit_date :mmddyy10.;
format expected_visit_date mmddyy10. actual_visit_date mmddyy10.;
cards;
1 1 1/1/2017 1/1/2017
1 2 1/2/2017 1/2/2017
1 3 1/3/2017
1 4 1/4/2017 1/4/2017
2 1 2/1/2017 2/1/2017
2 2 2/2/2017
2 3 2/3/2017
2 4 2/4/2017 2/4/2017
2 5 2/5/2017
2 6 2/6/2017 2/6/2017
3 1 3/1/2017 3/1/2017
3 2 3/2/2017 3/2/2017
3 3 3/3/2017 3/3/2017
3 4 3/4/2017 3/4/2017
3 5 3/5/2017 3/5/2017
3 6 3/6/2017
;
data want;
call missing(f);
do until(last.subject_id);
set have;
by subject_id;
if not missing(actual_visit_date) then f=actual_visit_date;
end;
do until(last.subject_id);
set have;
by subject_id;
actual_visit_date=ifn(expected_visit_date<= f and missing(actual_visit_date),expected_visit_date,actual_visit_date);
output;
end;
drop f;
run;
Oops you wanted a new column visit_date_new
data want;
call missing(f);
do until(last.subject_id);
set have;
by subject_id;
if not missing(actual_visit_date) then f=actual_visit_date;
end;
do until(last.subject_id);
set have;
by subject_id;
visit_date_new=ifn(expected_visit_date<= f and missing(actual_visit_date),expected_visit_date,actual_visit_date);
output;
end;
drop f;
format visit_date_new mmddyy10.;
run;
And proc sql-->
proc sql;
create table want as
select *,ifn(expected_visit_date<= max(actual_visit_date),expected_visit_date,actual_visit_date) as new_visit_Date format=mmddyy10.
from have
group by subject_id
order by subject_id, visit_day;
quit;
data have;
input subject_id visit_day expected_visit_date : mmddyy10. actual_visit_date mmddyy10.;
format expected_visit_date actual_visit_date mmddyy10.;
datalines;
1 1 1/1/2017 1/1/2017
1 2 1/2/2017 1/2/2017
1 3 1/3/2017
1 4 1/4/2017 1/4/2017
2 1 2/1/2017 2/1/2017
2 2 2/2/2017
2 3 2/3/2017
2 4 2/4/2017 2/4/2017
2 5 2/5/2017
2 6 2/6/2017 2/6/2017
3 1 3/1/2017 3/1/2017
3 2 3/2/2017 3/2/2017
3 3 3/3/2017 3/3/2017
3 4 3/4/2017 3/4/2017
3 5 3/5/2017 3/5/2017
3 6 3/6/2017
;
run;
proc sql;
create table want as
select Subject_id,visit_day,expected_visit_date,coalesce(actual_visit_date,newd) as actual_visit_date format=mmddyy10. from
(select Subject_id,visit_day,expected_visit_date,actual_visit_date,case when max(actual_visit_date) gt 0 and (v gt visit_day) then coalesce(actual_visit_date,expected_visit_date) else . end as newd format=mmddyy10. from
(select Subject_id, visit_day , expected_visit_date,actual_visit_date,max(visit_day) as v from have group by Subject_id)) order by subject_id,visit_day;
quit;
Hi @singhsahab Appreciate your effort and attempt. However, the code doesn't look very desirable from the point of view ease of use or even application of logic. From the start of typing to , testing and moving to production and then post production maintenance, I guess it's hard to envision implementing your code. I wonder how your code can be an alternative to the rest. Nevertheless, good contribution
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.