I have the following scenario with two data sets. I have to find the proportion of persons who tested within 5 to 10 days from exposure date. I deduplicated the datasets, merged them then found the diff_date and flagged it as follows. In the end, I found the percentage of date difference in two ways. What I'm trying to do is get a finer output report by using a better syntax.
data Exposure;
input Subject_ID Exp_Date : mmddyy10. Name $14-22 Case_ID Exposure $;
format exp_date date9.;
datalines;
1 12/20/2019 Joe Smith 122 Home
1 09/30/2019 Joe Smith 435 Home
2 01/05/2020 Mary Mill 432 Work
3 11/04/2019 John Doe 129 Work
3 11/04/2019 John Doe 655 School
3 01/25/2020 John Doe 721 Work
;;;
run;
proc sort data = work.exposure out=exp_nodups nodupkey;
by Subject_ID Name Exp_date;
run;
data Testing;
input Subject_ID Test_Date : mmddyy10. Name $14-23;
format test_date date9.;
datalines;
1 12/29/2019 Joe Smith
2 09/30/2019 Mary Mill
2 01/07/2020 Mary Mill
8 06/30/2019 Mark Roger
8 07/15/2019 Mark Roger
8 08/29/2019 Mark Roger
8 12/03/2019 Mark Roger
;;;;
run;
proc sort data = work.testing out=test_nodups nodupkey;
by Subject_ID Name test_date;
run;
data exp_;
set exp_nodups;
day5 = exp_date + 5;
day10 = exp_date + 10;
run;
Proc SQL;
Create Table Merge AS
Select
coalesce(E.Name, T.Name) as Name,
E.Exp_Date,E.Case_ID,E.Exposure,day5,day10,
coalesce(E.Subject_ID, T.Subject_ID) as Subject_ID,
T.Test_Date
from exp_ E FULL JOIN test_nodups T on (E.Subject_ID = T.Subject_ID);
quit;
data merged;
set merge;
dif_days = intck('day', exp_date, test_date);
flag = day5<=test_date<=day10;
run;
proc freq data = merged;
tables flag/norow nocol out= want;
title 'Proportion of contacts tested in the 5-10 day window of their quarantine period';
run;
proc SQL;
Create Table proportion AS
Select
mean(((dif_days>=5)*(dif_days<=10))) as contact_proportion
from merged;
quit;
Thank you
What does "finer output" mean for this?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.