Hi,
I have a dataset with matched cases and control (dataset cc1- CCID =0 implies case and 1 is control), also I have the actual dataset for cases ( named case_1) and control (named ctrl_1)-- see below. For cases, I have overlap of drug SGA with GLP so the dataset has preoverlap duration, indexdate (which is start of GLP) and overlap duration. And for the control I just have SGA start and end date with no GLP. So I want to get the indexdate for control which should have the same preoverlap duration as the matched case. Please see the figure below.
data WORK.CC1;
infile datalines dsd truncover;
input index:$19. ENROLID:32. CCID:32. MATCHID:20.;
format MATCHID 20.;
label ENROLID="Enrollee ID";
datalines;
31ADUFEMA 3996211602 0 11
31ADUFEMA 4216560601 1 11
31ADUFEMA 4121700302 0 12
31ADUFEMA 34446086301 1 12
31ADUMALE 999351801 1 23
;;;;
data WORK.CASE_1;
infile datalines dsd truncover;
input ENROLID:32. SGA_START:MMDDYY10. SGA_END:MMDDYY10. GLP_START:MMDDYY10. GLP_END:MMDDYY10. overlap:32. overlap_start:MMDDYY10. overlap_end:MMDDYY10. overlap_length:32. pre_overlap_duration:32. total_dur:32. AGE_GRP:$3. GENDER:$4.;
format SGA_START MMDDYY10. SGA_END MMDDYY10. GLP_START MMDDYY10. GLP_END MMDDYY10. overlap_start MMDDYY10. overlap_end MMDDYY10.;
label ENROLID="Enrollee ID";
datalines;
29791002 05/03/2017 11/10/2017 08/04/2017 11/10/2017 2 08/04/2017 11/10/2017 99 93 192 ADU FEMA
29791002 09/27/2018 12/14/2019 09/28/2018 03/17/2019 3 09/28/2018 03/17/2019 171 1 172 ADU FEMA
29791002 09/27/2018 12/14/2019 06/06/2019 08/29/2019 4 06/06/2019 08/29/2019 85 252 337 ADU FEMA
33748104 01/24/2017 12/26/2019 02/03/2017 12/22/2019 1 02/03/2017 12/22/2019 1053 10 1063 YOU FEMA
98183902 01/22/2017 03/07/2017 01/24/2017 03/10/2017 1 01/24/2017 03/07/2017 43 2 45 ELD FEMA
;;;;
data WORK.CTRL_1;
infile datalines dsd truncover;
input ENROLID:32. CTRL_SGASTART:MMDDYY10. CTRL_SGAEND:MMDDYY10. day_sga:32. AGE_GRP:$3. GENDER:$4.;
format CTRL_SGASTART MMDDYY10. CTRL_SGAEND MMDDYY10.;
label ENROLID="Enrollee ID";
datalines;
68202 03/05/2018 07/28/2018 145 ADU FEMA
125603 01/11/2018 12/11/2019 699 ADU MALE
126003 01/01/2018 12/01/2019 699 ADU FEMA
350802 01/01/2018 12/20/2019 718 ADU FEMA
371401 02/07/2019 08/21/2019 195 ADU MALE
;;;;
FOR CASE | |||||||
ID | CCID | MATCHING ID | SGA_STAR | SGA_END | DM_START/INDEXDATE | DM_END | PREOVERLAP DURATION |
1 | 0 | 11 | 2/1/17 | 12/1/17 | 4/1/17 | 7/1/17 | 61 DAYS |
FOR CONTROL | |||||||
12 | 1 | 11 | 1/1/17 | 8/1/17 | -- | -- | -- |
OUTPUT DESIRED | |||||||
FOR CONTROL | INDEXDATE | ||||||
12 | 1 | 11 | 1/1/17 | 8/1/17 | 3/1/17 | 61 DAYS |
Both the group are matched on total duration, which is preoverlap+overlap duration for cases and SGA_end- sga_start for controls. Can you please help me to code index dates for control group.
Thank you
Process by MATCHID and CCID (so you process the CASE observation before the CONTROL observation).
When on a CASE calculate the offset. Remember it. When on a CONTROL apply it.
It is hard for me to see what you are calling the variables. So lets assume you have START_DATE and END_DATE values on all observations. But GLP_DATE only have values on the CASE observations.
So here is code to create INDEX_DATE values on all observations.
data want;
set have ;
by matchid ccid;
if ccid=0 then do;
offset = glp_date - start_date;
retain offset;
index_date = glp_date;
end;
else index_date = start_date + offset;
format index_date date9.;
run;
Process by MATCHID and CCID (so you process the CASE observation before the CONTROL observation).
When on a CASE calculate the offset. Remember it. When on a CONTROL apply it.
It is hard for me to see what you are calling the variables. So lets assume you have START_DATE and END_DATE values on all observations. But GLP_DATE only have values on the CASE observations.
So here is code to create INDEX_DATE values on all observations.
data want;
set have ;
by matchid ccid;
if ccid=0 then do;
offset = glp_date - start_date;
retain offset;
index_date = glp_date;
end;
else index_date = start_date + offset;
format index_date date9.;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.