BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
stellapersis7
Obsidian | Level 7

Hi,

I have a dataset with matched cases and control (dataset cc1- CCID =0 implies case and 1 is control), also I have the actual dataset for cases ( named case_1) and control (named ctrl_1)-- see below. For cases, I have overlap of drug SGA with GLP so the dataset has preoverlap duration, indexdate (which is start of GLP) and overlap duration. And for the control I just have SGA start and end date with no GLP. So I want to get the indexdate for control which should have the same preoverlap duration as the matched case. Please see the figure below. 

Screen Shot 2024-06-03 at 14.42.09.png

data WORK.CC1;
infile datalines dsd truncover;
input index:$19. ENROLID:32. CCID:32. MATCHID:20.;
format MATCHID 20.;
label ENROLID="Enrollee ID";
datalines;
31ADUFEMA 3996211602 0 11
31ADUFEMA 4216560601 1 11
31ADUFEMA 4121700302 0 12
31ADUFEMA 34446086301 1 12
31ADUMALE 999351801 1 23
;;;;

data WORK.CASE_1;
infile datalines dsd truncover;
input ENROLID:32. SGA_START:MMDDYY10. SGA_END:MMDDYY10. GLP_START:MMDDYY10. GLP_END:MMDDYY10. overlap:32. overlap_start:MMDDYY10. overlap_end:MMDDYY10. overlap_length:32. pre_overlap_duration:32. total_dur:32. AGE_GRP:$3. GENDER:$4.;
format SGA_START MMDDYY10. SGA_END MMDDYY10. GLP_START MMDDYY10. GLP_END MMDDYY10. overlap_start MMDDYY10. overlap_end MMDDYY10.;
label ENROLID="Enrollee ID";
datalines;
29791002 05/03/2017 11/10/2017 08/04/2017 11/10/2017 2 08/04/2017 11/10/2017 99 93 192 ADU FEMA
29791002 09/27/2018 12/14/2019 09/28/2018 03/17/2019 3 09/28/2018 03/17/2019 171 1 172 ADU FEMA
29791002 09/27/2018 12/14/2019 06/06/2019 08/29/2019 4 06/06/2019 08/29/2019 85 252 337 ADU FEMA
33748104 01/24/2017 12/26/2019 02/03/2017 12/22/2019 1 02/03/2017 12/22/2019 1053 10 1063 YOU FEMA
98183902 01/22/2017 03/07/2017 01/24/2017 03/10/2017 1 01/24/2017 03/07/2017 43 2 45 ELD FEMA
;;;;

data WORK.CTRL_1;
infile datalines dsd truncover;
input ENROLID:32. CTRL_SGASTART:MMDDYY10. CTRL_SGAEND:MMDDYY10. day_sga:32. AGE_GRP:$3. GENDER:$4.;
format CTRL_SGASTART MMDDYY10. CTRL_SGAEND MMDDYY10.;
label ENROLID="Enrollee ID";
datalines;
68202 03/05/2018 07/28/2018 145 ADU FEMA
125603 01/11/2018 12/11/2019 699 ADU MALE
126003 01/01/2018 12/01/2019 699 ADU FEMA
350802 01/01/2018 12/20/2019 718 ADU FEMA
371401 02/07/2019 08/21/2019 195 ADU MALE
;;;;

FOR CASE 
ID  CCID MATCHING ID SGA_STAR SGA_END  DM_START/INDEXDATE DM_END PREOVERLAP DURATION
1 0 11 2/1/17 12/1/17 4/1/17 7/1/17 61 DAYS
FOR CONTROL
12 1 11 1/1/17 8/1/17 -- -- --
               
OUTPUT DESIRED  
FOR CONTROL  INDEXDATE    
12 1 11 1/1/17 8/1/17 3/1/17   61 DAYS

Both the group are matched on total duration, which is preoverlap+overlap duration for cases and SGA_end- sga_start for controls. Can you please help me to code index dates for control group.

Thank you

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Process by MATCHID and CCID (so you process the CASE observation before the CONTROL observation).

When on a CASE calculate the offset.  Remember it.  When on a CONTROL apply it.

 

It is hard for me to see what you are calling the variables.  So lets assume you have START_DATE and END_DATE values on all observations.  But GLP_DATE only have values on the CASE observations. 

 

So here is code to create INDEX_DATE values on all observations.

data want;
  set have ;
  by matchid ccid;
  if ccid=0 then do;
    offset = glp_date - start_date;
    retain offset;
    index_date = glp_date;
  end;
  else index_date = start_date + offset;
  format index_date date9.;
run;

View solution in original post

1 REPLY 1
Tom
Super User Tom
Super User

Process by MATCHID and CCID (so you process the CASE observation before the CONTROL observation).

When on a CASE calculate the offset.  Remember it.  When on a CONTROL apply it.

 

It is hard for me to see what you are calling the variables.  So lets assume you have START_DATE and END_DATE values on all observations.  But GLP_DATE only have values on the CASE observations. 

 

So here is code to create INDEX_DATE values on all observations.

data want;
  set have ;
  by matchid ccid;
  if ccid=0 then do;
    offset = glp_date - start_date;
    retain offset;
    index_date = glp_date;
  end;
  else index_date = start_date + offset;
  format index_date date9.;
run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 690 views
  • 1 like
  • 2 in conversation