Hi all,
I am trying to match cases and controls (1:4) in a study. PROC SURVEYSELECT gives me the error below and stops working. I need help, please.
This is my code:
/* merge cases and control*/
data md_merged_GLP;
set MD_glpcasemerge_2 MD_ctrl_merge2;
run;
/* trial 2 with duration of controls greater than cases */
data md_study_GLP md_control_GLP;
set md_merged_GLP;
rand_num=uniform(0);
if cc=1 then output md_study_GLP;
else output md_control_GLP;
run;
data md_study_GLP2;
set md_study_GLP;
duration_low = duration+1;
age_low = age-2;
age_high = age+2;
run;
data md_control_GLP2;
set md_control_GLP;
run;
proc sql;
create table md_GLP_controls_id
as select
one.enrolid as study_id,
two.enrolid as control_id,
one.age as study_age,
two.age as control_age,
one.sex as study_sex,
two.sex as control_sex,
one.duration as study_duration,
two.duration as control_duration,
one.rand_num as rand_num
from md_study_GLP2 one, md_control_GLP2 two
where (two.age between one.age_low and one.age_high and one.sex=two.sex
and two.duration ge one.duration_low
and two.CTRL_EDAYS ge one.duration_365 ) ;
quit;
proc sort data=md_GLP_controls_id out = MD_SGACONTROL nodupkey ;
by study_id control_id;
run;
proc surveyselect data=MD_SGACONTROL
method=srs n=4 out=MD_GLP_caseids;
strata study_id ;
SAMPLINGUNIT control_id ;
run;
This is the error I get and it stops working:
955 proc surveyselect data=MD_SGACONTROL
956 seed = 123
957 method=srs n=4 out=MD_GLP_caseids;
958 strata study_id ;
959 SAMPLINGUNIT control_id ;
960 run;
ERROR: The sample size, 4, is greater than the number of sampling units, 1.
NOTE: The above message was for the following stratum:
Enrollee ID=20012389602.
ERROR: The sample size, 4, is greater than the number of sampling units, 1.
NOTE: The above message was for the following stratum:
Enrollee ID=20013082518.
ERROR: The sample size, 4, is greater than the number of sampling units, 1.
NOTE: The above message was for the following stratum:
Enrollee ID=20021334804.
ERROR: The sample size, 4, is greater than the number of sampling units, 2.
NOTE: The above message was for the following stratum:
Enrollee ID=20040835820.
ERROR: The sample size, 4, is greater than the number of sampling units, 1.
NOTE: The above message was for the following stratum:
Enrollee ID=20041167418.
ERROR: The sample size, 4, is greater than the number of sampling units, 1.
NOTE: The above message was for the following stratum:
Enrollee ID=20054274425.
ERROR: The sample size, 4, is greater than the number of sampling units, 1.
NOTE: The above message was for the following stratum:
Enrollee ID=20054544231.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MD_GLP_CASEIDS may be incomplete. When this step was stopped there were 5472
observations and 11 variables.
WARNING: Data set WORK.MD_GLP_CASEIDS was not replaced because this step was stopped.
NOTE: PROCEDURE SURVEYSELECT used (Total process time):
real time 9.10 seconds
cpu time 7.80 seconds
I don't mind if the 7 IDs that do not have 4 matched controls each are dropped.
Below is the data set md_merged_glp:
data WORK.MD_MERGED_GLP;
infile datalines dsd dlm=' ' truncover; /* the values below are space-delimited */
input ENROLID:32. SGA_START:MMDDYY10. SGA_END:MMDDYY10. GLP_START:MMDDYY10. GLP_END:MMDDYY10.
overlap_start:MMDDYY10. overlap_end:MMDDYY10. overlap_length:32. duration:32. SEX:$1. age:32. CC:32.
case_180:MMDDYY10. case_365:MMDDYY10. duration_180:32. duration_365:32.
E_START:MMDDYY10. E_END:MMDDYY10. sga_end1:MMDDYY10. CTRL_SGASTART:MMDDYY10. CTRL_SGAEND:MMDDYY10. CTRL_EDAYS:32.;
format SGA_START MMDDYY10. SGA_END MMDDYY10. GLP_START MMDDYY10. GLP_END MMDDYY10. overlap_start MMDDYY10. overlap_end MMDDYY10. case_180 MMDDYY10. case_365 MMDDYY10. E_START MMDDYY10. E_END MMDDYY10. sga_end1 MMDDYY10. CTRL_SGASTART MMDDYY10. CTRL_SGAEND
MMDDYY10.;
label ENROLID="Enrollee ID" SEX="Gender of Patient" age="Age of Patient";
datalines;
20000046218 01/26/2018 03/22/2019 01/06/2018 03/16/2018 01/26/2018 03/16/2018 50 0 1 57 1 07/05/2018 01/06/2019 180 365 01/01/2017 12/31/2019 01/06/2019 . . .
20000046218 01/01/2018 02/23/2018 01/06/2018 03/16/2018 01/06/2018 02/23/2018 49 5 1 57 1 07/05/2018 01/06/2019 185 370 01/01/2017 12/31/2019 02/23/2018 . . .
20000048519 01/03/2018 02/28/2018 01/11/2018 05/17/2018 01/11/2018 02/28/2018 49 8 2 48 1 07/10/2018 01/11/2019 188 373 01/01/2017 12/31/2019 02/28/2018 . . .
20000049149 04/27/2017 05/18/2018 02/15/2018 06/12/2018 02/15/2018 05/18/2018 93 294 2 44 1 08/14/2018 02/15/2019 474 659 01/01/2017 12/31/2019 05/18/2018 . . .
20000058589 05/08/2018 08/10/2018 05/08/2018 09/14/2018 05/08/2018 08/10/2018 95 0 2 45 1 11/04/2018 05/08/2019 180 365 01/01/2017 12/31/2019 08/10/2018 . . .
20000058589 06/28/2017 02/19/2018 11/28/2017 03/01/2018 11/28/2017 02/19/2018 84 153 2 45 1 05/27/2018 11/28/2018 333 518 01/01/2017 12/31/2019 02/19/2018 . . .
20000273217 08/29/2017 12/28/2019 04/19/2018 12/27/2019 04/19/2018 12/27/2019 618 233 2 63 1 10/16/2018 04/19/2019 413 598 01/01/2017 12/31/2019 04/19/2019 . . .
20000298367 01/02/2017 12/29/2019 06/26/2018 07/02/2019 06/26/2018 07/02/2019 372 540 1 45 1 12/23/2018 06/26/2019 720 905 01/01/2017 12/31/2019 06/26/2019 . . .
20000298367 01/02/2017 12/29/2019 12/03/2018 05/05/2019 12/03/2018 05/05/2019 154 700 1 45 1 06/01/2019 12/03/2019 880 1065 01/01/2017 12/31/2019 12/03/2019 . . .
20000304815 01/31/2018 11/07/2019 01/16/2018 08/31/2019 01/31/2018 08/31/2019 578 0 2 48 1 07/15/2018 01/16/2019 180 365 01/01/2017 12/31/2019 01/16/2019 . . .
;;;;
NOTE: There were 10 observations read from the data set WORK.MD_MERGED_GLP.
NOTE: DATA statement used (Total process time):
real time 0.32 seconds
cpu time 0.00 seconds
What exactly do you need help with? It sounds like you understand the error and are willing to drop the 7 enrollee IDs where the error happens.
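If the goal is simply to drop the cases that have fewer than 4 candidate controls, one option is to filter MD_SGACONTROL before sampling so that every stratum passed to PROC SURVEYSELECT has at least 4 sampling units. The following is a minimal sketch, not tested against your data. It assumes MD_SGACONTROL has one row per study_id/control_id pair (as the NODUPKEY sort produces), and the table name MD_SGACONTROL_4plus is made up for illustration.

```sas
/* Keep only the cases (study_id) with at least 4 distinct candidate controls. */
/* MD_SGACONTROL_4plus is a hypothetical name used only for this sketch.       */
proc sql;
create table MD_SGACONTROL_4plus as
select *
from MD_SGACONTROL
where study_id in
  (select study_id
   from MD_SGACONTROL
   group by study_id
   having count(distinct control_id) ge 4)
order by study_id, control_id;  /* STRATA expects input sorted by study_id */
quit;

/* Same sampling step as before, now run on the filtered table. */
proc surveyselect data=MD_SGACONTROL_4plus
method=srs n=4 seed=123 out=MD_GLP_caseids;
strata study_id;
samplingunit control_id;
run;
```

If you would rather keep those cases with however many controls they have, I believe the SELECTALL option on the PROC SURVEYSELECT statement selects all units in a stratum that has fewer than n, but please check the documentation for your SAS release. Also note that in either approach the same control can still be selected for more than one case, because each stratum is sampled independently.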