stellapersis7
Obsidian | Level 7

Hi all,
I am trying to match cases to controls (1:4) in a study. PROC SURVEYSELECT gives me the error shown below and stops. I need help, please.

This is my code:

/* merge cases and controls */
data md_merged_GLP;
set MD_glpcasemerge_2 MD_ctrl_merge2;
run;

/* trial 2 with duration of controls greater than cases */
/* split into cases (cc=1) and controls */
data md_study_GLP md_control_GLP;
set md_merged_GLP;
rand_num=uniform(0);
if cc=1 then output md_study_GLP;
else output md_control_GLP;
run;

/* matching windows for each case */
data md_study_GLP2;
set md_study_GLP;
duration_low = duration+1;
age_low = age-2;
age_high = age+2;
run;

data md_control_GLP2;
set md_control_GLP;
run;

/* all eligible case-control pairs */
proc sql;
create table md_GLP_controls_id
as select
one.enrolid as study_id,
two.enrolid as control_id,
one.age as study_age,
two.age as control_age,
one.sex as study_sex,
two.sex as control_sex,
one.duration as study_duration,
two.duration as control_duration,
one.rand_num as rand_num
from md_study_GLP2 one, md_control_GLP2 two
where (two.age between one.age_low and one.age_high and one.sex=two.sex
and two.duration ge one.duration_low
and two.CTRL_EDAYS ge one.duration_365 ) ;
quit;

proc sort data=md_GLP_controls_id out = MD_SGACONTROL nodupkey ;
by study_id control_id;
run;

/* pick 4 controls at random for each case */
proc surveyselect data=MD_SGACONTROL
method=srs n=4 out=MD_GLP_caseids;
strata study_id ;
SAMPLINGUNIT control_id ;
run;

This is the error I get and it stops working:


955 proc surveyselect data=MD_SGACONTROL
956 seed = 123
957 method=srs n=4 out=MD_GLP_caseids;
958 strata study_id ;
959 SAMPLINGUNIT control_id ;
960 run;

ERROR: The sample size, 4, is greater than the number of sampling units, 1.
NOTE: The above message was for the following stratum:
Enrollee ID=20012389602.
ERROR: The sample size, 4, is greater than the number of sampling units, 1.
NOTE: The above message was for the following stratum:
Enrollee ID=20013082518.
ERROR: The sample size, 4, is greater than the number of sampling units, 1.
NOTE: The above message was for the following stratum:
Enrollee ID=20021334804.
ERROR: The sample size, 4, is greater than the number of sampling units, 2.
NOTE: The above message was for the following stratum:
Enrollee ID=20040835820.
ERROR: The sample size, 4, is greater than the number of sampling units, 1.
NOTE: The above message was for the following stratum:
Enrollee ID=20041167418.
ERROR: The sample size, 4, is greater than the number of sampling units, 1.
NOTE: The above message was for the following stratum:
Enrollee ID=20054274425.
ERROR: The sample size, 4, is greater than the number of sampling units, 1.
NOTE: The above message was for the following stratum:
Enrollee ID=20054544231.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MD_GLP_CASEIDS may be incomplete. When this step was stopped there were 5472
observations and 11 variables.
WARNING: Data set WORK.MD_GLP_CASEIDS was not replaced because this step was stopped.
NOTE: PROCEDURE SURVEYSELECT used (Total process time):
real time 9.10 seconds
cpu time 7.80 seconds

I don't mind if those 7 IDs, which do not have 4 matched controls each, are dropped.

Below is the data set md_merged_glp:


data WORK.MD_MERGED_GLP;
infile datalines truncover; /* DSD dropped: the datalines as shown here are space-delimited */

input ENROLID:32. SGA_START:MMDDYY10. SGA_END:MMDDYY10. GLP_START:MMDDYY10. GLP_END:MMDDYY10.
      overlap_start:MMDDYY10. overlap_end:MMDDYY10. overlap_length:32. duration:32. SEX:$1. age:32. CC:32.
      case_180:MMDDYY10. case_365:MMDDYY10. duration_180:32. duration_365:32.
      E_START:MMDDYY10. E_END:MMDDYY10. sga_end1:MMDDYY10.
      CTRL_SGASTART:MMDDYY10. CTRL_SGAEND:MMDDYY10. CTRL_EDAYS:32.;
format SGA_START MMDDYY10. SGA_END MMDDYY10. GLP_START MMDDYY10. GLP_END MMDDYY10.
       overlap_start MMDDYY10. overlap_end MMDDYY10. case_180 MMDDYY10. case_365 MMDDYY10.
       E_START MMDDYY10. E_END MMDDYY10. sga_end1 MMDDYY10. CTRL_SGASTART MMDDYY10. CTRL_SGAEND MMDDYY10.;
label ENROLID="Enrollee ID" SEX="Gender of Patient" age="Age of Patient";
datalines;
20000046218 01/26/2018 03/22/2019 01/06/2018 03/16/2018 01/26/2018 03/16/2018 50 0 1 57 1 07/05/2018 01/06/2019 180 365 01/01/2017 12/31/2019 01/06/2019 . . .
20000046218 01/01/2018 02/23/2018 01/06/2018 03/16/2018 01/06/2018 02/23/2018 49 5 1 57 1 07/05/2018 01/06/2019 185 370 01/01/2017 12/31/2019 02/23/2018 . . .
20000048519 01/03/2018 02/28/2018 01/11/2018 05/17/2018 01/11/2018 02/28/2018 49 8 2 48 1 07/10/2018 01/11/2019 188 373 01/01/2017 12/31/2019 02/28/2018 . . .
20000049149 04/27/2017 05/18/2018 02/15/2018 06/12/2018 02/15/2018 05/18/2018 93 294 2 44 1 08/14/2018 02/15/2019 474 659 01/01/2017 12/31/2019 05/18/2018 . . .
20000058589 05/08/2018 08/10/2018 05/08/2018 09/14/2018 05/08/2018 08/10/2018 95 0 2 45 1 11/04/2018 05/08/2019 180 365 01/01/2017 12/31/2019 08/10/2018 . . .
20000058589 06/28/2017 02/19/2018 11/28/2017 03/01/2018 11/28/2017 02/19/2018 84 153 2 45 1 05/27/2018 11/28/2018 333 518 01/01/2017 12/31/2019 02/19/2018 . . .
20000273217 08/29/2017 12/28/2019 04/19/2018 12/27/2019 04/19/2018 12/27/2019 618 233 2 63 1 10/16/2018 04/19/2019 413 598 01/01/2017 12/31/2019 04/19/2019 . . .
20000298367 01/02/2017 12/29/2019 06/26/2018 07/02/2019 06/26/2018 07/02/2019 372 540 1 45 1 12/23/2018 06/26/2019 720 905 01/01/2017 12/31/2019 06/26/2019 . . .
20000298367 01/02/2017 12/29/2019 12/03/2018 05/05/2019 12/03/2018 05/05/2019 154 700 1 45 1 06/01/2019 12/03/2019 880 1065 01/01/2017 12/31/2019 12/03/2019 . . .
20000304815 01/31/2018 11/07/2019 01/16/2018 08/31/2019 01/31/2018 08/31/2019 578 0 2 48 1 07/15/2018 01/16/2019 180 365 01/01/2017 12/31/2019 01/16/2019 . . .
;;;;
NOTE: There were 10 observations read from the data set WORK.MD_MERGED_GLP.
NOTE: DATA statement used (Total process time):
real time 0.32 seconds
cpu time 0.00 seconds

1 REPLY 1
PaigeMiller
Diamond | Level 26

What exactly do you need help with? It sounds like you understand the error and are willing to drop the 7 enrollee IDs where the error happens.
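
If dropping those IDs is the goal, one possible approach is to remove every study_id that has fewer than 4 candidate controls before calling PROC SURVEYSELECT, so that no stratum is smaller than the requested sample size. The following is only a minimal sketch under that assumption. The data set name MD_SGACONTROL_4PLUS is made up for illustration, seed=123 is taken from the log above, and the code has not been run against the real data.

```sas
/* Keep only cases (study_id) that have at least 4 distinct candidate controls. */
/* MD_SGACONTROL_4PLUS is a hypothetical name chosen for this sketch.           */
proc sql;
  create table MD_SGACONTROL_4PLUS as
  select *
  from MD_SGACONTROL
  group by study_id
  having count(distinct control_id) >= 4
  order by study_id, control_id;   /* STRATA expects data sorted by study_id */
quit;

/* Every remaining stratum now has at least 4 sampling units. */
proc surveyselect data=MD_SGACONTROL_4PLUS
  method=srs n=4 seed=123 out=MD_GLP_caseids;
  strata study_id;
  samplingunit control_id;
run;
```

PROC SQL will write a note to the log about remerging summary statistics with the original data; that is expected for this kind of GROUP BY / HAVING filter. If you would rather keep those cases with however many controls they have, the SELECTALL option on the PROC SURVEYSELECT statement may be worth checking in the documentation.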

--
Paige Miller
