BookmarkSubscribeRSS Feed
Denali
Quartz | Level 8

Hi,

 

I used the SAS code from this article (https://www.frontiersin.org/articles/10.3389/fdata.2019.00004/full) to do 1:2 matching for 1 case with 2 controls. 

 

Does anyone know how to generate a column with case ID numbers that the 2 controls matched with in the "matches" file step?

 

/*Example Code*/

DATA population; /*Example of population dataset*/

input uniqueid gender $ age ethnic $ casecontrol; /*gender and ethnic are defined as categorical variables by the following $*/

cards;

              1 f 26 1 1

              2 m 27 1 1

              3 m 30 2 0

              4 f 25 2 0

              5 m 29 1 0

              6 f 26 1 1

              7 f 28 2 0

              8 f 28 1 0

              9 m 25 1 1

              10 m 26 2 0

              11 f 30 3 0

              12 m 29 2 0

              13 f 27 1 0

              14 m 27 3 0

              15 f 26 1 1

              16 f 25 2 0

              17 f 29 1 0

              18 f 30 3 0

              19 m 26 1 0

              20 m 26 2 1

              21 f 30 2 0

              22 m 25 3 0

              23 m 27 1 0

              24 m 28 1 1

;

run;

 

%LET agerange = 5; /*For this example, we have chosen the age range to be 5. Controls can therefore be up to 5 years younger or older than the case*/

%LET ratio = 3; /*We have chosen to match 3 controls for each case*/

 

DATA cases controls;

   SET population;

   IF casecontrol = 1 THEN OUTPUT cases;

   ELSE OUTPUT controls;

RUN;

 

PROC FREQ NOPRINT DATA=cases;

   TABLES age*gender*ethnic/OUT=caseout; /*Since we want ‘ethnic’ to be a matching variable, we need to add it here*/

RUN;

 

%MACRO sample(v_age, v_gender, v_count,v_ethnic); /*Here, ethnic is also added*/

 

   DATA qualify1;

      SET controls;

    WHERE (&v_age-&agerange <= age <= &v_age+&agerange)

      AND

      (gender = "&v_gender")

      AND

      (ethnic = "&v_ethnic"); /*You will also need to add your extra variable to this step.*/

 

   case_age = &v_age;

   case_gender = "&v_gender";

   case_ethnic = "&v_ethnic"; /* You will also need to add your extra variable to this step.*/

 

   SEED = RANUNI(0);

   PROC SORT;

      BY SEED;

 

   DATA qualify2;

      SET qualify1 NOBS=totobs;

      IF _N_ <= &v_count*&ratio;

      IF &v_count*&ratio <= totobs THEN tag = 'yes';

         ELSE tag = 'no';

 

   PROC APPEND BASE=matches DATA=qualify2 force; /*new data set matches will contain the matched controls*/

 

   PROC SORT DATA=qualify2 OUT=temp1 (KEEP=uniqueid);

      BY uniqueid;

 

   PROC SORT DATA=controls OUT=temp2;

      BY uniqueid;

 

   DATA controls; /*the dataset controls is updated so that the controls already matched are removed and can not be matched again*/

      MERGE temp1(IN=in1) temp2(IN=in2);

      BY uniqueid;

      IF in2 AND NOT in1;

 

%MEND sample;

 

DATA _NULL_;

   SET caseout;

   CALL EXECUTE ('%sample('||age||','||gender||','||count||','||ethnic||')'); /* You will also need to add your extra variable to this step.*/

RUN;

 

/*The next part is for testing if any of the cases have not received the wanted amount of controls*/

 

PROC FREQ NOPRINT DATA=matches;

   TABLES case_age*case_gender*case_ethnic/OUT=con_out; /* You will also need to add your extra variable to this step.*/

 

PROC SORT DATA = caseout(RENAME=

                (age=case_age gender=case_gender count=case_cnt ethnic=case_ethnic)); /* You will also need to add your extra variable to this step.*/

   BY case_age case_gender case_ethnic; /* You will also need to add your extra variable to this step.*/

 

PROC SORT DATA = con_out (RENAME= (count=con_cnt));

   BY case_age case_gender case_ethnic; /* You will also need to add your extra variable to this step.*/

 

DATA final (DROP=percent);

   MERGE caseout con_out;

   BY case_age case_gender case_ethnic; /* You will also need to add your extra variable to this step.*/

 

con_need = case_cnt*&ratio;

IF con_cnt = . THEN con_cnt = 0;

diff = con_cnt-con_need;

 

PROC PRINT DATA = final; /*creates a table showing what characterizes the cases who have not received enough matches and how many matches they are missing*/

   WHERE diff < 0;

   TITLE 'Insufficient Matches';

RUN;

 

 

Thank you all in advance. 

1 REPLY 1
PeterClemmensen
Tourmaline | Level 20

I can't figure out what you want to do here. 

 

Please specify your desired result given the test data set.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 1 reply
  • 297 views
  • 0 likes
  • 2 in conversation