slivingston
Calcite | Level 5

I have a dataset that has cases and controls. I need to create a new dataset that matches cases to controls at a ratio of at least 1:4, based on 2 variables. I am at a loss as to where to start.

Obs   Type      Rank   Gender   Score1   Score2
1     Control   1      2        10       5
2     case      2      1        1        1
3     control   1      1        1        3
...

So in the end I would like to see each case with 4 controls underneath it that match on Rank and Gender:

Obs   Type      Rank   Gender   Score1   Score2
1     case      1      2        17       5
2     control   1      2        71       1
3     control   1      2        71       3
4     control   1      2        8        16

Any help would be appreciated! Maybe a macro?

9 REPLIES
jakarman
Barite | Level 11

It all depends.

The situation you are starting from and what you want to achieve are not clear enough. It looks like you are setting up a dataset to be used for data mining.

If you have too few observations, you can boost them.

If you have many observations, you can sample them in a ratio.

If you have some requirements within series .... you can accommodate those.

What is your design/analysis?

---->-- ja karman --<-----
slivingston
Calcite | Level 5

My dataset has about 5000 records. I simplified it in my explanation for simplicity's sake; however, it has cases and controls and their responses to survey questions. I would like a 1:4 ratio of cases to controls, matched on Military Rank (5 categories) and Gender. The design is a retrospective matched case-control analysis.


slivingston
Calcite | Level 5

I have never used a macro program this complicated. How do I use one in my programming? Do I just copy it and fill in the appropriate variables?

Reeza
Super User

Read the documentation in the code. Preferably read the code as well :)

The docs include an example, and at the bottom there is an example with sample data and a sample call.
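As a general illustration of how a SAS macro is used (this is not the macro discussed in this thread): a macro is first defined or included, and then it has to be called with its parameters filled in. The sketch below uses a hypothetical macro `show_counts`, a hypothetical dataset `have`, and a hypothetical file path; only the variable names Type, Rank, and Gender come from the question.

```sas
/* Write the SAS code that a macro generates to the log */
options mprint;

/* A downloaded macro is usually brought in with %INCLUDE.      */
/* The path below is hypothetical; adjust it to your own file.  */
/* %include "C:\macros\some_macro.sas"; */

/* Defining a macro only stores it; nothing runs yet */
%macro show_counts(data=, by=);
    proc freq data=&data;
        tables &by / list;
    run;
%mend show_counts;

/* Calling the macro is what actually runs the generated code */
%show_counts(data=have, by=Type*Rank*Gender)
```

If a macro definition is submitted but never called, the log normally shows neither output nor errors, so it is worth checking that a call like the last line is actually being submitted.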

slivingston
Calcite | Level 5


I am using this one: http://www.mayo.edu/research/documents/gmatchsas/DOC-10027248 But I am getting no output and no errors in the log. I am new to macros, so I appreciate the patience and help!

Reeza
Super User

So no datasets get created and there are no errors in the log either? Post the log instead of the code.

Astounding
PROC Star

If you would like to learn how to program this yourself, instead of using a macro that you probably don't understand and that can do more than what you need, the steps are not so difficult.

1. Separate your observations into two data sets:  treatment and control.

2. From the treatment data set, run a PROC FREQ on the combination of the key variables, and send the results to an output data set.

3. For the control data set, assign a random number to each observation.

4. For the control data set, sort by an extra variable:  the key variables, plus the random number.

5. Merge the sorted control data set with the output data set from PROC FREQ (step 2).  Use the COUNT variable from PROC FREQ to determine which control observations to keep and which to delete.

6. Combine the selected control observations with the treatment observations.

I know that's an overview, and you may need help pursuing this.  Also note that this approach doesn't match up controls to specific treatment observations.  If you have two treatment observations with the same key values, it will pick the right number of controls to match up to both treatment observations combined.  You could randomly assign them to a particular treatment observation at that point, if needed.  Note that there is no guarantee that your control data set contains enough matching observations for every treatment observation.
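The six steps above could be sketched roughly as follows. This is only one possible reading of them, under several assumptions: the input dataset is called `have`, the group variable is `Type` with values "case" and "control" in any letter case, the key variables are `Rank` and `Gender`, and the names `ratio`, `cases`, `controls`, `case_counts`, `picked_controls`, `matched`, `_rand`, and `_n_in_group` are made up for illustration. It also assumes `have` has no existing variable named COUNT.

```sas
%let ratio = 4;   /* controls wanted per case (assumption) */

/* Step 1: split into cases (treatment) and controls */
data cases controls;
    set have;
    if upcase(Type) = 'CASE' then output cases;
    else if upcase(Type) = 'CONTROL' then output controls;
run;

/* Step 2: count cases for each Rank/Gender combination */
proc freq data=cases noprint;
    tables Rank*Gender / out=case_counts(keep=Rank Gender count);
run;

/* Step 3: give every control a random number (seed is arbitrary) */
data controls;
    set controls;
    if _n_ = 1 then call streaminit(20240101);
    _rand = rand('uniform');
run;

/* Step 4: sort controls by the key variables plus the random number */
proc sort data=controls;
    by Rank Gender _rand;
run;

/* Step 5: within each Rank/Gender group that has cases,            */
/* keep the first ratio * COUNT controls in random order            */
data picked_controls;
    merge controls(in=in_ctl) case_counts(in=in_case);
    by Rank Gender;
    if first.Gender then _n_in_group = 0;
    if in_ctl and in_case;
    _n_in_group + 1;
    if _n_in_group <= &ratio * count;
    drop _rand _n_in_group count;
run;

/* Step 6: stack the cases and the selected controls */
data matched;
    set cases picked_controls;
run;

proc sort data=matched;
    by Rank Gender;
run;
```

As noted above, a Rank/Gender group with too few controls simply ends up with fewer than 4 controls per case, so comparing the counts in `matched` against `case_counts` is a sensible check.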

Finally, I haven't used PROC SURVEYSELECT very much.  It's possible that it has the built-in capabilities to do this easily.
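For what it is worth, a stratified draw with PROC SURVEYSELECT might look like the sketch below. It is untested against the poster's data and makes the same assumptions as the question (variables `Type`, `Rank`, `Gender`); the dataset names `have`, `cases`, `controls`, `strata_sizes`, `ctl_pool`, `sizes`, and `picked_controls` are hypothetical. It relies on the SAMPSIZE= option accepting a dataset with a `_NSIZE_` variable per stratum and on the SELECTALL option, so check both against the documentation for your SAS/STAT release.

```sas
/* Split into cases and controls */
data cases controls;
    set have;
    if upcase(Type) = 'CASE' then output cases;
    else if upcase(Type) = 'CONTROL' then output controls;
run;

/* Number of cases per Rank/Gender stratum */
proc freq data=cases noprint;
    tables Rank*Gender / out=strata_sizes(keep=Rank Gender count);
run;

/* Requested number of controls per stratum: 4 per case */
data strata_sizes;
    set strata_sizes;
    _NSIZE_ = 4 * count;
    drop count;
run;

proc sort data=controls;
    by Rank Gender;
run;

/* Keep only controls in strata that contain at least one case */
data ctl_pool;
    merge controls(in=in_ctl) strata_sizes(in=in_case keep=Rank Gender);
    by Rank Gender;
    if in_ctl and in_case;
run;

/* Keep only sample sizes for strata that actually have controls */
data sizes;
    merge strata_sizes(in=in_case) ctl_pool(in=in_ctl keep=Rank Gender);
    by Rank Gender;
    if in_case and in_ctl and first.Gender;
run;

/* Simple random sample within each stratum; SELECTALL takes every */
/* control when a stratum has fewer controls than requested        */
proc surveyselect data=ctl_pool out=picked_controls
                  method=srs sampsize=sizes selectall seed=20240101;
    strata Rank Gender;
run;
```

Like the manual approach, this selects controls per Rank/Gender group rather than pairing them with individual cases.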

Good luck.

jakarman
Barite | Level 11

slivingston, since you may be new to SAS: a lot of studies on cases and controls have been presented.

A Google search (/#q=Matching+cases+and+controls++site%3Asas.com&start=20) may answer your own questions.

The results should give some direction. I agree with Astounding's remarks on the work.

---->-- ja karman --<-----
