swillhoite
Calcite | Level 5

Hi everyone. I'm working on some figures for my client and this is part of the instructions in the spec:

 

"If there are ≥150 patients with data for a given figure, please randomly and equally distribute patients into multiple graphs so that there are <150 patients in each graph. For example, if there are 160 patients, create 2 graphs with 80 patients each. If there are 300 patients with data, create 3 graphs with 100 patients each."

 

I'm not even sure where to start with this. Any ideas?

3 REPLIES
ballardw
Super User

Some questions that you have to answer.

 

How do we know how many "patients" are intended for any one graph (before splitting)? Do you have a variable that indicates that? Or is this really "I have X observations in total and need to split them for graphing based on X"?

 

The content and structure of your data set may be quite important if this involves pre-identified "graphs" to which certain groups of patients are already assigned.

 

When it comes to random selection then the procedure is almost certain to be Proc SurveySelect. But as I say, the content of your current data and how it is to be set up for selection is important.

A basic approach, when you know the number of groups that you want, is to use the GROUPS= option.

This is a brief example that you can run using a data set that should be included in your installation:

Proc surveyselect data=sashelp.class out=work.grouped groups=3;
run;

You can look at the output data set, Work.Grouped, and see that a variable Groupid has been added. It will have nearly equal numbers of observations assigned to each group.

When graphing this data, you would sort the data by the GroupID variable and use BY GroupID in Proc Sgplot (or whichever procedure you intend) to create separate plots for each group. Or use GroupID as a PANELBY variable in Proc Sgpanel.
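As a rough sketch of the sort-then-BY approach described above, using the same sashelp.class data: the SEED= value and the choice of Height and Weight for the scatter plot are illustrative assumptions, not part of the original example.

```sas
/* Randomly assign the rows of sashelp.class to 3 groups.       */
/* seed=12345 is an arbitrary value chosen so reruns match.     */
proc surveyselect data=sashelp.class out=work.grouped
                  groups=3 seed=12345 noprint;
run;

/* BY-group processing requires the data sorted by the BY variable */
proc sort data=work.grouped;
  by groupid;
run;

/* One separate plot per value of GroupID */
proc sgplot data=work.grouped;
  by groupid;
  scatter x=height y=weight;
run;

/* Alternative: one panel per group in a single graph */
proc sgpanel data=work.grouped;
  panelby groupid;
  scatter x=height y=weight;
run;
```

The BY version produces one graph per group, which is closer to "multiple graphs" in the spec; the PANELBY version puts all groups on one page.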

 

 

swillhoite
Calcite | Level 5
Thank you. We came up with the following solution:
Start with this:
Total N (per param) = TOTN
Number of groups needed (floor of TOTN/150, plus 1, so that every group stays under 150) = TOTGRP

Then use proc surveyselect to randomly assign subjects to group:

proc surveyselect data=temp(where=(paramn eq 1 and ablfn eq 1)) out=grp1(keep=paramn usubjid groupid) groups=&totgrp1 seed=12345 noprint;
run;
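One way the group count might be derived before the surveyselect step is sketched below. It assumes the subset of temp has one record per usubjid (the surveyselect call above groups observations, not subjects), and it reuses the macro variable name totgrp1 from that call. Note that a plain ceiling of TOTN/150 gives 2 groups of 150 when TOTN is 300, whereas the spec's own example asks for 3 groups of 100, so this sketch uses floor(TOTN/150) + 1 instead.

```sas
/* Count subjects for one parameter and derive the number of groups. */
/* floor(n/150) + 1 keeps every group strictly under 150 patients.   */
proc sql noprint;
  select floor(count(distinct usubjid) / 150) + 1
    into :totgrp1 trimmed
    from temp
    where paramn eq 1 and ablfn eq 1;
quit;

/* Write the result to the log for checking */
%put NOTE: totgrp1=&totgrp1;
```

When the result is 1 there are fewer than 150 patients and no split is needed, so the surveyselect step could be skipped for that parameter.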
PaigeMiller
Diamond | Level 26

What happens if N cannot be split into exactly equal groups? If N is a prime number (and sometimes even if it is not prime), you cannot get equal numbers in each group. N=173 (a prime number) can be split into groups of 86 and 87, or 90 and 83, or ...

 

What to do then?

 

What if N=400, is that 4 groups of 100, or 5 groups of 80, or 3 groups of 133, 133 and 134?
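To make the question concrete, here is a small sketch of what one candidate rule would give: use the fewest groups that keep every group under 150, and let the sizes differ by at most one. This rule is an assumption for illustration only, not something the spec states.

```sas
/* Number of groups and smallest/largest group size under the assumed rule */
data _null_;
  do n = 160, 173, 300, 400;
    ngroups  = floor(n / 150) + 1;   /* fewest groups with every group < 150 */
    min_size = floor(n / ngroups);   /* size of the smallest group           */
    max_size = ceil(n / ngroups);    /* size of the largest group            */
    put n= ngroups= min_size= max_size=;
  end;
run;
```

Under this rule N=173 gives 2 groups of 86 and 87, and N=400 gives 3 groups of 133, 133 and 134. Whether that is acceptable, or whether 4 groups of 100 is preferred, is still a question for the client.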

 

@ballardw also raises some good points, and all of these are things you need to think about -- and discuss with your client to get his/her agreement, long before you start writing SAS code.

--
Paige Miller
