anu1999
Obsidian | Level 7

Hi,

Can anyone please suggest a way to randomly split longitudinal data into training (60%) and validation (40%) sets?

In my case, I'd like to split a data set in which each individual has more than one observation, in such a way that if an individual is in one of the training/validation sets, then all of their observations are in that same set.

 

Example data (BMILONG) below:

I want to split the BMILONG data set generated in the second step.

 

DATA BMI;
  CALL STREAMINIT(12345);
  DO ID = 1 TO 100;
    GENDER = (MOD(ID,2)=0);  /* 1 = female, 0 = male */
    TREAT  = (ID>50);        /* 1 = treatment, 0 = placebo */
    BASELINE = ROUND(RAND('NORMAL',35,2),.1);
    IF GENDER=1 AND TREAT=0 THEN DO;
      GROUP = 'FEMALE - PLACEBO';
      MONTH3 = ROUND(BASELINE - .25 + RAND('NORMAL',0,1),.1);
      MONTH6 = ROUND(MONTH3 + .25 + RAND('NORMAL',0,1),.1);
      MONTH9 = ROUND(MONTH6 - .25 + RAND('NORMAL',0,1),.1);
      MONTH12= ROUND(MONTH9 + .25 + RAND('NORMAL',0,1),.1);
    END;
    IF GENDER=0 AND TREAT=0 THEN DO;
      GROUP = 'MALE - PLACEBO';
      MONTH3 = ROUND(BASELINE - 1 + RAND('NORMAL',0,1),.1);
      MONTH6 = ROUND(MONTH3 - 1 + RAND('NORMAL',0,1),.1);
      MONTH9 = ROUND(MONTH6 + 1 + RAND('NORMAL',0,1),.1);
      MONTH12= ROUND(MONTH9 + 1 + RAND('NORMAL',0,1),.1);
    END;
    IF GENDER=0 AND TREAT=1 THEN DO;
      GROUP = 'MALE - TREAT';
      MONTH3 = ROUND(BASELINE - 1.5 + RAND('NORMAL',0,1),.1);
      MONTH6 = ROUND(MONTH3 - 1.5 + RAND('NORMAL',0,1),.1);
      MONTH9 = ROUND(MONTH6 - 1.5 + RAND('NORMAL',0,1),.1);
      MONTH12= ROUND(MONTH9 - 1.5 + RAND('NORMAL',0,1),.1);
    END;
    IF GENDER=1 AND TREAT=1 THEN DO;
      GROUP = 'FEMALE - TREAT';
      MONTH3 = ROUND(BASELINE - 1 + RAND('NORMAL',0,1),.1);
      MONTH6 = ROUND(MONTH3 - 1 + RAND('NORMAL',0,1),.1);
      MONTH9 = ROUND(MONTH6 - 1 + RAND('NORMAL',0,1),.1);
      MONTH12= ROUND(MONTH9 - 1 + RAND('NORMAL',0,1),.1);
    END;
    OUTPUT;
  END;
RUN;

 

 

DATA BMILONG;
  SET BMI;
  /* Reshape wide to long: one row per ID per time point */
  TIMEPT=0;  BMI=BASELINE; OUTPUT;
  TIMEPT=3;  BMI=MONTH3;   OUTPUT;
  TIMEPT=6;  BMI=MONTH6;   OUTPUT;
  TIMEPT=9;  BMI=MONTH9;   OUTPUT;
  TIMEPT=12; BMI=MONTH12;  OUTPUT;
  DROP BASELINE MONTH:;
RUN;

2 REPLIES
Reeza
Super User

1. Create a list of unique IDs.

2. Use random number generation to assign each ID to a group.

3. Merge the group assignments back with the original data.
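
The three steps above can be sketched roughly as follows. This is only one possible approach, and it assumes each person counts once, so that 60% of the IDs (not 60% of the rows) go to training. For step 2 it swaps in PROC SURVEYSELECT in place of a hand-written random number, because that selects exactly 60% of the IDs. The data set names IDS, IDS_SPLIT, TRAIN, and VALID are placeholders chosen for illustration and do not come from the original post.

```sas
/* Step 1: one row per individual (IDS is a placeholder name) */
PROC SORT DATA=BMILONG(KEEP=ID) OUT=IDS NODUPKEY;
  BY ID;
RUN;

/* Step 2: randomly flag 60% of the IDs.                          */
/* OUTALL keeps every ID and adds SELECTED (1 = sampled, 0 = not) */
PROC SURVEYSELECT DATA=IDS OUT=IDS_SPLIT
                  METHOD=SRS SAMPRATE=0.6 SEED=12345 OUTALL;
RUN;

/* Step 3: merge the flag back so all rows of an ID stay together */
PROC SORT DATA=BMILONG;
  BY ID;
RUN;

DATA TRAIN VALID;
  MERGE BMILONG IDS_SPLIT(KEEP=ID SELECTED);
  BY ID;
  IF SELECTED=1 THEN OUTPUT TRAIN;  /* sampled IDs -> training     */
  ELSE OUTPUT VALID;                /* remaining IDs -> validation */
  DROP SELECTED;
RUN;
```

With the 100 IDs and five time points per ID in BMILONG, this should place 60 individuals (300 rows) in TRAIN and 40 individuals (200 rows) in VALID. If an approximate split is acceptable, assigning each ID with RAND('UNIFORM') < 0.6 in a DATA step would follow step 2 more literally.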

 

When you say 40/60 split, how does that factor in multiple records for each person? Does each person count once?

anu1999
Obsidian | Level 7
This is helpful.
Thanks Reeza
