Hi,
Can anyone suggest a way to randomly split longitudinal data into training (60%) and validation (40%) sets?
In my case, each individual has more than one observation, and I'd like to split the data set so that if an individual is in one of the training/validation sets, all of their observations are in that same set.
Example data below.
I want to split the BMILONG data set generated in the second step.
DATA BMI;
    CALL STREAMINIT(12345);
    DO ID = 1 TO 100;
        GENDER = (MOD(ID,2)=0);   /* 1 = female, 0 = male */
        TREAT  = (ID>50);         /* 1 = treatment, 0 = placebo */
        BASELINE = ROUND(RAND('NORMAL',35,2),.1);
        IF GENDER=1 AND TREAT=0 THEN DO;
            GROUP = 'FEMALE - PLACEBO';
            MONTH3  = ROUND(BASELINE - .25 + RAND('NORMAL',0,1),.1);
            MONTH6  = ROUND(MONTH3 + .25 + RAND('NORMAL',0,1),.1);
            MONTH9  = ROUND(MONTH6 - .25 + RAND('NORMAL',0,1),.1);
            MONTH12 = ROUND(MONTH9 + .25 + RAND('NORMAL',0,1),.1);
        END;
        IF GENDER=0 AND TREAT=0 THEN DO;
            GROUP = 'MALE - PLACEBO';
            MONTH3  = ROUND(BASELINE - 1 + RAND('NORMAL',0,1),.1);
            MONTH6  = ROUND(MONTH3 - 1 + RAND('NORMAL',0,1),.1);
            MONTH9  = ROUND(MONTH6 + 1 + RAND('NORMAL',0,1),.1);
            MONTH12 = ROUND(MONTH9 + 1 + RAND('NORMAL',0,1),.1);
        END;
        IF GENDER=0 AND TREAT=1 THEN DO;
            GROUP = 'MALE - TREAT';
            MONTH3  = ROUND(BASELINE - 1.5 + RAND('NORMAL',0,1),.1);
            MONTH6  = ROUND(MONTH3 - 1.5 + RAND('NORMAL',0,1),.1);
            MONTH9  = ROUND(MONTH6 - 1.5 + RAND('NORMAL',0,1),.1);
            MONTH12 = ROUND(MONTH9 - 1.5 + RAND('NORMAL',0,1),.1);
        END;
        IF GENDER=1 AND TREAT=1 THEN DO;
            GROUP = 'FEMALE - TREAT';
            MONTH3  = ROUND(BASELINE - 1 + RAND('NORMAL',0,1),.1);
            MONTH6  = ROUND(MONTH3 - 1 + RAND('NORMAL',0,1),.1);
            MONTH9  = ROUND(MONTH6 - 1 + RAND('NORMAL',0,1),.1);
            MONTH12 = ROUND(MONTH9 - 1 + RAND('NORMAL',0,1),.1);
        END;
        OUTPUT;
    END;
RUN;
/* Reshape wide to long: one row per individual per time point */
DATA BMILONG;
    SET BMI;
    TIMEPT=0;  BMI=BASELINE; OUTPUT;
    TIMEPT=3;  BMI=MONTH3;   OUTPUT;
    TIMEPT=6;  BMI=MONTH6;   OUTPUT;
    TIMEPT=9;  BMI=MONTH9;   OUTPUT;
    TIMEPT=12; BMI=MONTH12;  OUTPUT;
    DROP BASELINE MONTH:;
RUN;
1. Create a list of unique IDs.
2. Use random number generation to assign each ID to a group.
3. Merge the group assignments back with the original data.
When you say a 40/60 split, how does that account for the multiple records per person? Does each person count once?
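The three steps could be sketched roughly as follows. This is only one possible way to do it, and it assumes that each person counts once, so that 60% of the IDs (not 60% of the rows) go to training. The data set names IDS, IDS_RAND, ID_SPLIT, TRAIN and VALID, the variables U and SET_NAME, and the seed 2024 are made up for illustration.

```sas
/* Step 1: one row per individual */
PROC SORT DATA=BMILONG(KEEP=ID) OUT=IDS NODUPKEY;
    BY ID;
RUN;

/* Step 2: attach a uniform random number to each ID ... */
DATA IDS_RAND;
    CALL STREAMINIT(2024);      /* arbitrary seed, only for reproducibility */
    SET IDS;
    U = RAND('UNIFORM');
RUN;

/* ... put the IDs in random order ... */
PROC SORT DATA=IDS_RAND;
    BY U;
RUN;

/* ... and send the first 60% of the shuffled IDs to training */
DATA ID_SPLIT;
    SET IDS_RAND NOBS=N_IDS;
    LENGTH SET_NAME $5;
    IF _N_ <= ROUND(0.6 * N_IDS) THEN SET_NAME = 'TRAIN';
    ELSE SET_NAME = 'VALID';
    DROP U;
RUN;

/* Step 3: merge the assignment back onto every observation */
PROC SORT DATA=ID_SPLIT;
    BY ID;
RUN;

PROC SORT DATA=BMILONG;
    BY ID TIMEPT;
RUN;

DATA TRAIN VALID;
    MERGE BMILONG ID_SPLIT;
    BY ID;
    IF SET_NAME = 'TRAIN' THEN OUTPUT TRAIN;
    ELSE OUTPUT VALID;
RUN;
```

Because the assignment is made once per ID and then merged by ID, all of a person's rows end up in the same data set. With the example data (100 IDs, 5 rows each) this should give 60 IDs in TRAIN and 40 in VALID. If individuals had unequal numbers of records, the row-level split would only be approximately 60/40.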