Hello, I am trying to do some survival analysis and have a dataset that looks like this:
As you can see, the first 6 lines are from the same person. fu_6m to fu_5 are the follow-up statuses from 6 months to 5 years: 0 is dead and 1 is alive.
I need help getting one unique observation per person, with a variable for follow-up status at 6 months, 1 year, 2 years, ..., 5 years.
Since you don't have a unique record for each person (you have six different records), what do you want to see as the output for each person?
I would like to combine the six records and see for each person something like this:
pers_id px_id fu_6m fu_1 fu_2 fu_3 fu_4 fu_5
123 123 1 1 1 0 0 0
124 124 1 1 1 1 0 0
Where 123 has died at follow up at 3 years and 124 has died at follow up at 4 years
@Kashvig wrote:
I would like to combine the six records and see for each person something like this:
pers_id px_id fu_6m fu_1 fu_2 fu_3 fu_4 fu_5
123 123 1 1 1 0 0 0
124 124 1 1 1 1 0 0
Where 123 has died at follow up at 3 years and 124 has died at follow up at 4 years
I would like you to show the desired output from the input data set, not the desired output from two individuals who were not in the input data set.
If I am understanding your question correctly, I would want the output data for lines 1-6 to look like this:
pers_id px_id fu_6m fu_1 fu_2 fu_3 fu_4 fu_5
4736884 715723 1 1 1 1 1 1
So are you asking for the sum of the six rows for each patient?
I think so, yes
proc summary data=have nway;
class pers_id px_id;
var fu_6m--fu_5;
output out=want sum=;
run;
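For anyone wanting to try this on a small example, here is a minimal sketch. The toy data are hypothetical: they assume each of the six rows carries only one follow-up status and leaves the others missing, which may not match the layout of the original data (it is not shown in the thread). The sketch uses MAX= instead of SUM= as one possible variant, because the result then stays 0/1 even if a status happens to be repeated on several rows; with the layout assumed here, SUM= and MAX= give the same answer. The data set name want_max is made up for illustration.

```sas
/* Hypothetical toy data: one row per follow-up visit, with only that
   visit's status filled in and the other fu_ variables missing.
   This layout is an assumption, not taken from the original post. */
data have;
    input pers_id px_id fu_6m fu_1 fu_2 fu_3 fu_4 fu_5;
    datalines;
4736884 715723 1 . . . . .
4736884 715723 . 1 . . . .
4736884 715723 . . 1 . . .
4736884 715723 . . . 1 . .
4736884 715723 . . . . 1 .
4736884 715723 . . . . . 1
;
run;

/* NWAY keeps only the pers_id*px_id combinations.
   MAX= takes the largest nonmissing value of each variable per person,
   so the output stays 0/1 even when a status appears on several rows. */
proc summary data=have nway;
    class pers_id px_id;
    var fu_6m--fu_5;
    output out=want_max (drop=_type_ _freq_) max=;
run;

proc print data=want_max noobs;
run;
```

Note that fu_6m--fu_5 is a positional variable list, so it relies on the six status variables being stored next to each other in the data set, in that order.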
Thank you this worked!
See the example titled "Comparison of the Marginal and Random Effect Models for Binary Data" in the PROC GEE documentation in the SAS/STAT User's Guide. It shows how the data can be rearranged so that there is one observation per measurement. The repeated binary measures can then be modeled with either a Generalized Estimating Equations model or a random effects model.
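As a rough illustration of that kind of rearrangement (not the code from the documentation example), the sketch below turns a wide data set with one row per person into one row per person per follow-up time. It assumes the wide data set is named want, as in the PROC SUMMARY step in this thread; the output data set name long and the variable names time and alive are made up for illustration.

```sas
/* Assumes WANT has one row per person with the six status variables.
   LONG, TIME and ALIVE are hypothetical names chosen for this sketch. */
data long;
    set want;
    array fu{6} fu_6m fu_1 fu_2 fu_3 fu_4 fu_5;
    /* follow-up times in years, in the same order as the fu array */
    array yrs{6} _temporary_ (0.5 1 2 3 4 5);
    do i = 1 to 6;
        time  = yrs{i};   /* follow-up time in years */
        alive = fu{i};    /* 1 = alive, 0 = dead */
        output;           /* write one observation per measurement */
    end;
    keep pers_id px_id time alive;
run;
```

Each person then contributes six observations, which is the shape a repeated-measures model expects, with pers_id identifying the subject.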