Hi,
I have two datasets, one in long format and one in wide format.
data WORK.FIELD2;
infile datalines truncover;
input Study_ID:BEST12. Sex:BEST12.
FU_Day_1:BEST12. FU_Day_1_time:BEST12. FU_Day_2:BEST12. FU_Day_2_time:BEST12. FU_Day_3:BEST12.
FU_Day_3_time:BEST12. FU_Day_4:BEST12. FU_Day_4_time:BEST12. FU_Day_5:BEST12. FU_Day_5_time:BEST12. FU_Day_6:BEST12.
FU_Day_6_time:BEST12. FU_Day_7:BEST12. FU_Day_7_time:BEST12. FU_Day_8:BEST12. FU_Day_8_time:BEST12. FU_Day_9:BEST12.
FU_Day_9_time:BEST12. FU_Day_10:BEST12. FU_Day_10_time:BEST12. FU_Day_11:BEST12. FU_Day_11_time:BEST12. FU_Day_12:BEST12.
FU_Day_12_time:BEST12. FU_Day_13:BEST12. FU_Day_13_time:BEST12. FU_Day_14:BEST12. FU_Day_14_time:BEST12. FU_Day_15:BEST12.
FU_Day_15_time:BEST12. FU_Day_16:BEST12. FU_Day_16_time:BEST12. FU_Day_17:BEST12. FU_Day_17_time:BEST12. FU_Day_18:BEST12.
FU_Day_18_time:BEST12. FU_Day_19:BEST12. FU_Day_19_time:BEST12. FU_Day_20:BEST12. FU_Day_20_time:BEST12. FU_Day_21:BEST12.
FU_Day_21_time:BEST12. FU_Day_22:BEST12. FU_Day_22_time:BEST12. FU_Day_23:BEST12. FU_Day_23_time:BEST12. FU_Day_24:BEST12.
FU_Day_24_time:BEST12. FU_Day_25:BEST12. FU_Day_25_time:BEST12. FU_Day_26:BEST12. FU_Day_26_time:BEST12. FU_Day_27:BEST12.
FU_Day_27_time:BEST12. FU_Day_28:BEST12. FU_Day_28_time:BEST12. FU_Day_29:BEST12. FU_Day_29_time:BEST12. FU_Day_30:BEST12.
FU_Day_30_time:BEST12.;
format Study_ID BEST12. Sex BEST12. FU_Day_1 BEST12. FU_Day_1_time BEST12. FU_Day_2 BEST12. FU_Day_2_time BEST12.
FU_Day_3 BEST12. FU_Day_3_time BEST12. FU_Day_4 BEST12. FU_Day_4_time BEST12. FU_Day_5 BEST12.
FU_Day_5_time BEST12. FU_Day_6 BEST12. FU_Day_6_time BEST12. FU_Day_7 BEST12. FU_Day_7_time BEST12. FU_Day_8 BEST12.
FU_Day_8_time BEST12. FU_Day_9 BEST12. FU_Day_9_time BEST12. FU_Day_10 BEST12. FU_Day_10_time BEST12. FU_Day_11 BEST12.
FU_Day_11_time BEST12. FU_Day_12 BEST12. FU_Day_12_time BEST12. FU_Day_13 BEST12. FU_Day_13_time BEST12. FU_Day_14 BEST12.
FU_Day_14_time BEST12. FU_Day_15 BEST12. FU_Day_15_time BEST12. FU_Day_16 BEST12. FU_Day_16_time BEST12. FU_Day_17 BEST12.
FU_Day_17_time BEST12. FU_Day_18 BEST12. FU_Day_18_time BEST12. FU_Day_19 BEST12. FU_Day_19_time BEST12. FU_Day_20 BEST12.
FU_Day_20_time BEST12. FU_Day_21 BEST12. FU_Day_21_time BEST12. FU_Day_22 BEST12. FU_Day_22_time BEST12. FU_Day_23 BEST12.
FU_Day_23_time BEST12. FU_Day_24 BEST12. FU_Day_24_time BEST12. FU_Day_25 BEST12. FU_Day_25_time BEST12. FU_Day_26 BEST12.
FU_Day_26_time BEST12. FU_Day_27 BEST12. FU_Day_27_time BEST12. FU_Day_28 BEST12. FU_Day_28_time BEST12. FU_Day_29 BEST12.
FU_Day_29_time BEST12. FU_Day_30 BEST12. FU_Day_30_time BEST12.;
datalines;
1 0 0 0 0 0 0 0 2 2 2 2 8 10 9 12 3 3 5 3 8 6 0 0 0 0 2 3 0 0 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 0 2 3 2 3 0 0 12 13.5 12 13.5 12 13.5 12 13.5 2 4 0 0 1 2 7 4 2 3.5 0 0 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3 0 5 3 0 0 0 0 2 0.5 4 2 3 2 0 0 4 2 0 0 3 4 2 1 0 0 0 0 0 0 3 2.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 5 0 0 0 0 2 1 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5 1 0 0 3 5 3 5 7 6 0 0 0 0 0 0 0 0 2 2 0 0 3 1.5 0 0 0 0 2 2 2 2 2 1 0 0 2 3 5 3.5 . . . . . . . . . . . . . . . . . . . . . .
6 0 0 0 0 0 0 0 0 0 3 1 3 3 3 3 0 0 3 1.5 0 0 0 0 3 2 3 2.5 4 4 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7 1 1 0.5 18 4 3 2 0 0 4 3 0 0 2 1 0 0 2 2 0 0 0 0 2 1 0 0 0 0 4 3 2 2 8 3 3 2 0 0 . . . . . . . . . . . . . . . . . . . . . .
8 0 14 7 8 5 0 0 2 2 0 0 0 0 3 2 2 3 0 0 4 3 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . .
9 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
;
data work.fulong ;
infile datalines dsd dlm='|' truncover;
input Study_ID Sex day FU_day FU_time counter FU_EBAC FU_14day_DRink
max_Daycount FU_14Day_Mean_Weekly_Drink
FU_14Day_Drinks_Per_Drink_Day DrinkDays TotalDrinkDays
Total_FU_Ebac FU_14Day_Mean_eBAC Heavy VeryHeavy
Fu_14Day_Very_Heavy_Drink_Count Fu_14Day_Heavy_Drink_Count
Fu_14Day_Peak_Drinks Fu_14Day_Peak_BAC
;
format Study_ID best12. Sex best12. ;
informat Study_ID best32. Sex best32. ;
datalines4;
||1|||0|||1|||0|0|||0|0|0|0||
1|0|6|8|10|11|0.25632613908872|41|14|20.5|2.92857142857142|1|9|1.31587146282973|0.14620794031441|1|1|3|4|9|0.28816690647482
1|0|16|2|2|1|0.06421486810551|41|14|20.5|2.92857142857142|1|9|1.31587146282973|0.14620794031441|0|0|3|4|9|0.28816690647482
1|0|15|0|0|2|0|41|14|20.5|2.92857142857142|0|9|1.31587146282973|0.14620794031441|0|0|3|4|9|0.28816690647482
1|0|3|0|0|14|0|41|14|20.5|2.92857142857142|0|9|1.31587146282973|0.14620794031441|0|0|3|4|9|0.28816690647482
1|0|11|0|0|6|0|41|14|20.5|2.92857142857142|0|9|1.31587146282973|0.14620794031441|0|0|3|4|9|0.28816690647482
1|0|5|2|2|12|0.06421486810551|41|14|20.5|2.92857142857142|1|9|1.31587146282973|0.14620794031441|0|0|3|4|9|0.28816690647482
1|0|4|2|2|13|0.06421486810551|41|14|20.5|2.92857142857142|1|9|1.31587146282973|0.14620794031441|0|0|3|4|9|0.28816690647482
1|0|10|8|6|7|0.25739280575539|41|14|20.5|2.92857142857142|1|9|1.31587146282973|0.14620794031441|1|1|3|4|9|0.28816690647482
1|0|7|9|12|10|0.28816690647482|41|14|20.5|2.92857142857142|1|9|1.31587146282973|0.14620794031441|1|1|3|4|9|0.28816690647482
1|0|13|2|3|4|0.06394820143884|41|14|20.5|2.92857142857142|1|9|1.31587146282973|0.14620794031441|0|0|3|4|9|0.28816690647482
1|0|8|3|3|9|0.09632230215827|41|14|20.5|2.92857142857142|1|9|1.31587146282973|0.14620794031441|0|0|3|4|9|0.28816690647482
1|0|12|0|0|5|0|41|14|20.5|2.92857142857142|0|9|1.31587146282973|0.14620794031441|0|0|3|4|9|0.28816690647482
1|0|14|0|0|3|0|41|14|20.5|2.92857142857142|0|9|1.31587146282973|0.14620794031441|0|0|3|4|9|0.28816690647482
1|0|9|5|3|8|0.16107050359712|41|14|20.5|2.92857142857142|1|9|1.31587146282973|0.14620794031441|1|0|3|4|9|0.28816690647482
2|0|9|0|0|7|0|62|14|31|4.42857142857142|0|9|1.72495|0.19166111111111|0|0|4|5|12|0.3339
2|0|15|0|0|1|0|62|14|31|4.42857142857142|0|9|1.72495|0.19166111111111|0|0|4|5|12|0.3339
2|0|8|2|4|8|0.05518333333333|62|14|31|4.42857142857142|1|9|1.72495|0.19166111111111|0|0|4|5|12|0.3339
2|0|13|0|0|3|0|62|14|31|4.42857142857142|0|9|1.72495|0.19166111111111|0|0|4|5|12|0.3339
2|0|3|0|0|13|0|62|14|31|4.42857142857142|0|9|1.72495|0.19166111111111|0|0|4|5|12|0.3339
;;;;
Originally, I thought I wanted the data from the long format merged with that from the wide format. However, I would ultimately just like to bring the following variables into the wide dataset:
FU_14day_DRink max_Daycount FU_14Day_Mean_Weekly_Drink FU_14Day_Drinks_Per_Drink_Day
TotalDrinkDays Total_FU_Ebac FU_14Day_Mean_eBAC Fu_14Day_Very_Heavy_Drink_Count
Fu_14Day_Heavy_Drink_Count Fu_14Day_Peak_Drinks Fu_14Day_Peak_BAC
I was able to merge the wide onto the long, but merging the long onto the wide just stacked the two datasets. Any help would be appreciated.
@joebacon wrote:
I should mention that all those variables are a single number that is repeated. They are metrics. I only need that one number per study_id (observation). I was trying to find a way to just output their max value to another dataset without repeated measures, but got lost in SQL.
Show us, with a small list of variables, at least one example of finding the "max" value.
The data step has a MAX function that takes multiple variables.
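For example, a minimal sketch using the wide dataset posted above (the output name WORK.PEAKS is made up for illustration):
data work.peaks;
set work.field2;
/* Row-wise: MAX returns the largest nonmissing value across the listed variables within one observation */
peak3 = max(FU_Day_1, FU_Day_2, FU_Day_3);
run;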
So, my thought was that if I could just get the max, it would input only one value, but when I did this:
data trythis2;
set trythis;
MaxFU_14day_DRink= max(FU_14day_DRink);
keep study_id FU_14day_drink;
run;
It still outputs the study_id repeated and the max 14-day drink repeated. If I can get it down to only one value, I could merge the wide with the wide and easily output these variables. It feels like it should be really simple.
@joebacon wrote:
So, my thought was that if I could just get the max, it would input only one value, but when I did this:
data trythis2; set trythis; MaxFU_14day_DRink= max(FU_14day_DRink); keep study_id FU_14day_drink; run;
It still outputs the study_id repeated and the max 14-day drink repeated. If I can get it down to only one value, I could merge the wide with the wide and easily output these variables. It feels like it should be really simple.
Taking the MAX() of just one variable is not going to do anything. If FU_14day_DRink=10 then the max of it is also 10.
The data step MAX works on one observation (row of data).
If you want a MAX across records, you use a procedure like MEANS or SUMMARY.
Either sort by a grouping variable and use BY with that variable, or use a CLASS variable to get summaries by the values of another variable (or variables).
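For instance, a minimal sketch of the sort-then-BY route, borrowing the dataset and variable names from your earlier step (WORK.MAXES is a made-up output name):
proc sort data=work.trythis out=work.trythis_srt;
by study_id;
run;
proc means data=work.trythis_srt noprint;
by study_id; /* one output record per Study_ID */
var FU_14day_DRink;
output out=work.maxes (drop=_type_ _freq_) max=MaxFU_14day_DRink;
run;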
I think the first time you asked about this data, I (or someone) suggested cutting the scope WAY back to only about 3 of those related variables instead of 30. Make up similar data, but with enough to show the problem: 7 variables (the ID, the time, and the day variables) and maybe 10 records in the "wide" set. Make the "long" as needed to work with that. Ensure the values of the variables are such that we can fairly easily see where the results come from. (Hint: this means not having almost everything 1 and 0.)
Then you can manually calculate the needed result and describe it in better detail. I have to say that I actually have no idea what you are attempting to do.
As always, thank you!
I have data that looks like this:
data work.trythis2 ;
infile datalines dsd dlm='|' truncover;
input Study_ID FU_14day_DRink ;
format Study_ID best12. ;
informat Study_ID best32. ;
datalines4;
1|
1|41
1|41
1|41
1|41
1|41
1|41
1|41
1|41
2|28
2|28
2|28
2|28
2|28
2|28
2|28
;;;;
I already calculated the values, but I need data that looks like this:
data work.trythis2 ;
infile datalines dsd dlm='|' truncover;
input Study_ID FU_14day_DRink ;
format Study_ID best12. ;
informat Study_ID best32. ;
datalines4;
1|41
2|28
3|89
4|22
5|90
;;;;
If your data is sorted by ID, then this may reduce the "long" data as you want:
data work.want;
set work.trythis2;
by study_id;
if last.study_id; /* keep only the last record per Study_ID */
run;
This grabs the last record. I don't know if it was intentional or not, but the first ID has a blank value for the other variable.
Or a similar approach using MAX:
proc summary data=work.trythis2 nway;
class study_id;
var <list of variables>; /* replace with the actual variable names */
output out=work.want (drop=_type_ _freq_) max=;
run;
If all of the values are the same for a variable, any of MAX, MIN, or MEDIAN would work. There is an extremely small chance that the MEAN, which should also work, might acquire a small decimal difference when calculated, so I wouldn't recommend it. MIN, MAX, and MEDIAN should all give the same result from the way you describe the data.
Then, I was looking up different uses of the PUT and OUTPUT statements. I tried a variation with OUTPUT that I haven't gotten to run. It also has the problem that not every study_id has 14 repetitions:
data trythis2;
set trythis;
keep study_id FU_14day_drink;
reps+1;
if reps=1 then FU_14day_drink=0;
max(FU_14day_drink);
if reps=14 then do;
output;
stop;
end;
run;
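For reference, a hedged sketch of one way that idea can be made to run: a RETAINed running maximum with BY-group logic, so it does not depend on every study_id having exactly 14 repetitions (assumes TRYTHIS is sorted by study_id):
data trythis2;
set trythis;
by study_id;
retain MaxFU_14day_DRink;
if first.study_id then MaxFU_14day_DRink = FU_14day_DRink;
else MaxFU_14day_DRink = max(MaxFU_14day_DRink, FU_14day_DRink);
if last.study_id then output; /* one record per study_id instead of stopping at a fixed count */
keep study_id MaxFU_14day_DRink;
run;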
Are you asking to take a variable from the LONG format and merge it into the WIDE format?
If so, you are either going to convert the WIDE to long, or you need to specify which value from the LONG you want. Do you want just one of the observations for that STUDY_ID? Or some type of aggregate across all of the observations for that STUDY_ID?
So if all of the observations are exactly the same, then use FIRST. or LAST. flags to take either the first or the last observation in the group.
data want;
merge one many; /* ONE: one record per ID; MANY: multiple records per ID */
by study_id; /* both datasets must be sorted by study_id */
if first.study_id; /* keep only the first record per group */
run;
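Putting the pieces together for the original question, a sketch under the assumption that one summary row per Study_ID from the long set (WORK.FULONG) should be merged onto the wide set (WORK.FIELD2, which is already one row per ID and in ID order); the intermediate dataset names are made up:
proc sort data=work.fulong out=work.fulong_srt;
by study_id;
run;
data work.fulong_one;
set work.fulong_srt;
by study_id;
if last.study_id; /* one summary row per Study_ID */
keep study_id FU_14day_DRink max_Daycount FU_14Day_Mean_Weekly_Drink
FU_14Day_Drinks_Per_Drink_Day TotalDrinkDays Total_FU_Ebac
FU_14Day_Mean_eBAC Fu_14Day_Very_Heavy_Drink_Count Fu_14Day_Heavy_Drink_Count
Fu_14Day_Peak_Drinks Fu_14Day_Peak_BAC;
run;
data work.field2_plus;
merge work.field2 work.fulong_one; /* both one row per Study_ID, sorted by it */
by study_id;
run;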