Hi all,
I am attempting to create new variables from a dataset containing TimeLine Follow Back (TLFB) data. The data is in wide form and looks like this:
data WORK.FIELD2;
infile datalines dsd truncover;
input Study_ID:BEST12. Sex:BEST12. FU_Day_1:BEST12. FU_Day_1_time:BEST12. FU_Day_2:BEST12. FU_Day_2_time:BEST12. FU_Day_3:BEST12. FU_Day_3_time:BEST12. FU_Day_4:BEST12. FU_Day_4_time:BEST12. FU_Day_5:BEST12. FU_Day_5_time:BEST12. FU_Day_6:BEST12. FU_Day_6_
time:BEST12. FU_Day_7:BEST12. FU_Day_7_time:BEST12. FU_Day_8:BEST12. FU_Day_8_time:BEST12. FU_Day_9:BEST12. FU_Day_9_time:BEST12. FU_Day_10:BEST12. FU_Day_10_time:BEST12. FU_Day_11:BEST12. FU_Day_11_time:BEST12. FU_Day_12:BEST12. FU_Day_12_time:BEST12. FU_
Day_13:BEST12. FU_Day_13_time:BEST12. FU_Day_14:BEST12. FU_Day_14_time:BEST12. FU_Day_15:BEST12. FU_Day_15_time:BEST12. FU_Day_16:BEST12. FU_Day_16_time:BEST12. FU_Day_17:BEST12. FU_Day_17_time:BEST12. FU_Day_18:BEST12. FU_Day_18_time:BEST12. FU_Day_19:BES
T12. FU_Day_19_time:BEST12. FU_Day_20:BEST12. FU_Day_20_time:BEST12. FU_Day_21:BEST12. FU_Day_21_time:BEST12. FU_Day_22:BEST12. FU_Day_22_time:BEST12. FU_Day_23:BEST12. FU_Day_23_time:BEST12. FU_Day_24:BEST12. FU_Day_24_time:BEST12. FU_Day_25:BEST12. FU_Da
y_25_time:BEST12. FU_Day_26:BEST12. FU_Day_26_time:BEST12. FU_Day_27:BEST12. FU_Day_27_time:BEST12. FU_Day_28:BEST12. FU_Day_28_time:BEST12. FU_Day_29:BEST12. FU_Day_29_time:BEST12. FU_Day_30:BEST12. FU_Day_30_time:BEST12.;
format Study_ID BEST12. Sex BEST12. FU_Day_1 BEST12. FU_Day_1_time BEST12. FU_Day_2 BEST12. FU_Day_2_time BEST12. FU_Day_3 BEST12. FU_Day_3_time BEST12. FU_Day_4 BEST12. FU_Day_4_time BEST12. FU_Day_5 BEST12. FU_Day_5_time BEST12. FU_Day_6 BEST12. FU_Day_6
_time BEST12. FU_Day_7 BEST12. FU_Day_7_time BEST12. FU_Day_8 BEST12. FU_Day_8_time BEST12. FU_Day_9 BEST12. FU_Day_9_time BEST12. FU_Day_10 BEST12. FU_Day_10_time BEST12. FU_Day_11 BEST12. FU_Day_11_time BEST12. FU_Day_12 BEST12. FU_Day_12_time BEST12. FU
_Day_13 BEST12. FU_Day_13_time BEST12. FU_Day_14 BEST12. FU_Day_14_time BEST12. FU_Day_15 BEST12. FU_Day_15_time BEST12. FU_Day_16 BEST12. FU_Day_16_time BEST12. FU_Day_17 BEST12. FU_Day_17_time BEST12. FU_Day_18 BEST12. FU_Day_18_time BEST12. FU_Day_19 BE
ST12. FU_Day_19_time BEST12. FU_Day_20 BEST12. FU_Day_20_time BEST12. FU_Day_21 BEST12. FU_Day_21_time BEST12. FU_Day_22 BEST12. FU_Day_22_time BEST12. FU_Day_23 BEST12. FU_Day_23_time BEST12. FU_Day_24 BEST12. FU_Day_24_time BEST12. FU_Day_25 BEST12. FU_D
ay_25_time BEST12. FU_Day_26 BEST12. FU_Day_26_time BEST12. FU_Day_27 BEST12. FU_Day_27_time BEST12. FU_Day_28 BEST12. FU_Day_28_time BEST12. FU_Day_29 BEST12. FU_Day_29_time BEST12. FU_Day_30 BEST12. FU_Day_30_time BEST12.;
datalines;
1 0 0 0 0 0 0 0 2 2 2 2 8 10 9 12 3 3 5 3 8 6 0 0 0 0 2 3 0 0 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 0 2 3 2 3 0 0 12 13.5 12 13.5 12 13.5 12 13.5 2 4 0 0 1 2 7 4 2 3.5 0 0 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3 0 5 3 0 0 0 0 2 0.5 4 2 3 2 0 0 4 2 0 0 3 4 2 1 0 0 0 0 0 0 3 2.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 5 0 0 0 0 2 1 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5 1 0 0 3 5 3 5 7 6 0 0 0 0 0 0 0 0 2 2 0 0 3 1.5 0 0 0 0 2 2 2 2 2 1 0 0 2 3 5 3.5 . . . . . . . . . . . . . . . . . . . . . .
6 0 0 0 0 0 0 0 0 0 3 1 3 3 3 3 0 0 3 1.5 0 0 0 0 3 2 3 2.5 4 4 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7 1 1 0.5 18 4 3 2 0 0 4 3 0 0 2 1 0 0 2 2 0 0 0 0 2 1 0 0 0 0 4 3 2 2 8 3 3 2 0 0 . . . . . . . . . . . . . . . . . . . . . .
8 0 14 7 8 5 0 0 2 2 0 0 0 0 3 2 2 3 0 0 4 3 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . .
9 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
;;;;
I want to create the following variables on two different timelines. One for the full 30 days (which I have completed) and another for each participant's most recent complete 14 days.
The data is in wide format obviously but you can see that each participant has varying amounts of data (look at the difference between 2 and 4 for instance). I calculated a lot of the 30-day variables using arrays, but was wondering if there was a way to capture each participant's 14 days. Some will not have a full 14 days while others will have more than 14 usable days. However, I still would like to
Before I completely understood the data, I was trying to use the following code. However, I realize that I need to approach the issue differently which brings me to the question of how to find the last 14 days of usable data when it varies for each participant
data Steady.Master;
set Steady.Master;
IPS_14Day_Drink= sum(of IPS_day_17 - IPS_day_30);
FU_14Day_Drink= sum(FU_day_17, FU_day_18, FU_day_19, FU_day_20, FU_day_21, FU_day_22, FU_day_23,
FU_day_24, FU_day_25, FU_day_26, FU_day_27, FU_day_28, FU_day_29, FU_day_30);
run;
/*Counting how many days they drank*/
data STEADy.master(drop=i drop=n9);
set STEADY.Master;
n9=0;
array _n(*) IPS_day_17 - IPS_day_30;
do i=1 to dim(_n);
if _n(i) ge 1 then n9+1;
end;
IPS_14Day_Drink_Count=n9;
run;
data STEADy.master(drop=i drop=n9);
set STEADY.Master;
n9=0;
array _n(*) FU_day_17 FU_day_18 FU_day_19 FU_day_20 FU_day_21 FU_day_22 FU_day_23
FU_day_24 FU_day_25 FU_day_26 FU_day_27 FU_day_28 FU_day_29 FU_day_30;
do i=1 to dim(_n);
if _n(i) ge 1 then n9+1;
end;
FU_14Day_Drink_Count=n9;
run;
/*Creating new variables for mean drinks per week ((total drinks/14)*7)*/
data STEADY.master;
set STEADY.master;
IPS_14Day_Mean_Weekly_Drink = ((IPS_14Day_Drink/14)*7);
FU_14Day_Mean_Weekly_Drink = ((FU_14Day_Drink/14)*7);
IPS_14Day_Drinks_Per_Drink_Day = IPS_14Day_Drink/IPS_14Day_Drink_Count;
FU_14Day_Drinks_Per_Drink_Day = FU_14Day_Drink/FU_14Day_Drink_Count;
run;
I always appreciate any help with my code or materials to read. I have learned SAS almost entirely from this forum and really hold the regulars here in high regard for spending their time so graciously.
The first thing I would do is to rearrange the data to a long format. Your variable names are not so good, SAS-wise, but it can be solved using a two-dimensional array:
data long;
set field2;
array data [30,2] FU_:;
do day=1 to 30;
FU_day=data[day,1];
FU_time=data[day,2];
output;
end;
keep Study_ID day FU_day FU_time;
run;
Then sort by descending day, and get the last 14 days with data (or a dummy row if no data at all):
proc sort data=long;
by Study_ID descending day;
run;
data want;
set long;
by Study_ID;
if first.Study_ID then
counter=0;
if not missing(FU_day) then do;
counter+1;
if counter<=14 then
output;
end;
if last.Study_ID and counter=0 then
output; /* to get the subjects with no data at all */
run;
It should then be quite easy to get the means and peaks and whatever.
I'm not sure I fully understand, and you do not have complete data out through 30 days for any of the records in your sample data.
Nevertheless, you can find where the first dot appears on each row by using ARRAYs and the WHICHN function. I show only the array for FU_DAY. It's not easy when you also have variables named FU_DAY_1_TIME, because these can't be turned into an array as easily, without typing all 30 variable names. You probably want the variables named FU_DAY_TIME_1 etc. and then creating the array is easy, and I leave you to make those changes.
This code shows that the first dot for record one is on day 17, and so the last 14 variables without missing data are in days 3 through 16 (simple math), and from there you ought to be able to compute the statistics you want.
data WORK.FIELD2;
infile datalines;
input Study_ID:BEST12. Sex:BEST12. FU_Day_1:BEST12. FU_Day_1_time:BEST12. FU_Day_2:BEST12. FU_Day_2_time:BEST12. FU_Day_3:BEST12. FU_Day_3_time:BEST12. FU_Day_4:BEST12. FU_Day_4_time:BEST12. FU_Day_5:BEST12. FU_Day_5_time:BEST12. FU_Day_6:BEST12.
FU_Day_6_time:BEST12. FU_Day_7:BEST12. FU_Day_7_time:BEST12. FU_Day_8:BEST12. FU_Day_8_time:BEST12. FU_Day_9:BEST12. FU_Day_9_time:BEST12. FU_Day_10:BEST12. FU_Day_10_time:BEST12. FU_Day_11:BEST12. FU_Day_11_time:BEST12. FU_Day_12:BEST12. FU_Day_12_time:BEST12.
FU_Day_13:BEST12. FU_Day_13_time:BEST12. FU_Day_14:BEST12. FU_Day_14_time:BEST12. FU_Day_15:BEST12. FU_Day_15_time:BEST12. FU_Day_16:BEST12. FU_Day_16_time:BEST12. FU_Day_17:BEST12. FU_Day_17_time:BEST12. FU_Day_18:BEST12. FU_Day_18_time:BEST12.
FU_Day_19:BEST12. FU_Day_19_time:BEST12. FU_Day_20:BEST12. FU_Day_20_time:BEST12. FU_Day_21:BEST12. FU_Day_21_time:BEST12. FU_Day_22:BEST12. FU_Day_22_time:BEST12. FU_Day_23:BEST12. FU_Day_23_time:BEST12. FU_Day_24:BEST12. FU_Day_24_time:BEST12. FU_Day_25:BEST12.
FU_Day_25_time:BEST12. FU_Day_26:BEST12. FU_Day_26_time:BEST12. FU_Day_27:BEST12. FU_Day_27_time:BEST12. FU_Day_28:BEST12. FU_Day_28_time:BEST12. FU_Day_29:BEST12. FU_Day_29_time:BEST12. FU_Day_30:BEST12. FU_Day_30_time:BEST12.;
array fu_day[*] fu_day_1-fu_day_30;
do i=1 to dim(fu_day);
if missing(fu_day(i)) then fu_day(i)=-9999;
end;
first_dot_day = whichn(-9999,of fu_day[*]);
drop i;
datalines;
1 0 0 0 0 0 0 0 2 2 2 2 8 10 9 12 3 3 5 3 8 6 0 0 0 0 2 3 0 0 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 0 2 3 2 3 0 0 12 13.5 12 13.5 12 13.5 12 13.5 2 4 0 0 1 2 7 4 2 3.5 0 0 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3 0 5 3 0 0 0 0 2 0.5 4 2 3 2 0 0 4 2 0 0 3 4 2 1 0 0 0 0 0 0 3 2.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 5 0 0 0 0 2 1 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5 1 0 0 3 5 3 5 7 6 0 0 0 0 0 0 0 0 2 2 0 0 3 1.5 0 0 0 0 2 2 2 2 2 1 0 0 2 3 5 3.5 . . . . . . . . . . . . . . . . . . . . . .
6 0 0 0 0 0 0 0 0 0 3 1 3 3 3 3 0 0 3 1.5 0 0 0 0 3 2 3 2.5 4 4 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7 1 1 0.5 18 4 3 2 0 0 4 3 0 0 2 1 0 0 2 2 0 0 0 0 2 1 0 0 0 0 4 3 2 2 8 3 3 2 0 0 . . . . . . . . . . . . . . . . . . . . . .
8 0 14 7 8 5 0 0 2 2 0 0 0 0 3 2 2 3 0 0 4 3 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . .
9 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
;
Your solution works great if there are none missing before the first valid entry.
Please provide an example.
data WORK.FIELD2;
infile datalines truncover;
input Study_ID:BEST12. Sex:BEST12.
FU_Day_1:BEST12. FU_Day_1_time:BEST12. FU_Day_2:BEST12. FU_Day_2_time:BEST12. FU_Day_3:BEST12.
FU_Day_3_time:BEST12. FU_Day_4:BEST12. FU_Day_4_time:BEST12. FU_Day_5:BEST12. FU_Day_5_time:BEST12. FU_Day_6:BEST12.
FU_Day_6_time:BEST12. FU_Day_7:BEST12. FU_Day_7_time:BEST12. FU_Day_8:BEST12. FU_Day_8_time:BEST12. FU_Day_9:BEST12.
FU_Day_9_time:BEST12. FU_Day_10:BEST12. FU_Day_10_time:BEST12. FU_Day_11:BEST12. FU_Day_11_time:BEST12. FU_Day_12:BEST12.
FU_Day_12_time:BEST12. FU_Day_13:BEST12. FU_Day_13_time:BEST12. FU_Day_14:BEST12. FU_Day_14_time:BEST12. FU_Day_15:BEST12.
FU_Day_15_time:BEST12. FU_Day_16:BEST12. FU_Day_16_time:BEST12. FU_Day_17:BEST12. FU_Day_17_time:BEST12. FU_Day_18:BEST12.
FU_Day_18_time:BEST12. FU_Day_19:BEST12. FU_Day_19_time:BEST12. FU_Day_20:BEST12. FU_Day_20_time:BEST12. FU_Day_21:BEST12.
FU_Day_21_time:BEST12. FU_Day_22:BEST12. FU_Day_22_time:BEST12. FU_Day_23:BEST12. FU_Day_23_time:BEST12. FU_Day_24:BEST12.
FU_Day_24_time:BEST12. FU_Day_25:BEST12. FU_Day_25_time:BEST12. FU_Day_26:BEST12. FU_Day_26_time:BEST12. FU_Day_27:BEST12.
FU_Day_27_time:BEST12. FU_Day_28:BEST12. FU_Day_28_time:BEST12. FU_Day_29:BEST12. FU_Day_29_time:BEST12. FU_Day_30:BEST12.
FU_Day_30_time:BEST12.;
format Study_ID BEST12. Sex BEST12. FU_Day_1 BEST12. FU_Day_1_time BEST12. FU_Day_2 BEST12. FU_Day_2_time BEST12.
FU_Day_3 BEST12. FU_Day_3_time BEST12. FU_Day_4 BEST12. FU_Day_4_time BEST12. FU_Day_5 BEST12.
FU_Day_5_time BEST12. FU_Day_6 BEST12. FU_Day_6_time BEST12. FU_Day_7 BEST12. FU_Day_7_time BEST12. FU_Day_8 BEST12.
FU_Day_8_time BEST12. FU_Day_9 BEST12. FU_Day_9_time BEST12. FU_Day_10 BEST12. FU_Day_10_time BEST12. FU_Day_11 BEST12.
FU_Day_11_time BEST12. FU_Day_12 BEST12. FU_Day_12_time BEST12. FU_Day_13 BEST12. FU_Day_13_time BEST12. FU_Day_14 BEST12.
FU_Day_14_time BEST12. FU_Day_15 BEST12. FU_Day_15_time BEST12. FU_Day_16 BEST12. FU_Day_16_time BEST12. FU_Day_17 BEST12.
FU_Day_17_time BEST12. FU_Day_18 BEST12. FU_Day_18_time BEST12. FU_Day_19 BEST12. FU_Day_19_time BEST12. FU_Day_20 BEST12.
FU_Day_20_time BEST12. FU_Day_21 BEST12. FU_Day_21_time BEST12. FU_Day_22 BEST12. FU_Day_22_time BEST12. FU_Day_23 BEST12.
FU_Day_23_time BEST12. FU_Day_24 BEST12. FU_Day_24_time BEST12. FU_Day_25 BEST12. FU_Day_25_time BEST12. FU_Day_26 BEST12.
FU_Day_26_time BEST12. FU_Day_27 BEST12. FU_Day_27_time BEST12. FU_Day_28 BEST12. FU_Day_28_time BEST12. FU_Day_29 BEST12.
FU_Day_29_time BEST12. FU_Day_30 BEST12. FU_Day_30_time BEST12.;
datalines;
1 0 0 0 0 0 0 0 2 2 2 2 8 10 9 12 3 3 5 3 8 6 0 0 0 0 2 3 0 0 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 0 2 3 2 3 0 0 12 13.5 12 13.5 12 13.5 12 13.5 2 4 0 0 1 2 7 4 2 3.5 0 0 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3 0 5 3 0 0 0 0 2 0.5 4 2 3 2 0 0 4 2 0 0 3 4 2 1 0 0 0 0 0 0 3 2.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 5 0 0 0 0 2 1 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5 1 0 0 3 5 3 5 7 6 0 0 0 0 0 0 0 0 2 2 0 0 3 1.5 0 0 0 0 2 2 2 2 2 1 0 0 2 3 5 3.5 . . . . . . . . . . . . . . . . . . . . . .
6 0 0 0 0 0 0 0 0 0 3 1 3 3 3 3 0 0 3 1.5 0 0 0 0 3 2 3 2.5 4 4 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7 1 1 0.5 18 4 3 2 0 0 4 3 0 0 2 1 0 0 2 2 0 0 0 0 2 1 0 0 0 0 4 3 2 2 8 3 3 2 0 0 . . . . . . . . . . . . . . . . . . . . . .
8 0 14 7 8 5 0 0 2 2 0 0 0 0 3 2 2 3 0 0 4 3 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . .
48 0 0 0 0 0 4 2.5 0 0 3 4 0 0 0 0 0 0 2 0.5 0 0 0 0 0 0 0 0 1 2.5 0 0 0 0 0 0 0 0 0 0 10 4.5 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
83 0 . . 2 1 0 0 0 0 2 1 4 2 0 0 5 4 0 0 0 0 0 0 0 0 5 2.5 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
;
I have updated the data with two different observations. One is of someone with full data where they dont hit a "missing" but I would still need the last 14 days (days 17-30). The other is an example of someone who is missing data and then has usable data.
Thank you for all the help thus far.
I have removed all of your formats and informats, as they don't apply to this example.
data WORK.FIELD2;
infile datalines;
input Study_ID Sex FU_Day_1 FU_Day_1_time FU_Day_2 FU_Day_2_time FU_Day_3 FU_Day_3_time FU_Day_4 FU_Day_4_time FU_Day_5 FU_Day_5_time FU_Day_6
FU_Day_6_time FU_Day_7 FU_Day_7_time FU_Day_8 FU_Day_8_time FU_Day_9 FU_Day_9_time FU_Day_10 FU_Day_10_time FU_Day_11 FU_Day_11_time FU_Day_12 FU_Day_12_time
FU_Day_13 FU_Day_13_time FU_Day_14 FU_Day_14_time FU_Day_15 FU_Day_15_time FU_Day_16 FU_Day_16_time FU_Day_17 FU_Day_17_time FU_Day_18 FU_Day_18_time
FU_Day_19 FU_Day_19_time FU_Day_20 FU_Day_20_time FU_Day_21 FU_Day_21_time FU_Day_22 FU_Day_22_time FU_Day_23 FU_Day_23_time FU_Day_24 FU_Day_24_time FU_Day_25
FU_Day_25_time FU_Day_26 FU_Day_26_time FU_Day_27 FU_Day_27_time FU_Day_28 FU_Day_28_time FU_Day_29 FU_Day_29_time FU_Day_30 FU_Day_30_time;
array fu_day[*] fu_day_1-fu_day_30;
first_non_dot=.;
do i=1 to dim(fu_day);
if not missing(fu_day(i)) and missing(first_non_dot) then first_non_dot=i;
if missing(fu_day(i)) then fu_day(i)=-9999;
end;
first_dot_day = whichn(-9999,of fu_day[*]);
drop i;
datalines;
1 0 0 0 0 0 0 0 2 2 2 2 8 10 9 12 3 3 5 3 8 6 0 0 0 0 2 3 0 0 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 0 2 3 2 3 0 0 12 13.5 12 13.5 12 13.5 12 13.5 2 4 0 0 1 2 7 4 2 3.5 0 0 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3 0 5 3 0 0 0 0 2 0.5 4 2 3 2 0 0 4 2 0 0 3 4 2 1 0 0 0 0 0 0 3 2.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 5 0 0 0 0 2 1 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5 1 0 0 3 5 3 5 7 6 0 0 0 0 0 0 0 0 2 2 0 0 3 1.5 0 0 0 0 2 2 2 2 2 1 0 0 2 3 5 3.5 . . . . . . . . . . . . . . . . . . . . . .
6 0 0 0 0 0 0 0 0 0 3 1 3 3 3 3 0 0 3 1.5 0 0 0 0 3 2 3 2.5 4 4 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7 1 1 0.5 18 4 3 2 0 0 4 3 0 0 2 1 0 0 2 2 0 0 0 0 2 1 0 0 0 0 4 3 2 2 8 3 3 2 0 0 . . . . . . . . . . . . . . . . . . . . . .
8 0 14 7 8 5 0 0 2 2 0 0 0 0 3 2 2 3 0 0 4 3 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . .
9 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
48 0 0 0 0 0 4 2.5 0 0 3 4 0 0 0 0 0 0 2 0.5 0 0 0 0 0 0 0 0 1 2.5 0 0 0 0 0 0 0 0 0 0 10 4.5 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
83 0 . . 2 1 0 0 0 0 2 1 4 2 0 0 5 4 0 0 0 0 0 0 0 0 5 2.5 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
;
This gives you the location of the first non-dot and the last dot, and then you can select the last 14 values and work with them.
Using arrays:
data WORK.FIELD2;
infile datalines truncover;
input Study_ID:BEST12. Sex:BEST12.
FU_Day_1:BEST12. FU_Day_1_time:BEST12. FU_Day_2:BEST12. FU_Day_2_time:BEST12. FU_Day_3:BEST12.
FU_Day_3_time:BEST12. FU_Day_4:BEST12. FU_Day_4_time:BEST12. FU_Day_5:BEST12. FU_Day_5_time:BEST12. FU_Day_6:BEST12.
FU_Day_6_time:BEST12. FU_Day_7:BEST12. FU_Day_7_time:BEST12. FU_Day_8:BEST12. FU_Day_8_time:BEST12. FU_Day_9:BEST12.
FU_Day_9_time:BEST12. FU_Day_10:BEST12. FU_Day_10_time:BEST12. FU_Day_11:BEST12. FU_Day_11_time:BEST12. FU_Day_12:BEST12.
FU_Day_12_time:BEST12. FU_Day_13:BEST12. FU_Day_13_time:BEST12. FU_Day_14:BEST12. FU_Day_14_time:BEST12. FU_Day_15:BEST12.
FU_Day_15_time:BEST12. FU_Day_16:BEST12. FU_Day_16_time:BEST12. FU_Day_17:BEST12. FU_Day_17_time:BEST12. FU_Day_18:BEST12.
FU_Day_18_time:BEST12. FU_Day_19:BEST12. FU_Day_19_time:BEST12. FU_Day_20:BEST12. FU_Day_20_time:BEST12. FU_Day_21:BEST12.
FU_Day_21_time:BEST12. FU_Day_22:BEST12. FU_Day_22_time:BEST12. FU_Day_23:BEST12. FU_Day_23_time:BEST12. FU_Day_24:BEST12.
FU_Day_24_time:BEST12. FU_Day_25:BEST12. FU_Day_25_time:BEST12. FU_Day_26:BEST12. FU_Day_26_time:BEST12. FU_Day_27:BEST12.
FU_Day_27_time:BEST12. FU_Day_28:BEST12. FU_Day_28_time:BEST12. FU_Day_29:BEST12. FU_Day_29_time:BEST12. FU_Day_30:BEST12.
FU_Day_30_time:BEST12.;
format Study_ID BEST12. Sex BEST12. FU_Day_1 BEST12. FU_Day_1_time BEST12. FU_Day_2 BEST12. FU_Day_2_time BEST12.
FU_Day_3 BEST12. FU_Day_3_time BEST12. FU_Day_4 BEST12. FU_Day_4_time BEST12. FU_Day_5 BEST12.
FU_Day_5_time BEST12. FU_Day_6 BEST12. FU_Day_6_time BEST12. FU_Day_7 BEST12. FU_Day_7_time BEST12. FU_Day_8 BEST12.
FU_Day_8_time BEST12. FU_Day_9 BEST12. FU_Day_9_time BEST12. FU_Day_10 BEST12. FU_Day_10_time BEST12. FU_Day_11 BEST12.
FU_Day_11_time BEST12. FU_Day_12 BEST12. FU_Day_12_time BEST12. FU_Day_13 BEST12. FU_Day_13_time BEST12. FU_Day_14 BEST12.
FU_Day_14_time BEST12. FU_Day_15 BEST12. FU_Day_15_time BEST12. FU_Day_16 BEST12. FU_Day_16_time BEST12. FU_Day_17 BEST12.
FU_Day_17_time BEST12. FU_Day_18 BEST12. FU_Day_18_time BEST12. FU_Day_19 BEST12. FU_Day_19_time BEST12. FU_Day_20 BEST12.
FU_Day_20_time BEST12. FU_Day_21 BEST12. FU_Day_21_time BEST12. FU_Day_22 BEST12. FU_Day_22_time BEST12. FU_Day_23 BEST12.
FU_Day_23_time BEST12. FU_Day_24 BEST12. FU_Day_24_time BEST12. FU_Day_25 BEST12. FU_Day_25_time BEST12. FU_Day_26 BEST12.
FU_Day_26_time BEST12. FU_Day_27 BEST12. FU_Day_27_time BEST12. FU_Day_28 BEST12. FU_Day_28_time BEST12. FU_Day_29 BEST12.
FU_Day_29_time BEST12. FU_Day_30 BEST12. FU_Day_30_time BEST12.;
datalines;
1 0 0 0 0 0 0 0 2 2 2 2 8 10 9 12 3 3 5 3 8 6 0 0 0 0 2 3 0 0 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 0 2 3 2 3 0 0 12 13.5 12 13.5 12 13.5 12 13.5 2 4 0 0 1 2 7 4 2 3.5 0 0 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3 0 5 3 0 0 0 0 2 0.5 4 2 3 2 0 0 4 2 0 0 3 4 2 1 0 0 0 0 0 0 3 2.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 5 0 0 0 0 2 1 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5 1 0 0 3 5 3 5 7 6 0 0 0 0 0 0 0 0 2 2 0 0 3 1.5 0 0 0 0 2 2 2 2 2 1 0 0 2 3 5 3.5 . . . . . . . . . . . . . . . . . . . . . .
6 0 0 0 0 0 0 0 0 0 3 1 3 3 3 3 0 0 3 1.5 0 0 0 0 3 2 3 2.5 4 4 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7 1 1 0.5 18 4 3 2 0 0 4 3 0 0 2 1 0 0 2 2 0 0 0 0 2 1 0 0 0 0 4 3 2 2 8 3 3 2 0 0 . . . . . . . . . . . . . . . . . . . . . .
8 0 14 7 8 5 0 0 2 2 0 0 0 0 3 2 2 3 0 0 4 3 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . .
48 0 0 0 0 0 4 2.5 0 0 3 4 0 0 0 0 0 0 2 0.5 0 0 0 0 0 0 0 0 1 2.5 0 0 0 0 0 0 0 0 0 0 10 4.5 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
83 0 . . 2 1 0 0 0 0 2 1 4 2 0 0 5 4 0 0 0 0 0 0 0 0 5 2.5 0 0 9 8.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
;
data want_f2w;
set field2;
array fu fu_day_1 - fu_day_30;
array f2w_{14};
do i = dim(fu) to 1 by -1;
if missing(last) then do;
if i < dim(f2w_) then leave;
if not missing(fu{i}) then last = i;
end;
else do;
if missing(fu{i}) then call missing(last);
else if i <= last - dim(f2W_) + 1 then do;
do j = 1 to dim(f2w_);
f2w_{j} = fu{i + j -1};
end;
leave;
end;
end;
end;
keep study_id sex f2w:;
run;
proc print data=want_f2w noobs; run;
Study_ID Sex f2w_1 f2w_2 f2w_3 f2w_4 f2w_5 f2w_6 f2w_7 f2w_8 f2w_9 f2w_10 f2w_11 f2w_12 f2w_13 f2w_14 1 0 0 2 2 8 9 3 5 8 0 0 2 0 0 2 2 0 2 0 12 12 12 12 2 0 1 7 2 0 0 0 3 0 0 0 2 4 3 0 4 0 3 2 0 0 0 3 4 0 2 0 0 0 0 0 0 0 9 0 0 2 0 2 5 1 0 0 0 2 0 3 0 0 2 2 2 0 2 5 6 0 0 0 0 3 3 3 0 3 0 0 3 3 4 0 7 1 0 2 0 2 0 0 2 0 0 4 2 8 3 0 8 0 0 0 3 2 0 4 0 0 0 0 0 1 0 0 48 0 0 0 0 10 0 0 0 0 0 1 0 0 0 0 83 0 2 0 0 2 4 0 5 0 0 0 0 5 0 9
Note, I fixed obs study_id=83 in example dataset to make the sequence 14 long.
What's the difference between FU_Day_1 and FU_Day_1_time?
FU_Day_1 is a count variable of how many alcoholic drinks individuals had. FU_DAY_1_Time is how long it took them to drink those drinks. I use the time variable to calculate BAC but I am more concerned about the FU_Day_1. I included the FU_Day_1_Time variables as a way of saying that I have other variables that look similar that I will use in an array as well.
A side note. If it's not too late, and if you intend to work with SAS a lot on this, I suggest that you put the number part of your variable names at the end. That way you may use SAS variable list shortcuts such as FU_Day_Time_1 - FU_Day_Time_30 to refer to the whole series.
Extremely long lines of code are problematic with this forum.
Copy/ paste of your code will not run because informats are split across lines.
I am not going to attempt to fix it to fit a reasonable line length.
Do you really have lines of code longer than 250 characters or do we some artifact of different systems?
PS: You could use an INFORMAT statement and avoid having to type the same informat for multiple variables. And since you do not show any values anywhere near 12 characters you don't even need to define an informat as the default BEST.
And even if you do need to specify an informat there are shorthand ways to avoid repeating the informat that many times.
If it may help - cleaned up code:
data WORK.FIELD2;
infile datalines truncover;
input Study_ID:BEST12. Sex:BEST12.
FU_Day_1:BEST12. FU_Day_1_time:BEST12. FU_Day_2:BEST12. FU_Day_2_time:BEST12. FU_Day_3:BEST12.
FU_Day_3_time:BEST12. FU_Day_4:BEST12. FU_Day_4_time:BEST12. FU_Day_5:BEST12. FU_Day_5_time:BEST12. FU_Day_6:BEST12.
FU_Day_6_time:BEST12. FU_Day_7:BEST12. FU_Day_7_time:BEST12. FU_Day_8:BEST12. FU_Day_8_time:BEST12. FU_Day_9:BEST12.
FU_Day_9_time:BEST12. FU_Day_10:BEST12. FU_Day_10_time:BEST12. FU_Day_11:BEST12. FU_Day_11_time:BEST12. FU_Day_12:BEST12.
FU_Day_12_time:BEST12. FU_Day_13:BEST12. FU_Day_13_time:BEST12. FU_Day_14:BEST12. FU_Day_14_time:BEST12. FU_Day_15:BEST12.
FU_Day_15_time:BEST12. FU_Day_16:BEST12. FU_Day_16_time:BEST12. FU_Day_17:BEST12. FU_Day_17_time:BEST12. FU_Day_18:BEST12.
FU_Day_18_time:BEST12. FU_Day_19:BEST12. FU_Day_19_time:BEST12. FU_Day_20:BEST12. FU_Day_20_time:BEST12. FU_Day_21:BEST12.
FU_Day_21_time:BEST12. FU_Day_22:BEST12. FU_Day_22_time:BEST12. FU_Day_23:BEST12. FU_Day_23_time:BEST12. FU_Day_24:BEST12.
FU_Day_24_time:BEST12. FU_Day_25:BEST12. FU_Day_25_time:BEST12. FU_Day_26:BEST12. FU_Day_26_time:BEST12. FU_Day_27:BEST12.
FU_Day_27_time:BEST12. FU_Day_28:BEST12. FU_Day_28_time:BEST12. FU_Day_29:BEST12. FU_Day_29_time:BEST12. FU_Day_30:BEST12.
FU_Day_30_time:BEST12.;
format Study_ID BEST12. Sex BEST12. FU_Day_1 BEST12. FU_Day_1_time BEST12. FU_Day_2 BEST12. FU_Day_2_time BEST12.
FU_Day_3 BEST12. FU_Day_3_time BEST12. FU_Day_4 BEST12. FU_Day_4_time BEST12. FU_Day_5 BEST12.
FU_Day_5_time BEST12. FU_Day_6 BEST12. FU_Day_6_time BEST12. FU_Day_7 BEST12. FU_Day_7_time BEST12. FU_Day_8 BEST12.
FU_Day_8_time BEST12. FU_Day_9 BEST12. FU_Day_9_time BEST12. FU_Day_10 BEST12. FU_Day_10_time BEST12. FU_Day_11 BEST12.
FU_Day_11_time BEST12. FU_Day_12 BEST12. FU_Day_12_time BEST12. FU_Day_13 BEST12. FU_Day_13_time BEST12. FU_Day_14 BEST12.
FU_Day_14_time BEST12. FU_Day_15 BEST12. FU_Day_15_time BEST12. FU_Day_16 BEST12. FU_Day_16_time BEST12. FU_Day_17 BEST12.
FU_Day_17_time BEST12. FU_Day_18 BEST12. FU_Day_18_time BEST12. FU_Day_19 BEST12. FU_Day_19_time BEST12. FU_Day_20 BEST12.
FU_Day_20_time BEST12. FU_Day_21 BEST12. FU_Day_21_time BEST12. FU_Day_22 BEST12. FU_Day_22_time BEST12. FU_Day_23 BEST12.
FU_Day_23_time BEST12. FU_Day_24 BEST12. FU_Day_24_time BEST12. FU_Day_25 BEST12. FU_Day_25_time BEST12. FU_Day_26 BEST12.
FU_Day_26_time BEST12. FU_Day_27 BEST12. FU_Day_27_time BEST12. FU_Day_28 BEST12. FU_Day_28_time BEST12. FU_Day_29 BEST12.
FU_Day_29_time BEST12. FU_Day_30 BEST12. FU_Day_30_time BEST12.;
datalines;
1 0 0 0 0 0 0 0 2 2 2 2 8 10 9 12 3 3 5 3 8 6 0 0 0 0 2 3 0 0 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 0 2 3 2 3 0 0 12 13.5 12 13.5 12 13.5 12 13.5 2 4 0 0 1 2 7 4 2 3.5 0 0 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3 0 5 3 0 0 0 0 2 0.5 4 2 3 2 0 0 4 2 0 0 3 4 2 1 0 0 0 0 0 0 3 2.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 5 0 0 0 0 2 1 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5 1 0 0 3 5 3 5 7 6 0 0 0 0 0 0 0 0 2 2 0 0 3 1.5 0 0 0 0 2 2 2 2 2 1 0 0 2 3 5 3.5 . . . . . . . . . . . . . . . . . . . . . .
6 0 0 0 0 0 0 0 0 0 3 1 3 3 3 3 0 0 3 1.5 0 0 0 0 3 2 3 2.5 4 4 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7 1 1 0.5 18 4 3 2 0 0 4 3 0 0 2 1 0 0 2 2 0 0 0 0 2 1 0 0 0 0 4 3 2 2 8 3 3 2 0 0 . . . . . . . . . . . . . . . . . . . . . .
8 0 14 7 8 5 0 0 2 2 0 0 0 0 3 2 2 3 0 0 4 3 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . .
9 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
;
The first thing I would do is to rearrange the data to a long format. Your variable names are not so good, SAS-wise, but it can be solved using a two-dimensional array:
data long;
set field2;
array data [30,2] FU_:;
do day=1 to 30;
FU_day=data[day,1];
FU_time=data[day,2];
output;
end;
keep Study_ID day FU_day FU_time;
run;
Then sort by descending day, and get the last 14 days with data (or a dummy row if no data at all):
proc sort data=long;
by Study_ID descending day;
run;
data want;
set long;
by Study_ID;
if first.Study_ID then
counter=0;
if not missing(FU_day) then do;
counter+1;
if counter<=14 then
output;
end;
if last.Study_ID and counter=0 then
output; /* to get the subjects with no data at all */
run;
It should then be quite easy to get the means and peaks and whatever.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.