BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
joebacon
Pyrite | Level 9

Hi all,

 

I am attempting to create new variables from a dataset containing TimeLine Follow Back (TLFB) data. The data is in wide form and looks like this:

 

 

data WORK.FIELD2;
  infile datalines dsd truncover;

input Study_ID:BEST12. Sex:BEST12. FU_Day_1:BEST12. FU_Day_1_time:BEST12. FU_Day_2:BEST12. FU_Day_2_time:BEST12. FU_Day_3:BEST12. FU_Day_3_time:BEST12. FU_Day_4:BEST12. FU_Day_4_time:BEST12. FU_Day_5:BEST12. FU_Day_5_time:BEST12. FU_Day_6:BEST12. FU_Day_6_
time:BEST12. FU_Day_7:BEST12. FU_Day_7_time:BEST12. FU_Day_8:BEST12. FU_Day_8_time:BEST12. FU_Day_9:BEST12. FU_Day_9_time:BEST12. FU_Day_10:BEST12. FU_Day_10_time:BEST12. FU_Day_11:BEST12. FU_Day_11_time:BEST12. FU_Day_12:BEST12. FU_Day_12_time:BEST12. FU_
Day_13:BEST12. FU_Day_13_time:BEST12. FU_Day_14:BEST12. FU_Day_14_time:BEST12. FU_Day_15:BEST12. FU_Day_15_time:BEST12. FU_Day_16:BEST12. FU_Day_16_time:BEST12. FU_Day_17:BEST12. FU_Day_17_time:BEST12. FU_Day_18:BEST12. FU_Day_18_time:BEST12. FU_Day_19:BES
T12. FU_Day_19_time:BEST12. FU_Day_20:BEST12. FU_Day_20_time:BEST12. FU_Day_21:BEST12. FU_Day_21_time:BEST12. FU_Day_22:BEST12. FU_Day_22_time:BEST12. FU_Day_23:BEST12. FU_Day_23_time:BEST12. FU_Day_24:BEST12. FU_Day_24_time:BEST12. FU_Day_25:BEST12. FU_Da
y_25_time:BEST12. FU_Day_26:BEST12. FU_Day_26_time:BEST12. FU_Day_27:BEST12. FU_Day_27_time:BEST12. FU_Day_28:BEST12. FU_Day_28_time:BEST12. FU_Day_29:BEST12. FU_Day_29_time:BEST12. FU_Day_30:BEST12. FU_Day_30_time:BEST12.;

format Study_ID BEST12. Sex BEST12. FU_Day_1 BEST12. FU_Day_1_time BEST12. FU_Day_2 BEST12. FU_Day_2_time BEST12. FU_Day_3 BEST12. FU_Day_3_time BEST12. FU_Day_4 BEST12. FU_Day_4_time BEST12. FU_Day_5 BEST12. FU_Day_5_time BEST12. FU_Day_6 BEST12. FU_Day_6
_time BEST12. FU_Day_7 BEST12. FU_Day_7_time BEST12. FU_Day_8 BEST12. FU_Day_8_time BEST12. FU_Day_9 BEST12. FU_Day_9_time BEST12. FU_Day_10 BEST12. FU_Day_10_time BEST12. FU_Day_11 BEST12. FU_Day_11_time BEST12. FU_Day_12 BEST12. FU_Day_12_time BEST12. FU
_Day_13 BEST12. FU_Day_13_time BEST12. FU_Day_14 BEST12. FU_Day_14_time BEST12. FU_Day_15 BEST12. FU_Day_15_time BEST12. FU_Day_16 BEST12. FU_Day_16_time BEST12. FU_Day_17 BEST12. FU_Day_17_time BEST12. FU_Day_18 BEST12. FU_Day_18_time BEST12. FU_Day_19 BE
ST12. FU_Day_19_time BEST12. FU_Day_20 BEST12. FU_Day_20_time BEST12. FU_Day_21 BEST12. FU_Day_21_time BEST12. FU_Day_22 BEST12. FU_Day_22_time BEST12. FU_Day_23 BEST12. FU_Day_23_time BEST12. FU_Day_24 BEST12. FU_Day_24_time BEST12. FU_Day_25 BEST12. FU_D
ay_25_time BEST12. FU_Day_26 BEST12. FU_Day_26_time BEST12. FU_Day_27 BEST12. FU_Day_27_time BEST12. FU_Day_28 BEST12. FU_Day_28_time BEST12. FU_Day_29 BEST12. FU_Day_29_time BEST12. FU_Day_30 BEST12. FU_Day_30_time BEST12.;
datalines;
1 0 0 0 0 0 0 0 2 2 2 2 8 10 9 12 3 3 5 3 8 6 0 0 0 0 2 3 0 0 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 0 2 3 2 3 0 0 12 13.5 12 13.5 12 13.5 12 13.5 2 4 0 0 1 2 7 4 2 3.5 0 0 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3 0 5 3 0 0 0 0 2 0.5 4 2 3 2 0 0 4 2 0 0 3 4 2 1 0 0 0 0 0 0 3 2.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 5 0 0 0 0 2 1 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5 1 0 0 3 5 3 5 7 6 0 0 0 0 0 0 0 0 2 2 0 0 3 1.5 0 0 0 0 2 2 2 2 2 1 0 0 2 3 5 3.5 . . . . . . . . . . . . . . . . . . . . . .
6 0 0 0 0 0 0 0 0 0 3 1 3 3 3 3 0 0 3 1.5 0 0 0 0 3 2 3 2.5 4 4 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7 1 1 0.5 18 4 3 2 0 0 4 3 0 0 2 1 0 0 2 2 0 0 0 0 2 1 0 0 0 0 4 3 2 2 8 3 3 2 0 0 . . . . . . . . . . . . . . . . . . . . . .
8 0 14 7 8 5 0 0 2 2 0 0 0 0 3 2 2 3 0 0 4 3 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . .
9 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
;;;;

I want to create the following variables on two different timelines. One for the full 30 days (which I have completed) and another for each participant's most recent complete 14 days. 

 

  • Mean Drinks per week ((total number of drinks/14) x 7)
  • Frequency of any drinking days
  • Mean Drinks per drinking day (total number of drinks/frequency of any drinking days)
  • Mean Estimated blood alcohol content (an eBAC will have to be completed for each drinking day and then take the mean of all of these)
  • Peak estimated blood alcohol content (the highest eBAC in a day during each time period)
  • Peak drinking day (the highest number of drinks each participant had during each time period)
  • Frequency of heavy drinking days (5 or more for drinks in a day for men, 4 or more for drinks in a day for women)
  • Extreme heavy drinking days (10 or more drinks in a day for men, 8 or more drinks in a day for women)

The data is in wide format obviously but you can see that each participant has varying amounts of data (look at the difference between 2 and 4 for instance). I calculated a lot of the 30-day variables using arrays, but was wondering if there was a way to capture each participant's 14 days. Some will not have a full 14 days while others will have more than 14 usable days. However, I still would like to 

 

Before I completely understood the data, I was trying to use the following code. However, I realize that I need to approach the issue differently which brings me to the question of how to find the last 14 days of usable data when it varies for each participant

 

 

data Steady.Master;
set Steady.Master;
IPS_14Day_Drink= sum(of IPS_day_17 - IPS_day_30);
FU_14Day_Drink= sum(FU_day_17, FU_day_18, FU_day_19, FU_day_20, FU_day_21, FU_day_22, FU_day_23,
 FU_day_24, FU_day_25, FU_day_26, FU_day_27, FU_day_28,  FU_day_29, FU_day_30);
run;

/*Counting how many days they drank*/

data STEADy.master(drop=i drop=n9);
set STEADY.Master;
n9=0;
array _n(*) IPS_day_17 - IPS_day_30;
do i=1 to dim(_n);
if  _n(i) ge 1 then n9+1;
end;

IPS_14Day_Drink_Count=n9;
run;
data STEADy.master(drop=i drop=n9);
set STEADY.Master;
n9=0;
array _n(*) FU_day_17 FU_day_18 FU_day_19 FU_day_20 FU_day_21 FU_day_22 FU_day_23
 FU_day_24 FU_day_25 FU_day_26 FU_day_27 FU_day_28  FU_day_29 FU_day_30;
do i=1 to dim(_n);
if  _n(i) ge 1 then n9+1;
end;

FU_14Day_Drink_Count=n9;
run;

/*Creating new variables for mean drinks per week ((total drinks/14)*7)*/
data STEADY.master;
set STEADY.master;
IPS_14Day_Mean_Weekly_Drink = ((IPS_14Day_Drink/14)*7);
FU_14Day_Mean_Weekly_Drink = ((FU_14Day_Drink/14)*7);
IPS_14Day_Drinks_Per_Drink_Day = IPS_14Day_Drink/IPS_14Day_Drink_Count;
FU_14Day_Drinks_Per_Drink_Day = FU_14Day_Drink/FU_14Day_Drink_Count;
run;

I always appreciate any help with my code or materials to read. I have learned SAS almost entirely from this forum and really hold the regulars here in high regard for spending their time so graciously.

 

1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

The first thing I would do is to rearrange the data to a long format. Your variable names are not so good, SAS-wise, but it can be solved using a two-dimensional array:

data long;
  set field2;
  array data [30,2] FU_:;
  
  do day=1 to 30;
    FU_day=data[day,1];
    FU_time=data[day,2];
    output;
    end;
  keep Study_ID day FU_day FU_time;
run;

Then sort by descending day, and get the last 14 days with data (or a dummy row if no data at all):

proc sort data=long;
  by Study_ID descending day;
run;

data want;
  set long;
  by Study_ID;
  if first.Study_ID then
    counter=0;
  if not missing(FU_day) then do;
counter+1; if counter<=14 then output;
end; if last.Study_ID and counter=0 then output; /* to get the subjects with no data at all */ run;

It should then be quite easy to get the means and peaks and whatever.

 

 

 

View solution in original post

14 REPLIES 14
PaigeMiller
Diamond | Level 26

I'm not sure I fully understand, and you do not have complete data out through 30 days for any of the records in your sample data.

 

Nevertheless, you can find where the first dot appears on each row by using ARRAYs and the WHICHN function. I show only the array for FU_DAY. It's not easy when you also have variables named FU_DAY_1_TIME, because these can't be turned into an array as easily, without typing all 30 variable names. You probably want the variables named FU_DAY_TIME_1 etc. and then creating the array is easy, and I leave you to make those changes.

 

This code shows that the first dot for record one is on day 17, and so the last 14 variables without missing data are in days 3 through 16 (simple math), and from there you ought to be able to compute the statistics you want.

 

data WORK.FIELD2;
  infile datalines;
input Study_ID:BEST12. Sex:BEST12. FU_Day_1:BEST12. FU_Day_1_time:BEST12. FU_Day_2:BEST12. FU_Day_2_time:BEST12. FU_Day_3:BEST12. FU_Day_3_time:BEST12. FU_Day_4:BEST12. FU_Day_4_time:BEST12. FU_Day_5:BEST12. FU_Day_5_time:BEST12. FU_Day_6:BEST12. 
FU_Day_6_time:BEST12. FU_Day_7:BEST12. FU_Day_7_time:BEST12. FU_Day_8:BEST12. FU_Day_8_time:BEST12. FU_Day_9:BEST12. FU_Day_9_time:BEST12. FU_Day_10:BEST12. FU_Day_10_time:BEST12. FU_Day_11:BEST12. FU_Day_11_time:BEST12. FU_Day_12:BEST12. FU_Day_12_time:BEST12. 
FU_Day_13:BEST12. FU_Day_13_time:BEST12. FU_Day_14:BEST12. FU_Day_14_time:BEST12. FU_Day_15:BEST12. FU_Day_15_time:BEST12. FU_Day_16:BEST12. FU_Day_16_time:BEST12. FU_Day_17:BEST12. FU_Day_17_time:BEST12. FU_Day_18:BEST12. FU_Day_18_time:BEST12. 
FU_Day_19:BEST12. FU_Day_19_time:BEST12. FU_Day_20:BEST12. FU_Day_20_time:BEST12. FU_Day_21:BEST12. FU_Day_21_time:BEST12. FU_Day_22:BEST12. FU_Day_22_time:BEST12. FU_Day_23:BEST12. FU_Day_23_time:BEST12. FU_Day_24:BEST12. FU_Day_24_time:BEST12. FU_Day_25:BEST12. 
FU_Day_25_time:BEST12. FU_Day_26:BEST12. FU_Day_26_time:BEST12. FU_Day_27:BEST12. FU_Day_27_time:BEST12. FU_Day_28:BEST12. FU_Day_28_time:BEST12. FU_Day_29:BEST12. FU_Day_29_time:BEST12. FU_Day_30:BEST12. FU_Day_30_time:BEST12.;
array fu_day[*] fu_day_1-fu_day_30;
do i=1 to dim(fu_day);
	if missing(fu_day(i)) then fu_day(i)=-9999;
end;
first_dot_day = whichn(-9999,of fu_day[*]);
drop i;
datalines;
1 0 0 0 0 0 0 0 2 2 2 2 8 10 9 12 3 3 5 3 8 6 0 0 0 0 2 3 0 0 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 0 2 3 2 3 0 0 12 13.5 12 13.5 12 13.5 12 13.5 2 4 0 0 1 2 7 4 2 3.5 0 0 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3 0 5 3 0 0 0 0 2 0.5 4 2 3 2 0 0 4 2 0 0 3 4 2 1 0 0 0 0 0 0 3 2.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 5 0 0 0 0 2 1 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5 1 0 0 3 5 3 5 7 6 0 0 0 0 0 0 0 0 2 2 0 0 3 1.5 0 0 0 0 2 2 2 2 2 1 0 0 2 3 5 3.5 . . . . . . . . . . . . . . . . . . . . . .
6 0 0 0 0 0 0 0 0 0 3 1 3 3 3 3 0 0 3 1.5 0 0 0 0 3 2 3 2.5 4 4 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7 1 1 0.5 18 4 3 2 0 0 4 3 0 0 2 1 0 0 2 2 0 0 0 0 2 1 0 0 0 0 4 3 2 2 8 3 3 2 0 0 . . . . . . . . . . . . . . . . . . . . . .
8 0 14 7 8 5 0 0 2 2 0 0 0 0 3 2 2 3 0 0 4 3 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . .
9 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
;
--
Paige Miller
joebacon
Pyrite | Level 9
There are very few cases of those who go out to 30. However, I should include one in my sample data. I will update it to reflect this.
I definitely need to change the variable names for FU_Day_1_Time. For now, you could see that I had to input each of the 30 variables into arrays because of the variable format. It probably should've been one of my first steps.

Your solution works great if there are none missing before the first valid entry. Some of the data have missing observations before they reach any usable data. Is there a way to also add that it should start with a number that is not missing?
PaigeMiller
Diamond | Level 26

Your solution works great if there are none missing before the first valid entry. 

Please provide an example.

--
Paige Miller
joebacon
Pyrite | Level 9
data WORK.FIELD2;
  infile datalines truncover;

input Study_ID:BEST12. Sex:BEST12. 
FU_Day_1:BEST12. FU_Day_1_time:BEST12. FU_Day_2:BEST12. FU_Day_2_time:BEST12. FU_Day_3:BEST12. 
FU_Day_3_time:BEST12. FU_Day_4:BEST12. FU_Day_4_time:BEST12. FU_Day_5:BEST12. FU_Day_5_time:BEST12. FU_Day_6:BEST12. 
FU_Day_6_time:BEST12. FU_Day_7:BEST12. FU_Day_7_time:BEST12. FU_Day_8:BEST12. FU_Day_8_time:BEST12. FU_Day_9:BEST12. 
FU_Day_9_time:BEST12. FU_Day_10:BEST12. FU_Day_10_time:BEST12. FU_Day_11:BEST12. FU_Day_11_time:BEST12. FU_Day_12:BEST12. 
FU_Day_12_time:BEST12. FU_Day_13:BEST12. FU_Day_13_time:BEST12. FU_Day_14:BEST12. FU_Day_14_time:BEST12. FU_Day_15:BEST12. 
FU_Day_15_time:BEST12. FU_Day_16:BEST12. FU_Day_16_time:BEST12. FU_Day_17:BEST12. FU_Day_17_time:BEST12. FU_Day_18:BEST12. 
FU_Day_18_time:BEST12. FU_Day_19:BEST12. FU_Day_19_time:BEST12. FU_Day_20:BEST12. FU_Day_20_time:BEST12. FU_Day_21:BEST12. 
FU_Day_21_time:BEST12. FU_Day_22:BEST12. FU_Day_22_time:BEST12. FU_Day_23:BEST12. FU_Day_23_time:BEST12. FU_Day_24:BEST12. 
FU_Day_24_time:BEST12. FU_Day_25:BEST12. FU_Day_25_time:BEST12. FU_Day_26:BEST12. FU_Day_26_time:BEST12. FU_Day_27:BEST12. 
FU_Day_27_time:BEST12. FU_Day_28:BEST12. FU_Day_28_time:BEST12. FU_Day_29:BEST12. FU_Day_29_time:BEST12. FU_Day_30:BEST12. 
FU_Day_30_time:BEST12.;

format Study_ID BEST12. Sex BEST12. FU_Day_1 BEST12. FU_Day_1_time BEST12. FU_Day_2 BEST12. FU_Day_2_time BEST12. 
FU_Day_3 BEST12. FU_Day_3_time BEST12. FU_Day_4 BEST12. FU_Day_4_time BEST12. FU_Day_5 BEST12. 
FU_Day_5_time BEST12. FU_Day_6 BEST12. FU_Day_6_time BEST12. FU_Day_7 BEST12. FU_Day_7_time BEST12. FU_Day_8 BEST12. 
FU_Day_8_time BEST12. FU_Day_9 BEST12. FU_Day_9_time BEST12. FU_Day_10 BEST12. FU_Day_10_time BEST12. FU_Day_11 BEST12. 
FU_Day_11_time BEST12. FU_Day_12 BEST12. FU_Day_12_time BEST12. FU_Day_13 BEST12. FU_Day_13_time BEST12. FU_Day_14 BEST12. 
FU_Day_14_time BEST12. FU_Day_15 BEST12. FU_Day_15_time BEST12. FU_Day_16 BEST12. FU_Day_16_time BEST12. FU_Day_17 BEST12. 
FU_Day_17_time BEST12. FU_Day_18 BEST12. FU_Day_18_time BEST12. FU_Day_19 BEST12. FU_Day_19_time BEST12. FU_Day_20 BEST12. 
FU_Day_20_time BEST12. FU_Day_21 BEST12. FU_Day_21_time BEST12. FU_Day_22 BEST12. FU_Day_22_time BEST12. FU_Day_23 BEST12. 
FU_Day_23_time BEST12. FU_Day_24 BEST12. FU_Day_24_time BEST12. FU_Day_25 BEST12. FU_Day_25_time BEST12. FU_Day_26 BEST12. 
FU_Day_26_time BEST12. FU_Day_27 BEST12. FU_Day_27_time BEST12. FU_Day_28 BEST12. FU_Day_28_time BEST12. FU_Day_29 BEST12. 
FU_Day_29_time BEST12. FU_Day_30 BEST12. FU_Day_30_time BEST12.;
datalines;
1 0 0 0 0 0 0 0 2 2 2 2 8 10 9 12 3 3 5 3 8 6 0 0 0 0 2 3 0 0 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 0 2 3 2 3 0 0 12 13.5 12 13.5 12 13.5 12 13.5 2 4 0 0 1 2 7 4 2 3.5 0 0 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3 0 5 3 0 0 0 0 2 0.5 4 2 3 2 0 0 4 2 0 0 3 4 2 1 0 0 0 0 0 0 3 2.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 5 0 0 0 0 2 1 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5 1 0 0 3 5 3 5 7 6 0 0 0 0 0 0 0 0 2 2 0 0 3 1.5 0 0 0 0 2 2 2 2 2 1 0 0 2 3 5 3.5 . . . . . . . . . . . . . . . . . . . . . .
6 0 0 0 0 0 0 0 0 0 3 1 3 3 3 3 0 0 3 1.5 0 0 0 0 3 2 3 2.5 4 4 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7 1 1 0.5 18 4 3 2 0 0 4 3 0 0 2 1 0 0 2 2 0 0 0 0 2 1 0 0 0 0 4 3 2 2 8 3 3 2 0 0 . . . . . . . . . . . . . . . . . . . . . .
8 0 14 7 8 5 0 0 2 2 0 0 0 0 3 2 2 3 0 0 4 3 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . .
48 0 0 0 0 0 4 2.5 0 0 3 4 0 0 0 0 0 0 2 0.5 0 0 0 0 0 0 0 0 1 2.5 0 0 0 0 0 0 0 0 0 0 10 4.5 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
83 0 . . 2 1 0 0 0 0 2 1 4 2 0 0 5 4 0 0 0 0 0 0 0 0 5 2.5 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
;

I have updated the data with two different observations. One is of someone with full data where they dont hit a "missing" but I would still need the last 14 days (days 17-30). The other is an example of someone who is missing data and then has usable data.

 

Thank you for all the help thus far.

PaigeMiller
Diamond | Level 26

I have removed all of your formats and informats, as they don't apply to this example.

 

 

data WORK.FIELD2;
  infile datalines;
input Study_ID Sex FU_Day_1 FU_Day_1_time FU_Day_2 FU_Day_2_time FU_Day_3 FU_Day_3_time FU_Day_4 FU_Day_4_time FU_Day_5 FU_Day_5_time FU_Day_6 
FU_Day_6_time FU_Day_7 FU_Day_7_time FU_Day_8 FU_Day_8_time FU_Day_9 FU_Day_9_time FU_Day_10 FU_Day_10_time FU_Day_11 FU_Day_11_time FU_Day_12 FU_Day_12_time 
FU_Day_13 FU_Day_13_time FU_Day_14 FU_Day_14_time FU_Day_15 FU_Day_15_time FU_Day_16 FU_Day_16_time FU_Day_17 FU_Day_17_time FU_Day_18 FU_Day_18_time 
FU_Day_19 FU_Day_19_time FU_Day_20 FU_Day_20_time FU_Day_21 FU_Day_21_time FU_Day_22 FU_Day_22_time FU_Day_23 FU_Day_23_time FU_Day_24 FU_Day_24_time FU_Day_25 
FU_Day_25_time FU_Day_26 FU_Day_26_time FU_Day_27 FU_Day_27_time FU_Day_28 FU_Day_28_time FU_Day_29 FU_Day_29_time FU_Day_30 FU_Day_30_time;
array fu_day[*] fu_day_1-fu_day_30;
first_non_dot=.;
do i=1 to dim(fu_day);
	if not missing(fu_day(i)) and missing(first_non_dot) then first_non_dot=i;
	if missing(fu_day(i)) then fu_day(i)=-9999;
end;
first_dot_day = whichn(-9999,of fu_day[*]);

drop i;
datalines;
1 0 0 0 0 0 0 0 2 2 2 2 8 10 9 12 3 3 5 3 8 6 0 0 0 0 2 3 0 0 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 0 2 3 2 3 0 0 12 13.5 12 13.5 12 13.5 12 13.5 2 4 0 0 1 2 7 4 2 3.5 0 0 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3 0 5 3 0 0 0 0 2 0.5 4 2 3 2 0 0 4 2 0 0 3 4 2 1 0 0 0 0 0 0 3 2.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 5 0 0 0 0 2 1 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5 1 0 0 3 5 3 5 7 6 0 0 0 0 0 0 0 0 2 2 0 0 3 1.5 0 0 0 0 2 2 2 2 2 1 0 0 2 3 5 3.5 . . . . . . . . . . . . . . . . . . . . . .
6 0 0 0 0 0 0 0 0 0 3 1 3 3 3 3 0 0 3 1.5 0 0 0 0 3 2 3 2.5 4 4 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7 1 1 0.5 18 4 3 2 0 0 4 3 0 0 2 1 0 0 2 2 0 0 0 0 2 1 0 0 0 0 4 3 2 2 8 3 3 2 0 0 . . . . . . . . . . . . . . . . . . . . . .
8 0 14 7 8 5 0 0 2 2 0 0 0 0 3 2 2 3 0 0 4 3 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . .
9 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
48 0 0 0 0 0 4 2.5 0 0 3 4 0 0 0 0 0 0 2 0.5 0 0 0 0 0 0 0 0 1 2.5 0 0 0 0 0 0 0 0 0 0 10 4.5 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
83 0 . . 2 1 0 0 0 0 2 1 4 2 0 0 5 4 0 0 0 0 0 0 0 0 5 2.5 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
;

This gives you the location of the first non-dot and the last dot, and then you can select the last 14 values and work with them.

 

--
Paige Miller
PGStats
Opal | Level 21

Using arrays:

 

data WORK.FIELD2;
  infile datalines truncover;

input Study_ID:BEST12. Sex:BEST12. 
FU_Day_1:BEST12. FU_Day_1_time:BEST12. FU_Day_2:BEST12. FU_Day_2_time:BEST12. FU_Day_3:BEST12. 
FU_Day_3_time:BEST12. FU_Day_4:BEST12. FU_Day_4_time:BEST12. FU_Day_5:BEST12. FU_Day_5_time:BEST12. FU_Day_6:BEST12. 
FU_Day_6_time:BEST12. FU_Day_7:BEST12. FU_Day_7_time:BEST12. FU_Day_8:BEST12. FU_Day_8_time:BEST12. FU_Day_9:BEST12. 
FU_Day_9_time:BEST12. FU_Day_10:BEST12. FU_Day_10_time:BEST12. FU_Day_11:BEST12. FU_Day_11_time:BEST12. FU_Day_12:BEST12. 
FU_Day_12_time:BEST12. FU_Day_13:BEST12. FU_Day_13_time:BEST12. FU_Day_14:BEST12. FU_Day_14_time:BEST12. FU_Day_15:BEST12. 
FU_Day_15_time:BEST12. FU_Day_16:BEST12. FU_Day_16_time:BEST12. FU_Day_17:BEST12. FU_Day_17_time:BEST12. FU_Day_18:BEST12. 
FU_Day_18_time:BEST12. FU_Day_19:BEST12. FU_Day_19_time:BEST12. FU_Day_20:BEST12. FU_Day_20_time:BEST12. FU_Day_21:BEST12. 
FU_Day_21_time:BEST12. FU_Day_22:BEST12. FU_Day_22_time:BEST12. FU_Day_23:BEST12. FU_Day_23_time:BEST12. FU_Day_24:BEST12. 
FU_Day_24_time:BEST12. FU_Day_25:BEST12. FU_Day_25_time:BEST12. FU_Day_26:BEST12. FU_Day_26_time:BEST12. FU_Day_27:BEST12. 
FU_Day_27_time:BEST12. FU_Day_28:BEST12. FU_Day_28_time:BEST12. FU_Day_29:BEST12. FU_Day_29_time:BEST12. FU_Day_30:BEST12. 
FU_Day_30_time:BEST12.;

format Study_ID BEST12. Sex BEST12. FU_Day_1 BEST12. FU_Day_1_time BEST12. FU_Day_2 BEST12. FU_Day_2_time BEST12. 
FU_Day_3 BEST12. FU_Day_3_time BEST12. FU_Day_4 BEST12. FU_Day_4_time BEST12. FU_Day_5 BEST12. 
FU_Day_5_time BEST12. FU_Day_6 BEST12. FU_Day_6_time BEST12. FU_Day_7 BEST12. FU_Day_7_time BEST12. FU_Day_8 BEST12. 
FU_Day_8_time BEST12. FU_Day_9 BEST12. FU_Day_9_time BEST12. FU_Day_10 BEST12. FU_Day_10_time BEST12. FU_Day_11 BEST12. 
FU_Day_11_time BEST12. FU_Day_12 BEST12. FU_Day_12_time BEST12. FU_Day_13 BEST12. FU_Day_13_time BEST12. FU_Day_14 BEST12. 
FU_Day_14_time BEST12. FU_Day_15 BEST12. FU_Day_15_time BEST12. FU_Day_16 BEST12. FU_Day_16_time BEST12. FU_Day_17 BEST12. 
FU_Day_17_time BEST12. FU_Day_18 BEST12. FU_Day_18_time BEST12. FU_Day_19 BEST12. FU_Day_19_time BEST12. FU_Day_20 BEST12. 
FU_Day_20_time BEST12. FU_Day_21 BEST12. FU_Day_21_time BEST12. FU_Day_22 BEST12. FU_Day_22_time BEST12. FU_Day_23 BEST12. 
FU_Day_23_time BEST12. FU_Day_24 BEST12. FU_Day_24_time BEST12. FU_Day_25 BEST12. FU_Day_25_time BEST12. FU_Day_26 BEST12. 
FU_Day_26_time BEST12. FU_Day_27 BEST12. FU_Day_27_time BEST12. FU_Day_28 BEST12. FU_Day_28_time BEST12. FU_Day_29 BEST12. 
FU_Day_29_time BEST12. FU_Day_30 BEST12. FU_Day_30_time BEST12.;
datalines;
1 0 0 0 0 0 0 0 2 2 2 2 8 10 9 12 3 3 5 3 8 6 0 0 0 0 2 3 0 0 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 0 2 3 2 3 0 0 12 13.5 12 13.5 12 13.5 12 13.5 2 4 0 0 1 2 7 4 2 3.5 0 0 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3 0 5 3 0 0 0 0 2 0.5 4 2 3 2 0 0 4 2 0 0 3 4 2 1 0 0 0 0 0 0 3 2.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 5 0 0 0 0 2 1 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5 1 0 0 3 5 3 5 7 6 0 0 0 0 0 0 0 0 2 2 0 0 3 1.5 0 0 0 0 2 2 2 2 2 1 0 0 2 3 5 3.5 . . . . . . . . . . . . . . . . . . . . . .
6 0 0 0 0 0 0 0 0 0 3 1 3 3 3 3 0 0 3 1.5 0 0 0 0 3 2 3 2.5 4 4 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7 1 1 0.5 18 4 3 2 0 0 4 3 0 0 2 1 0 0 2 2 0 0 0 0 2 1 0 0 0 0 4 3 2 2 8 3 3 2 0 0 . . . . . . . . . . . . . . . . . . . . . .
8 0 14 7 8 5 0 0 2 2 0 0 0 0 3 2 2 3 0 0 4 3 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . .
48 0 0 0 0 0 4 2.5 0 0 3 4 0 0 0 0 0 0 2 0.5 0 0 0 0 0 0 0 0 1 2.5 0 0 0 0 0 0 0 0 0 0 10 4.5 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
83 0 . . 2 1 0 0 0 0 2 1 4 2 0 0 5 4 0 0 0 0 0 0 0 0 5 2.5 0 0 9 8.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
;

data want_f2w;
set field2;
array fu fu_day_1 - fu_day_30;
array f2w_{14};

do i = dim(fu) to 1 by -1;
    if missing(last) then do;
        if i < dim(f2w_) then leave;
        if not missing(fu{i}) then last = i;
        end;
    else do;
        if missing(fu{i}) then call missing(last);
        else if i  <= last - dim(f2W_) + 1 then do;
            do j = 1 to dim(f2w_);
                f2w_{j} = fu{i + j -1};
                end;
            leave;
            end;
        end;
    end;
keep study_id sex f2w:;
run;

proc print data=want_f2w noobs; run;
Study_ID 	Sex 	f2w_1 	f2w_2 	f2w_3 	f2w_4 	f2w_5 	f2w_6 	f2w_7 	f2w_8 	f2w_9 	f2w_10 	f2w_11 	f2w_12 	f2w_13 	f2w_14
1 	0 	0 	2 	2 	8 	9 	3 	5 	8 	0 	0 	2 	0 	0 	2
2 	0 	2 	0 	12 	12 	12 	12 	2 	0 	1 	7 	2 	0 	0 	0
3 	0 	0 	0 	2 	4 	3 	0 	4 	0 	3 	2 	0 	0 	0 	3
4 	0 	2 	0 	0 	0 	0 	0 	0 	0 	9 	0 	0 	2 	0 	2
5 	1 	0 	0 	0 	2 	0 	3 	0 	0 	2 	2 	2 	0 	2 	5
6 	0 	0 	0 	0 	3 	3 	3 	0 	3 	0 	0 	3 	3 	4 	0
7 	1 	0 	2 	0 	2 	0 	0 	2 	0 	0 	4 	2 	8 	3 	0
8 	0 	0 	0 	3 	2 	0 	4 	0 	0 	0 	0 	0 	1 	0 	0
48 	0 	0 	0 	0 	10 	0 	0 	0 	0 	0 	1 	0 	0 	0 	0
83 	0 	2 	0 	0 	2 	4 	0 	5 	0 	0 	0 	0 	5 	0 	9

Note, I fixed obs study_id=83 in example dataset to make the sequence 14 long.

PG
PGStats
Opal | Level 21

What's the difference between FU_Day_1 and  FU_Day_1_time?

PG
joebacon
Pyrite | Level 9

FU_Day_1 is a count variable of how many alcoholic drinks individuals had. FU_DAY_1_Time is how long it took them to drink those drinks. I use the time variable to calculate BAC but I am more concerned about the FU_Day_1. I included the FU_Day_1_Time variables as a way of saying that I have other variables that look similar that I will use in an array as well.

PGStats
Opal | Level 21

A side note. If it's not too late, and if you intend to work with SAS a lot on this, I suggest that you put the number part of your variable names at the end. That way you may use SAS variable list shortcuts such as FU_Day_Time_1 - FU_Day_Time_30 to refer to the whole series.

PG
ballardw
Super User

Extremely long lines of code are problematic with this forum.

Copy/ paste of your code will not run because informats are split across lines.

 

I am not going to attempt to fix it to fit a reasonable line length.

 

Do you really have lines of code longer than 250 characters or do we some artifact of different systems?

 

PS: You could use an INFORMAT statement and avoid having to type the same informat for multiple variables. And since you do not show any values anywhere near 12 characters you don't even need to define an informat as the default BEST.

And even if you do need to specify an informat there are shorthand ways to avoid repeating the informat that many times.

 

PGStats
Opal | Level 21

If it may help - cleaned up code:

 

data WORK.FIELD2;
  infile datalines truncover;

input Study_ID:BEST12. Sex:BEST12. 
FU_Day_1:BEST12. FU_Day_1_time:BEST12. FU_Day_2:BEST12. FU_Day_2_time:BEST12. FU_Day_3:BEST12. 
FU_Day_3_time:BEST12. FU_Day_4:BEST12. FU_Day_4_time:BEST12. FU_Day_5:BEST12. FU_Day_5_time:BEST12. FU_Day_6:BEST12. 
FU_Day_6_time:BEST12. FU_Day_7:BEST12. FU_Day_7_time:BEST12. FU_Day_8:BEST12. FU_Day_8_time:BEST12. FU_Day_9:BEST12. 
FU_Day_9_time:BEST12. FU_Day_10:BEST12. FU_Day_10_time:BEST12. FU_Day_11:BEST12. FU_Day_11_time:BEST12. FU_Day_12:BEST12. 
FU_Day_12_time:BEST12. FU_Day_13:BEST12. FU_Day_13_time:BEST12. FU_Day_14:BEST12. FU_Day_14_time:BEST12. FU_Day_15:BEST12. 
FU_Day_15_time:BEST12. FU_Day_16:BEST12. FU_Day_16_time:BEST12. FU_Day_17:BEST12. FU_Day_17_time:BEST12. FU_Day_18:BEST12. 
FU_Day_18_time:BEST12. FU_Day_19:BEST12. FU_Day_19_time:BEST12. FU_Day_20:BEST12. FU_Day_20_time:BEST12. FU_Day_21:BEST12. 
FU_Day_21_time:BEST12. FU_Day_22:BEST12. FU_Day_22_time:BEST12. FU_Day_23:BEST12. FU_Day_23_time:BEST12. FU_Day_24:BEST12. 
FU_Day_24_time:BEST12. FU_Day_25:BEST12. FU_Day_25_time:BEST12. FU_Day_26:BEST12. FU_Day_26_time:BEST12. FU_Day_27:BEST12. 
FU_Day_27_time:BEST12. FU_Day_28:BEST12. FU_Day_28_time:BEST12. FU_Day_29:BEST12. FU_Day_29_time:BEST12. FU_Day_30:BEST12. 
FU_Day_30_time:BEST12.;

format Study_ID BEST12. Sex BEST12. FU_Day_1 BEST12. FU_Day_1_time BEST12. FU_Day_2 BEST12. FU_Day_2_time BEST12. 
FU_Day_3 BEST12. FU_Day_3_time BEST12. FU_Day_4 BEST12. FU_Day_4_time BEST12. FU_Day_5 BEST12. 
FU_Day_5_time BEST12. FU_Day_6 BEST12. FU_Day_6_time BEST12. FU_Day_7 BEST12. FU_Day_7_time BEST12. FU_Day_8 BEST12. 
FU_Day_8_time BEST12. FU_Day_9 BEST12. FU_Day_9_time BEST12. FU_Day_10 BEST12. FU_Day_10_time BEST12. FU_Day_11 BEST12. 
FU_Day_11_time BEST12. FU_Day_12 BEST12. FU_Day_12_time BEST12. FU_Day_13 BEST12. FU_Day_13_time BEST12. FU_Day_14 BEST12. 
FU_Day_14_time BEST12. FU_Day_15 BEST12. FU_Day_15_time BEST12. FU_Day_16 BEST12. FU_Day_16_time BEST12. FU_Day_17 BEST12. 
FU_Day_17_time BEST12. FU_Day_18 BEST12. FU_Day_18_time BEST12. FU_Day_19 BEST12. FU_Day_19_time BEST12. FU_Day_20 BEST12. 
FU_Day_20_time BEST12. FU_Day_21 BEST12. FU_Day_21_time BEST12. FU_Day_22 BEST12. FU_Day_22_time BEST12. FU_Day_23 BEST12. 
FU_Day_23_time BEST12. FU_Day_24 BEST12. FU_Day_24_time BEST12. FU_Day_25 BEST12. FU_Day_25_time BEST12. FU_Day_26 BEST12. 
FU_Day_26_time BEST12. FU_Day_27 BEST12. FU_Day_27_time BEST12. FU_Day_28 BEST12. FU_Day_28_time BEST12. FU_Day_29 BEST12. 
FU_Day_29_time BEST12. FU_Day_30 BEST12. FU_Day_30_time BEST12.;
datalines;
1 0 0 0 0 0 0 0 2 2 2 2 8 10 9 12 3 3 5 3 8 6 0 0 0 0 2 3 0 0 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 0 2 3 2 3 0 0 12 13.5 12 13.5 12 13.5 12 13.5 2 4 0 0 1 2 7 4 2 3.5 0 0 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3 0 5 3 0 0 0 0 2 0.5 4 2 3 2 0 0 4 2 0 0 3 4 2 1 0 0 0 0 0 0 3 2.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 5 0 0 0 0 2 1 0 0 2 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5 1 0 0 3 5 3 5 7 6 0 0 0 0 0 0 0 0 2 2 0 0 3 1.5 0 0 0 0 2 2 2 2 2 1 0 0 2 3 5 3.5 . . . . . . . . . . . . . . . . . . . . . .
6 0 0 0 0 0 0 0 0 0 3 1 3 3 3 3 0 0 3 1.5 0 0 0 0 3 2 3 2.5 4 4 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7 1 1 0.5 18 4 3 2 0 0 4 3 0 0 2 1 0 0 2 2 0 0 0 0 2 1 0 0 0 0 4 3 2 2 8 3 3 2 0 0 . . . . . . . . . . . . . . . . . . . . . .
8 0 14 7 8 5 0 0 2 2 0 0 0 0 3 2 2 3 0 0 4 3 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . .
9 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
;
PG
joebacon
Pyrite | Level 9
Thank you, PG Stats. I was working on making that more legible to help get my answer.
joebacon
Pyrite | Level 9
Is there a best way to get from data to datastep? I used Reeza's old macro which provided those long lines of code. In the future, is this a formatting issue where I simply can't copy and paste from the log from that macro? Should I just type it all out in the future? Always looking for a better way to do things.

I think the lines of code are stemming from that Macro which I didn't realize pasted as it did, apologies.

I can also include a sample observation that has missing data first and then usable data afterward.
s_lassen
Meteorite | Level 14

The first thing I would do is to rearrange the data to a long format. Your variable names are not so good, SAS-wise, but it can be solved using a two-dimensional array:

data long;
  set field2;
  array data [30,2] FU_:;
  
  do day=1 to 30;
    FU_day=data[day,1];
    FU_time=data[day,2];
    output;
    end;
  keep Study_ID day FU_day FU_time;
run;

Then sort by descending day, and get the last 14 days with data (or a dummy row if no data at all):

proc sort data=long;
  by Study_ID descending day;
run;

data want;
  set long;
  by Study_ID;
  if first.Study_ID then
    counter=0;
  if not missing(FU_day) then do;
counter+1; if counter<=14 then output;
end; if last.Study_ID and counter=0 then output; /* to get the subjects with no data at all */ run;

It should then be quite easy to get the means and peaks and whatever.

 

 

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 1382 views
  • 2 likes
  • 5 in conversation