Hi all,
I want to add a variable to my data for some analyses, but I am not sure how to do it.
The variable I need is the time interval between sessions.
The tricky part is that my data are nested (level 1 = episode, level 2 = task, level 3 = ID), and I need to compute the interval within the smallest cluster, that is, between consecutive episodes of the same task.
I think it is best to show the data I have and then explain the format I want.
Here is part of my data (have).
data have;
input ID $ task episode Session;
datalines;
1 1 4 14.31
1 1 6 40.51
1 2 3 7.09
1 2 5 14.69
1 2 10 61.23
1 2 14 137.16
1 2 16 160.66
1 2 20 200.93
1 2 22 236.07
1 2 29 330.68
1 2 33 380.67
1 2 37 435.19
1 3 17 113.78
1 3 29 273.77
1 4 3 11.09
1 4 7 142.75
1 4 11 173.65
1 4 43 645.13
2 1 4 66.38
2 1 6 93.22
2 1 8 105.6
2 2 15 121.76
2 2 24 168.51
2 2 28 196.26
2 2 32 216.11
2 2 34 247.21
2 2 38 317.76
2 2 42 347.82
2 2 50 460.93
2 2 52 463.93
2 2 56 528.77
;
What I want to do is create a variable called "Interval" that contains the difference between the Session values of consecutive episodes.
Note that I want to compute Interval WITHIN each task.
It is difficult for me to say clearly what I want, so let me show the final data set (want) that I have in mind, based on the have data above.
ID task episode Session Interval
1 1 4 14.31 NA
1 1 6 40.51 (40.51-14.31)
1 2 3 7.09 NA
1 2 5 14.69 (14.69-7.09)
1 2 10 61.23 (61.23-14.69)
1 2 14 137.16 (137.16-61.23)
1 2 16 160.66 …
1 2 20 200.93 …
1 2 22 236.07 …
1 2 29 330.68 …
1 2 33 380.67 …
1 2 37 435.19 …
1 3 17 113.78 NA
1 3 29 273.77 (273.77-113.78)
1 4 3 11.09 NA
1 4 7 142.75 (142.75-11.09)
1 4 11 173.65 (173.65-142.75)
1 4 43 645.13 (645.13-173.65)
2 1 4 66.38 NA
2 1 6 93.22 (93.22-66.38)
2 1 8 105.6 (105.6-93.22)
2 2 15 121.76 NA
2 2 24 168.51 (168.51-121.76)
2 2 28 196.26 (196.26-168.51)
2 2 32 216.11 …
2 2 34 247.21 …
2 2 38 317.76 …
2 2 42 347.82 …
2 2 50 460.93 …
2 2 52 463.93 …
2 2 56 528.77 …
Note that for the variable Interval I have written out the calculations, but I want the actual value of the expression in the parentheses.
I have put "NA" for the first episode in each task, since it has no preceding episode (it does not have to be NA; any value will do).
I have also replaced some values of Interval with "…" for brevity.
Again, note that there should be no negative values for Interval, because I am computing the interval within a task.
Thanks in advance for any help!
Hanjoe.
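data want;
  set have;
  by id task;  /* have must already be sorted by ID and task */
  /* IFN evaluates all of its arguments, so DIF runs on every row; */
  /* the first episode of each task gets a missing interval */
  interval = ifn(first.task, ., dif(session));
run;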
Haikuo
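For anyone who finds the IFN one-liner hard to read, here is a minimal sketch of the same idea as two statements. It assumes the have data set is sorted by ID and task, as in the sample. The point to keep in mind is that DIF has to be called on every row: if it only ran inside a conditional branch, it would compare against the last value it was given rather than the previous row. The PROC PRINT step is an optional check I added for the "no negative values" requirement; it is not part of the original answer.

```sas
/* Sketch: same logic as the IFN version, written as two statements. */
data want;
  set have;                           /* assumes have is sorted by ID and task */
  by id task;
  interval = dif(session);            /* call DIF unconditionally on every row */
  if first.task then interval = .;    /* no preceding episode within this task */
run;

/* Optional check: list any negative intervals. Missing values compare as */
/* less than zero in SAS, so they are excluded explicitly.                */
proc print data=want;
  where not missing(interval) and interval < 0;
run;
```

If the PROC PRINT step lists no observations, every non-missing interval is zero or positive, which is what the question expects for Session values that increase within a task.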