Creating a variable that contains values that are computed within clusters

Hi all,

I want to add a variable to my data to do some analyses but I am not sure how to do it.

The variable I need is the time interval between sessions.

The tricky part is that my data are nested (level 1 = episode, level 2 = task, level 3 = ID), and I need to compute the interval within the smallest cluster, that is, between consecutive episodes of the same task.

I think it is easiest to show the data I have and then explain the format I want.

Here is part of my data (have).

data have;
  input ID $ task episode Session;
  datalines;
1 1 4 14.31
1 1 6 40.51
1 2 3 7.09
1 2 5 14.69
1 2 10 61.23
1 2 14 137.16
1 2 16 160.66
1 2 20 200.93
1 2 22 236.07
1 2 29 330.68
1 2 33 380.67
1 2 37 435.19
1 3 17 113.78
1 3 29 273.77
1 4 3 11.09
1 4 7 142.75
1 4 11 173.65
1 4 43 645.13
2 1 4 66.38
2 1 6 93.22
2 1 8 105.6
2 2 15 121.76
2 2 24 168.51
2 2 28 196.26
2 2 32 216.11
2 2 34 247.21
2 2 38 317.76
2 2 42 347.82
2 2 50 460.93
2 2 52 463.93
2 2 56 528.77
;

So what I want to do is create a variable called "Interval" which contains the difference between Sessions.

Note that I want to compute the Interval WITHIN the tasks.

It is hard for me to describe precisely, so here is the final data set (want) I have in mind, built from the have data above.

ID  task  episode  Session  Interval
1   1     4        14.31    NA
1   1     6        40.51    (40.51-14.31)
1   2     3        7.09     NA
1   2     5        14.69    (14.69-7.09)
1   2     10       61.23    (61.23-14.69)
1   2     14       137.16   (137.16-61.23)
1   2     16       160.66   …
1   2     20       200.93   …
1   2     22       236.07   …
1   2     29       330.68   …
1   2     33       380.67   …
1   2     37       435.19   …
1   3     17       113.78   NA
1   3     29       273.77   (273.77-113.78)
1   4     3        11.09    NA
1   4     7        142.75   (142.75-11.09)
1   4     11       173.65   (173.65-142.75)
1   4     43       645.13   (645.13-173.65)
2   1     4        66.38    NA
2   1     6        93.22    (93.22-66.38)
2   1     8        105.6    (105.6-93.22)
2   2     15       121.76   NA
2   2     24       168.51   (168.51-121.76)
2   2     28       196.26   (196.26-168.51)
2   2     32       216.11   …
2   2     34       247.21   …
2   2     38       317.76   …
2   2     42       347.82   …
2   2     50       460.93   …
2   2     52       463.93   …
2   2     56       528.77   …

Note that for Interval I have written out the calculations in parentheses, but I want the computed values in the data set.

I have put "NA" for the first episode in each task since it has no preceding episode (it doesn't have to be NA; it can be any number).

I have also replaced some values of Interval with "…" for brevity.

Again, note that Interval should never be negative, because I am computing it within a task.

Thanks in advance for any help!

Hanjoe.

Re: Creating a variable that contains values that are computed within clusters

Posted in reply to HanJoeKim

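data want;
  set have;
  by id task;  /* have must already be sorted by id and task */
  /* IFN evaluates all of its arguments, so DIF runs on every row and its queue stays in sync; */
  /* the first episode of each task gets a missing interval. */
  interval = ifn(first.task, ., dif(session));
run;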
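
For comparison, here is a sketch of a more explicit variant of the same idea. It assumes the data may not arrive sorted and that Session increases with episode within a task, as it does in the sample; the data set names have_sorted and want_alt are made up for illustration. The point it shows is that DIF should be called on every row and the result reset afterwards, rather than calling DIF only inside a condition, which would leave a stale value in its queue.

```sas
/* BY-group processing needs the input sorted by the BY variables. */
/* have_sorted is a hypothetical name for the sorted copy.         */
proc sort data=have out=have_sorted;
  by id task episode;
run;

/* want_alt is a hypothetical name for the output data set. */
data want_alt;
  set have_sorted;
  by id task;

  /* Call DIF unconditionally so its queue always holds the   */
  /* Session value from the immediately preceding observation. */
  interval = dif(session);

  /* The first episode of a task has no predecessor in that   */
  /* task, so overwrite the cross-task difference with missing. */
  if first.task then interval = .;
run;
```

With the sample data, the second row of ID 1, task 1 would get 40.51 - 14.31, and every first row of a task would be missing.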

Haikuo
