## Difference between values in two rows

Frequent Contributor
Posts: 84

Hi All,

I have a dataset with 20 variables and 16 subjects, but each subject has two rows since I have two timepoints (1 and 2) for each subject. I want to add a row for each subject, marked as timepoint 3, whose values are the difference (timepoint 2 - timepoint 1). For example:

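| subject id | timepoint | lab1 | lab2 | lab3 | lab4 | lab5 |
|---|---|---|---|---|---|---|
| 1 | 1 | 0.5 | 0.6 | 15 | 18 | 12 |
| 1 | 2 | 0.4 | 0.6 | 12 | 20 | 18 |
| 1 | 3 | -0.1 | 0.0 | -3 | 2 | 6 |
| 2 | 1 | 0.9 | 1.3 | 14 | 18 | 21 |
| 2 | 2 | 0.3 | 1.7 | 19 | 22 | 14 |
| 2 | 3 | -0.6 | 0.4 | 5 | 4 | -7 |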

The rows marked in red (timepoint 3) are what I want.

Any ideas?

Thanks all,

Chen

Super User
Posts: 21,992

## Re: Difference between values in two rows

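```
Output;  /* write the original record first */

/* DIF(x) returns x minus its value on the previous call, so call it on every row */
Lab1=dif(lab1); lab2=dif(lab2);.......etc;

If timePoint = 2 then do;
TimePoint=3;
Output;  /* write the difference row as timepoint 3 */
End;
```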

Explicitly output the records.

Use DIF to calculate the difference.

Occasional Contributor
Posts: 10

## Re: Difference between values in two rows

Can this technique be tweaked to always calculate the dif from the first obs rather than the preceding obs?

Super User
Posts: 21,992

## Re: Difference between values in two rows

kryden wrote:

Can this technique be tweaked to always calculate the dif from the first obs rather than the preceding obs?

Which technique?

Assuming mine: not quite, but there are easier ways to calculate the difference from the first observation.

Rather than using the DIF() function, you can use the RETAIN statement to hold a value across rows. Set it on the first observation (or on the `first.` record of each BY group) and subtract it from each later value. Untested, and it probably doesn't account for the first record correctly.

```
retain first_obs;

if first.id then first_obs = value;

dif = value - first_obs;
```
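
For context, here is a minimal sketch of how that fragment might sit inside a complete DATA step. The dataset name `have`, the variables `id`, `visit`, and `value`, the result variable `diff_from_first`, and the sample values are all assumptions made up for illustration. `first.id` only exists when the step has a `by id` statement and the data are sorted by `id`.

```
/* Hypothetical sample data: two subjects, three visits each */
data have;
  input id visit value;
  datalines;
1 1 10
1 2 12
1 3 9
2 1 20
2 2 25
2 3 18
;
run;

proc sort data=have;
  by id visit;
run;

data want;
  set have;
  by id;                                /* required for FIRST.id */
  retain first_obs;                     /* keep the value across rows */
  if first.id then first_obs = value;   /* baseline = first row of each id */
  diff_from_first = value - first_obs;  /* 0 on the baseline row itself */
run;
```

Because the baseline is reset on each `first.id` row, the difference on that row is simply 0, and later rows for the same `id` are compared against it rather than against the preceding row.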
Occasional Contributor
Posts: 10

## Re: Difference between values in two rows

```
data phys_diff;
set phys1;
AGG_PHYS_Mean = dif(AGG_PHYS_Mean);
AGG_PHYS_StdDev = dif(AGG_PHYS_StdDev);
AGG_PHYS_Median = dif(AGG_PHYS_Median);
AGG_PHYS_Q1 = dif(AGG_PHYS_Q1);
AGG_PHYS_Q3 = dif(AGG_PHYS_Q3);
AGG_PHYS_Min = dif(AGG_PHYS_Min);
AGG_PHYS_Max = dif(AGG_PHYS_Max);

run;
```

Above is what I started with.

Works great except the new values are the differences from the immediately preceding obs.

How do I adapt what you just wrote to calc the difference from baseline for each var?  Not represented here is that this is done by visit.

Super User
Posts: 21,992

## Re: Difference between values in two rows

You replicate the code I have for each variable. You can list multiple variables in the RETAIN statement, but the assignment statements have to happen for each group. I'm assuming you're also calculating that difference across patients or groups? I.e., you need to determine the baseline for multiple groups?
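
As a rough sketch of what replicating the RETAIN logic across several variables could look like, arrays can stand in for the repeated statements. Everything named here is hypothetical: the dataset `have`, the grouping variable `group_id`, the variables `lab1-lab3`, and the helper variables `base1-base3` and `chg1-chg3`. It also assumes the baseline is the first row within each group.

```
/* Hypothetical sample data: two groups, baseline row first */
data have;
  input group_id visit lab1-lab3;
  datalines;
1 1 0.5 15 12
1 2 0.4 12 18
1 3 0.7 14 10
2 1 0.9 14 21
2 2 0.3 19 14
;
run;

data want;
  set have;
  by group_id;                    /* data assumed sorted by group_id */
  array vals(*) lab1-lab3;        /* analysis variables */
  array base(*) base1-base3;      /* baseline holders */
  array chg(*)  chg1-chg3;        /* change from baseline */
  retain base1-base3;             /* keep baselines across rows */
  do i = 1 to dim(vals);
    if first.group_id then base(i) = vals(i);  /* reset at each new group */
    chg(i) = vals(i) - base(i);
  end;
  drop i base1-base3;
run;
```

Writing the change into new `chg` variables rather than overwriting `lab1-lab3` keeps the original values available for checking.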

You should post this as a new question with sample data.

This is probably what you want; it's a different approach but probably just as quick:

https://communities.sas.com/t5/Base-SAS-Programming/Calculate-a-difference-from-quot-baseline-quot-d...

Occasional Contributor
Posts: 10

## Re: Difference between values in two rows

Will post as new question.

Thanks for the help!

Super User
Posts: 8,798

## Re: Difference between values in two rows

Post test data in the form of a datastep!!

As such, this is only theory:

```
data inter;
merge have (where=(timepoint=1))
have (where=(timepoint=2) rename=(lab1=lbx1 lab2=lbx2...));
by subject_id;
timepoint=3;
/* lab1 holds timepoint 1, lbx1 holds timepoint 2 */
lab1=lbx1-lab1;
lab2=lbx2-lab2;
...;
drop lbx:;
run;

data want;
set have inter;
run;

proc sort data=want;
by subject_id timepoint;
run;
```
PROC Star
Posts: 7,858

## Re: Difference between values in two rows

Same solution as @Reeza but, to save keystrokes (and reduce the chance of making a typo), I'd include an array:

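```
data want;
input subject_id  timepoint lab1-lab5;
array labs(*) lab1-lab5;
Output;

/* One DIF call inside a loop shares a single queue across all five
   variables, so use DIF5 (lag = array size) to reach the same
   variable on the previous row. */
do i=1 to 5;
Labs(i)=dif5(labs(i));
end;
drop i;

If timePoint = 2 then do;
TimePoint=3;
Output;
end;
cards;
1  1   0.5    0.6    15     18   12
1  2   0.4    0.6    12     20   18
2  1   0.9    1.3    14     18   21
2  2   0.3    1.7    19     22   14
;
```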

Art, CEO, AnalystFinder.com
