Difference from Baseline Calculation

Occasional Contributor
Posts: 10

Difference from Baseline Calculation

Hey all,

So I have some output from PROC MEANS where I have some basic values spread across four visits.  The first visit is considered the baseline visit.

I am trying to crate a dataset that contains the difference from baseline for each variable at each visit:

Anyone have any thoughts on how I can do this?

PROC Star
Posts: 8,164

Re: Difference from Baseline Calculation

[ Edited ]

Here is one way:

```data have; informat visit \$25.; input visit & n Mean SD Med Q1 Q3 Min Max; cards;Baseline Visit   48 33.8 9.6 32.6 26.2 38.1 18.6 59.76 Week Follow-up   35 37.8 7.1 38.1 31.2 42.2 22.8 52.53 Month Follow-up    25 42.3 8 42.4 37.3 47.8 18.4 56.46 Month Follow-up    2 54.5 1.5 54.5 53.5 55.6 53.5 55.6;data want (drop=_:); set have; retain _Mean _SD _Med _Q1 _Q3 _Min _Max; if _n_ eq 1 then do; _Mean=Mean; _SD=SD; _Med=Med; _Q1=Q1; _Q3=Q3; _Min=Min; _Max=Max; end; else do; Mean+-_Mean; SD+-_SD; Med+-_Med; Q1+-_Q1; Q3+-_Q3; Min+-_Min; Max+-_Max; output; end;run;
```

Art, CEO, AnalystFinder.com

Note: Corrected to drop retained variables

Posts: 1,337

Re: Difference from Baseline Calculation

[ Edited ]

This is an alternative to @art297's suggestion.  It takes advantage of the DIF function (where dif(x) is defined as x-lag(x)), and interspersing a baseline observation between each of the other visits:

``````data have;
informat visit \$25.;
input visit & n Mean SD Med Q1 Q3 Min Max;
cards;
Baseline Visit           48 33.8 9.6 32.6 26.2 38.1 18.6 59.7
6 Week Follow-up         35 37.8 7.1 38.1 31.2 42.2 22.8 52.5
3 Month Follow-up        25 42.3 8 42.4 37.3 47.8 18.4 56.4
6 Month Follow-up        2 54.5 1.5 54.5 53.5 55.6 53.5 55.6
run;

data want (drop=i);
set have (where=(visit=:'B'))
have (where=(visit=:'6 Week') in=in6w)
have (where=(visit=:'B'))
have (where=(visit=:'3 M') in=in3m)
have (where=(visit=:'B'))
have (where=(visit=:'6 M') in=in6m);
array x {*} mean--max;
do i=1 to dim(x);
x{i}=dif(x{i});
end;
if in6w or in3m or in6m;
run;
``````

The SET statement interleaves 6 subsets of HAVE, where one subset (visit=:'B') is read 3 times, once preceding the visit=:'6 W', once before visit=:'3 M' and once preceding '6 M'.  This means the DIF function, which calculates x-lag(X), will be generating

X at 6 weeks minus X at baseline

X at 3 months minus X at baseline
X at 6 months minus X at baseline

The   "where" parameters use the operator  "=:", which says to take the shorter string and compare it only to the start of the longer string.

Discussion stats
• 2 replies
• 196 views
• 3 likes
• 3 in conversation