DATA Step, Macro, Functions and more

Difference from Baseline Calculation

Reply
Occasional Contributor
Posts: 10

Difference from Baseline Calculation

Hey all,

 

So I have some output from PROC MEANS where I have some basic values spread across four visits.  The first visit is considered the baseline visit.

I am trying to crate a dataset that contains the difference from baseline for each variable at each visit:

example.png

Anyone have any thoughts on how I can do this?

 

PROC Star
Posts: 8,164

Re: Difference from Baseline Calculation

[ Edited ]

Here is one way:

data have;
informat visit $25.;
input visit & n Mean SD Med Q1 Q3 Min Max;
cards;
Baseline Visit 48 33.8 9.6 32.6 26.2 38.1 18.6 59.7
6 Week Follow-up 35 37.8 7.1 38.1 31.2 42.2 22.8 52.5
3 Month Follow-up 25 42.3 8 42.4 37.3 47.8 18.4 56.4
6 Month Follow-up 2 54.5 1.5 54.5 53.5 55.6 53.5 55.6
;

data want (drop=_:);
set have;
retain _Mean _SD _Med _Q1 _Q3 _Min _Max;
if _n_ eq 1 then do;
_Mean=Mean;
_SD=SD;
_Med=Med;
_Q1=Q1;
_Q3=Q3;
_Min=Min;
_Max=Max;
end;
else do;
Mean+-_Mean;
SD+-_SD;
Med+-_Med;
Q1+-_Q1;
Q3+-_Q3;
Min+-_Min;
Max+-_Max;
output;
end;
run;

Art, CEO, AnalystFinder.com

 

Note: Corrected to drop retained variables

Trusted Advisor
Posts: 1,337

Re: Difference from Baseline Calculation

[ Edited ]

This is an alternative to @art297's suggestion.  It takes advantage of the DIF function (where dif(x) is defined as x-lag(x)), and interspersing a baseline observation between each of the other visits:

 

 

data have;
 informat visit $25.;
 input visit & n Mean SD Med Q1 Q3 Min Max;
 cards;
Baseline Visit           48 33.8 9.6 32.6 26.2 38.1 18.6 59.7
6 Week Follow-up         35 37.8 7.1 38.1 31.2 42.2 22.8 52.5
3 Month Follow-up        25 42.3 8 42.4 37.3 47.8 18.4 56.4
6 Month Follow-up        2 54.5 1.5 54.5 53.5 55.6 53.5 55.6
run;


data want (drop=i);
  set have (where=(visit=:'B'))
      have (where=(visit=:'6 Week') in=in6w)
      have (where=(visit=:'B'))
      have (where=(visit=:'3 M') in=in3m)
      have (where=(visit=:'B'))
      have (where=(visit=:'6 M') in=in6m);
  array x {*} mean--max;
  do i=1 to dim(x);
    x{i}=dif(x{i}); 
  end;
  if in6w or in3m or in6m;
run;

 

The SET statement interleaves 6 subsets of HAVE, where one subset (visit=:'B') is read 3 times, once preceding the visit=:'6 W', once before visit=:'3 M' and once preceding '6 M'.  This means the DIF function, which calculates x-lag(X), will be generating

     X at 6 weeks minus X at baseline

     X at 3 months minus X at baseline
     X at 6 months minus X at baseline

 

The   "where" parameters use the operator  "=:", which says to take the shorter string and compare it only to the start of the longer string.

Ask a Question
Discussion stats
  • 2 replies
  • 196 views
  • 3 likes
  • 3 in conversation