echoli
Obsidian | Level 7

Hi All,

 

I have a dataset with 20 variables and 16 subjects. Each subject has two rows, one per timepoint (1 and 2). I want to add a third row for each subject, marked as timepoint 3, whose values are the differences (timepoint 2 - timepoint 1). For example:

 

subject id   timepoint   lab1   lab2   lab3   lab4   lab5
1            1            0.5    0.6     15     18     12
1            2            0.4    0.6     12     20     18
1            3           -0.1    0.0     -3      2      6
2            1            0.9    1.3     14     18     21
2            2            0.3    1.7     19     22     14
2            3           -0.6    0.4      5      4     -7

The timepoint 3 rows are what I want.

 

Any ideas?

 

Thanks all,

Chen

Reeza
Super User
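output;  /* write the original row first */

/* replace each lab value with its change from the previous row */
lab1 = dif(lab1); lab2 = dif(lab2); /* ...etc. for the remaining variables */

/* on a timepoint 2 row the differences are (timepoint 2 - timepoint 1) */
if timepoint = 2 then do;
  timepoint = 3;
  output;
end;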

Explicitly output the records. 

Use DIF to calculate the difference. 
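
For readers who want something complete to run, here is a fuller sketch of the fragment above. It assumes a dataset named HAVE with the variables SUBJECT_ID, TIMEPOINT and LAB1-LAB5 (names taken from the question, not confirmed), and it adds a BY-group check, which is not part of the original answer, so that a subject with no timepoint 1 row does not pick up the previous subject's values:

```sas
/* Sketch only: HAVE, HAVE_SORTED, SUBJECT_ID and LAB1-LAB5 are assumed names */
proc sort data=have out=have_sorted;
  by subject_id timepoint;
run;

data want;
  set have_sorted;
  by subject_id;

  output;                     /* keep the original row */

  /* DIF runs on every row so that each queue always holds the previous row */
  lab1 = dif(lab1);
  lab2 = dif(lab2);
  lab3 = dif(lab3);
  lab4 = dif(lab4);
  lab5 = dif(lab5);

  /* write the difference row only when a timepoint 1 row came first */
  if timepoint = 2 and not first.subject_id then do;
    timepoint = 3;
    output;
  end;
run;
```

Each DIF call is written out separately on purpose: every occurrence of DIF in the code keeps its own queue, so one call per variable keeps the variables from mixing.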

kryden
Calcite | Level 5

Can this technique be tweaked to always calculate the dif from the first obs rather than the preceding obs?

Reeza
Super User

@kryden wrote:

Can this technique be tweaked to always calculate the dif from the first obs rather than the preceding obs?


Which technique?

 

Assuming you mean mine: not quite, but there are easier ways to calculate the difference from the first observation.

Rather than using the DIF() function, you can use the RETAIN statement to hold a value across rows. Set it on the first observation (or on the FIRST. record of each BY group) and subtract it from each row. The code below is untested and may not handle the first record correctly.

 

retain first_obs;

if first.id then first_obs = value;

dif = value - first_obs;
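
Placed inside a complete step, those three statements might look like the sketch below. HAVE, ID and VALUE are placeholder names, DIFF_FROM_FIRST is a hypothetical output variable, and the data is assumed to be already sorted by ID with the baseline row first in each group:

```sas
/* Sketch only: HAVE, ID, VALUE and DIFF_FROM_FIRST are placeholder names */
data want;
  set have;                 /* assumed sorted by id, baseline row first */
  by id;                    /* makes first.id available */

  retain first_obs;         /* keep the baseline value across rows */
  if first.id then first_obs = value;

  diff_from_first = value - first_obs;   /* 0 on the baseline row itself */
run;
```

Because FIRST_OBS is reset at the start of every ID, each subject is compared against its own first row rather than against the first row of the dataset.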

kryden
Calcite | Level 5

 

data phys_diff;
	set phys1;
	AGG_PHYS_Mean = dif(AGG_PHYS_Mean);
	AGG_PHYS_StdDev = dif(AGG_PHYS_StdDev);
	AGG_PHYS_Median = dif(AGG_PHYS_Median);
	AGG_PHYS_Q1 = dif(AGG_PHYS_Q1);
	AGG_PHYS_Q3 = dif(AGG_PHYS_Q3);
	AGG_PHYS_Min = dif(AGG_PHYS_Min);
	AGG_PHYS_Max = dif(AGG_PHYS_Max);

run;

Above is what I started with.

 

 

Works great except the new values are the differences from the immediately preceding obs.

How do I adapt what you just wrote to calc the difference from baseline for each var?  Not represented here is that this is done by visit.

Reeza
Super User

You replicate the code I have for each variable. You can list multiple variables in the RETAIN statement, but the assignment statements have to happen for each group. I'm assuming you're also doing that difference across patients or groups? I.e., you need to determine the baseline for multiple groups?
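
As a rough illustration, here is an array version of the same idea, so the assignments are not typed once per variable. PHYS1 and the AGG_PHYS_ variables come from the post above; GROUP_ID is a hypothetical BY variable standing in for whatever defines a group, and PHYS1 is assumed to be sorted by it with the baseline row first:

```sas
/* Sketch only: GROUP_ID is a hypothetical grouping variable */
data phys_diff;
  set phys1;                /* assumed sorted by group_id, baseline row first */
  by group_id;

  array vals(*) AGG_PHYS_Mean AGG_PHYS_StdDev AGG_PHYS_Median
                AGG_PHYS_Q1 AGG_PHYS_Q3 AGG_PHYS_Min AGG_PHYS_Max;
  array base(7) _temporary_;   /* temporary arrays keep their values across rows */

  do i = 1 to dim(vals);
    if first.group_id then base(i) = vals(i);   /* store the baseline */
    vals(i) = vals(i) - base(i);                /* difference from baseline */
  end;

  drop i;
run;
```

The temporary array plays the role of the RETAIN statement here: its elements are kept from one row to the next and never written to the output dataset.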

 

You should post this as a new question with sample data. 

This is probably what you want, it's a different way but probably just as quick:

https://communities.sas.com/t5/Base-SAS-Programming/Calculate-a-difference-from-quot-baseline-quot-d...

kryden
Calcite | Level 5
Will post as new question.



Thanks for the help!


kryden
Calcite | Level 5
Link to New topic
 
RW9
Diamond | Level 26

Post test data in the form of a datastep!!

 

As such, this is only theory:

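data inter;
  /* rename the timepoint 1 values so the timepoint 2 values keep their names */
  merge have (where=(timepoint=1) rename=(lab1=lbx1 lab2=lbx2...))
        have (where=(timepoint=2));
  by subject_id;
  timepoint=3;
  lab1=lab1-lbx1;   /* timepoint 2 - timepoint 1 */
  lab2=lab2-lbx2;
  ...;
  drop lbx:;        /* remove the renamed timepoint 1 columns */
run;

data want;
  set have inter;
run;

proc sort data=want;
  by subject_id timepoint;
run;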
art297
Opal | Level 21

Same solution as @Reeza but, to save keystrokes (and reduce the chance of making a typo), I'd include an array:

 

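data want;
  input subject_id timepoint lab1-lab5;
  array labs(*) lab1-lab5;
  /* DIF keeps one queue per call in the code, so calling it inside a loop
     would mix the variables; hold the timepoint 1 values in an array instead */
  array prev(5) _temporary_;
  output;

  if timepoint = 2 then do;
    timepoint = 3;
    do i = 1 to dim(labs);
      labs(i) = labs(i) - prev(i);   /* timepoint 2 - timepoint 1 */
    end;
    output;
  end;
  else do i = 1 to dim(labs);
    prev(i) = labs(i);               /* remember the timepoint 1 values */
  end;
  drop i;
  cards;
1  1   0.5    0.6    15     18   12
1  2   0.4    0.6    12     20   18
2  1   0.9    1.3    14     18   21
2  2   0.3    1.7    19     22   14
;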

Art, CEO, AnalystFinder.com

 
