ANKH1
Pyrite | Level 9

Hi, I have the following dataset.

data dsin;
input ID$ pair$ type$ time_point bp;
datalines;
1 1 a 3 111
1 1 a 6 134
2 1 b 3 110
2 1 b 9 123
3 2 a 6 131
3 2 a 12 102
3 2 a 18 120
4 2 b 12 121
4 2 b 18 123
;
run;

I need to plot the differences in the variable "bp" between the two IDs that belong to the same pair at these time_points: 3, 6, 9, 12, 15, 18. If neither ID has data at a time_point, the difference for that time_point should be recorded as missing. Likewise, if only one ID of the pair has data at a time_point, the difference should be recorded as missing. This is the desired output:


data dsout;
input ID$ pair$ type$ Diff_3 Diff_6 Diff_9 Diff_12 Diff_15 Diff_18;
datalines;
1 1 a 1 . . . . .
2 1 b 1 . . . . .
3 2 a . . . 19 . 3
4 2 b . . . 19 . 3
;
run;

Can you please help with code that will produce the desired output? I am stuck after transposing the data from long to wide. Thank you.

4 REPLIES
ballardw
Super User

For plotting, long data with a variable indicating which "diff" each row holds would likely be easier to plot in a meaningful way.
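
As a rough illustration of the long layout, here is a minimal sketch. It assumes what the thread clarifies elsewhere: a "pair" is the type 'a' ID and the type 'b' ID that share the same pair value, and the difference is taken as an absolute value. The names diff_long and diff are made up for this example.

```sas
/* Sketch only: assumes dsin from the question, with exactly one
   type 'a' ID and one type 'b' ID per pair. */
proc sql;
  create table diff_long as
  select a.pair,
         a.time_point,
         abs(a.bp - b.bp) as diff   /* absolute a-b difference */
  from dsin(where=(type='a')) as a
       inner join
       dsin(where=(type='b')) as b
    on a.pair = b.pair and a.time_point = b.time_point
  order by a.pair, a.time_point;
quit;

/* One line per pair; markers keep single-point pairs visible. */
proc sgplot data=diff_long;
  series x=time_point y=diff / group=pair markers;
  xaxis values=(3 to 18 by 3);
run;
```

With the sample data, only time points where both IDs have a value produce a row, so time points with missing differences simply do not appear in the plot.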

 

Why don't you show a value for either Diff_3 or Diff_6 for Id=1? It has values at successive time points.

How do you get 19 or 3 for the Diff values shown for Id=3 and 4?

ANKH1
Pyrite | Level 9
Thanks for your message. ID=1 and ID=2 belong to the same pair, and time_point=3 is the only time_point at which both have data. Therefore, the difference is 111 - 110 = 1. The difference at time_point=12 for IDs 3 and 4 is 19 (the absolute value of 102 - 121).
ballardw
Super User

@ANKH1 wrote:
Thanks for your message. ID=1 and ID=2 belong to the same pair, and time_point=3 is the only time_point at which both have data. Therefore, the difference is 111 - 110 = 1. The difference at time_point=12 for IDs 3 and 4 is 19 (the absolute value of 102 - 121).

I submit that you should go back to your original post and include a description of exactly how a "pair" is defined.

And also state that you expect an ABSOLUTE value of the difference.


Since your example includes two or more values for each ID, the natural interpretation of "pair" seems to be "two sequential values in the same ID".

From your example data, how do we know that IDs 2 and 3 are not to be considered a "pair"?

data dsin;
input ID$ pair$ type$ time_point bp;
datalines;
1 1 a 3 111
1 1 a 6 134
2 1 b 3 110
2 1 b 9 123
3 2 a 6 131
3 2 a 12 102
3 2 a 18 120
4 2 b 12 121
4 2 b 18 123
;
run;

 

Ksharp
Super User
data dsin;
input ID$ pair$ type$ time_point bp;
datalines;
1 1 a 3 111
1 1 a 6 134
2 1 b 3 110
2 1 b 9 123
3 2 a 6 131
3 2 a 12 102
3 2 a 18 120
4 2 b 12 121
4 2 b 18 123
;
run;
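/* Match each type 'a' row with the type 'b' row of the same pair and time_point.
   dsin is assumed sorted by pair and time_point within each type, as in the sample.
   The WHERE= on the second copy refers to the renamed variable _type. */
data temp;
 merge dsin(where=(type='a') in=ina)
       dsin(where=(_type='b') rename=(id=_id bp=_bp type=_type) in=inb);
 by pair time_point;
 if ina and inb;      /* keep only time points where both IDs have data */
 diff=abs(_bp-bp);    /* absolute difference, as in the desired output */
run;
/* Write each difference twice: once for the 'a' ID and once for the 'b' ID */
data temp2;
 set temp;
 output;
 id=_id;type=_type;output;
 drop _: bp;
run;
proc sort data=temp2;
 by id pair type;
run;
/* One row per ID, one Diff_ column per matched time_point */
proc transpose data=temp2 out=want(drop=_NAME_) prefix=Diff_;
 by id pair type;
 var diff;
 id time_point;
run;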
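
Because only matched rows are transposed, the result has columns just for time points where both IDs have data (Diff_3, Diff_12 and Diff_18 with the sample data). If all six columns from the desired output are needed, one possible approach is to build a grid of every ID and every required time point first and left join the differences onto it. The following is a sketch under that assumption, not a tested production solution; the names timepoints, grid, pair_diff, diff_full and want_full are made up for this example.

```sas
/* Time points that must appear as columns (taken from the question). */
data timepoints;
  do time_point = 3 to 18 by 3;
    output;
  end;
run;

proc sql;
  /* One row per ID and required time point. */
  create table grid as
  select i.id, i.pair, i.type, t.time_point
  from (select distinct id, pair, type from dsin) as i,
       timepoints as t;

  /* Absolute a-b difference where both IDs of a pair have data. */
  create table pair_diff as
  select a.pair, a.time_point, abs(a.bp - b.bp) as diff
  from dsin(where=(type='a')) as a
       inner join
       dsin(where=(type='b')) as b
    on a.pair = b.pair and a.time_point = b.time_point;

  /* Left join keeps every grid row; unmatched rows get a missing diff. */
  create table diff_full as
  select g.id, g.pair, g.type, g.time_point, d.diff
  from grid as g
       left join pair_diff as d
    on g.pair = d.pair and g.time_point = d.time_point
  order by g.id, g.pair, g.type, g.time_point;
quit;

/* Every ID now has all six time points, so all six Diff_ columns appear. */
proc transpose data=diff_full out=want_full(drop=_name_) prefix=Diff_;
  by id pair type;
  var diff;
  id time_point;
run;
```

The cross join in the first query is intentional, so a log note about a Cartesian product is expected there.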
