MRM95
Calcite | Level 5

Hello SAS users,

 

I have the following data, for which I am trying to build a specific boxplot.

I am nearly there; I just need to display the difference between the connected means in each panel. Does anyone know how to do this? Many thanks!

 

Data:

data kccq_data;
length Treatment $ 15;
input ID $ KCCQ_var Treatment $ Timepoint;
datalines;
1 15 Implant 1
2 24 Implant 1
3 34 Implant 1
4 44 Implant 1
5 54 Implant 1
6 65 Implant 1
7 73 Implant 1
8 83 Implant 1
9 95.5 Implant 1
10 20 Implant 7
11 31 Implant 7
12 45.5 Implant 7
13 55.5 Implant 7
14 51 Implant 7
15 65.5 Implant 7
16 82 Implant 7
17 82 Implant 7
18 72 Implant 7
19 60 Implant 7
20 20 Implant 7
1 10 Control 1
2 22 Control 1
3 30 Control 1
4 42 Control 1
5 50 Control 1
6 80 Control 1
7 71 Control 1
8 60 Control 1
9 90 Control 1
10 72 Control 1
11 51 Control 7
12 30 Control 7
13 50 Control 7
14 70 Control 7
15 51 Control 7
16 60 Control 7
17 20 Control 7
18 11 Control 7
19 20 Control 7
20 30 Control 7
;
run;

 

Code to produce boxplot so far:

 

PROC SGPANEL DATA = kccq_data;
PANELBY Treatment / columns = 2 novarname;
VBOX KCCQ_var / category = Timepoint GROUP= Treatment connect=mean;
keylegend / title = "Treatment arm";
ROWAXIS label='KCCQ Overall Summary Score';
RUN;

 


 

 

Accepted Solutions
LeliaM
SAS Employee

You can use an INSET statement to display the difference value in the graph. Below is code that shows you how to calculate the difference.

 

data kccq_data;
length Treatment $ 15;
input ID $ KCCQ_var Treatment $ Timepoint;
datalines;
1 15 Implant 1
2 24 Implant 1
3 34 Implant 1
4 44 Implant 1
5 54 Implant 1
6 65 Implant 1
7 73 Implant 1
8 83 Implant 1
9 95.5 Implant 1
10 20 Implant 7
11 31 Implant 7
12 45.5 Implant 7
13 55.5 Implant 7
14 51 Implant 7
15 65.5 Implant 7
16 82 Implant 7
17 82 Implant 7
18 72 Implant 7
19 60 Implant 7
20 20 Implant 7
1 10 Control 1
2 22 Control 1
3 30 Control 1
4 42 Control 1
5 50 Control 1
6 80 Control 1
7 71 Control 1
8 60 Control 1
9 90 Control 1
10 72 Control 1
11 51 Control 7
12 30 Control 7
13 50 Control 7
14 70 Control 7
15 51 Control 7
16 60 Control 7
17 20 Control 7
18 11 Control 7
19 20 Control 7
20 30 Control 7
;
run;

proc sort data=kccq_data out=kccq_data;
  by treatment timepoint;
run;

/* calculate mean for each treatment & timepoint */
proc means data=kccq_data;
  var kccq_var;
  by treatment timepoint;
  output out=mean mean=mean;
run;

proc sort data=mean out=result;
  by treatment;
run;

/* Calculate Difference Between Mean Values For Each Treatment & Timepoint */
/* This code assumes there are two treatments. You can modify this code to make it more dynamic */

data new (keep=diff treatment timepoint);
  label diff='Diff between means is ';
  format diff 5.2;
  set result;
  lag1=lag(mean);
  lag2=lag2(mean);
  if _n_=2 then do;
    diff=lag1-mean;
    output;
  end;
  if _n_=4 then do;
    diff=lag1-mean;
    output;
  end;
run;

data all;
  merge new kccq_data;
  by treatment;
run;

proc sgpanel data = all;
  panelby treatment / columns = 2 novarname;
  vbox kccq_var / category = timepoint group= treatment connect=mean;
  keylegend / title = "Treatment Arm";
  rowaxis label='KCCQ Overall Summary Score';
  inset diff / position=ne;
run;
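
The comment in the code above mentions that the difference step could be made more dynamic. One possible way to do that, as a rough sketch rather than a tested production solution, is to replace the hard-coded _n_=2 and _n_=4 checks with BY-group processing. This assumes kccq_data has been created as above and that each treatment has exactly two timepoints. The names means_by_tp, diffs, kccq_sorted, all_dynamic and first_mean are made up for this illustration. The sign convention matches the code above (earlier timepoint minus later timepoint).

```sas
/* Mean per treatment and timepoint. With CLASS and NWAY the output  */
/* data set is ordered by treatment, then timepoint.                 */
proc means data=kccq_data noprint nway;
  class treatment timepoint;
  var kccq_var;
  output out=means_by_tp mean=mean;
run;

/* One row per treatment: first timepoint mean minus last timepoint mean. */
/* Assumption: exactly two timepoints per treatment.                      */
data diffs (keep=treatment diff);
  set means_by_tp;
  by treatment;
  retain first_mean;
  label diff='Diff between means is ';
  format diff 8.2;
  if first.treatment then first_mean = mean;
  if last.treatment and not first.treatment then do;
    diff = first_mean - mean;
    output;
  end;
run;

proc sort data=kccq_data out=kccq_sorted;
  by treatment;
run;

/* One-to-many merge: diff is carried onto every row of its treatment */
data all_dynamic;
  merge kccq_sorted diffs;
  by treatment;
run;

proc sgpanel data=all_dynamic;
  panelby treatment / columns=2 novarname;
  vbox kccq_var / category=timepoint group=treatment connect=mean;
  keylegend / title="Treatment Arm";
  rowaxis label='KCCQ Overall Summary Score';
  inset diff / position=ne;
run;
```

Because the difference is computed per BY group, this version should not depend on how many treatments there are, only on each treatment having two timepoints.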


2 REPLIES
Ksharp
Super User
/*
You need to calculate it by hand.
The mean_diff values below (100 and 2000) are placeholders.
*/

data kccq_data;
length Treatment $ 15;
infile cards truncover;
input ID $ KCCQ_var Treatment $ Timepoint mean_diff;
label mean_diff='diff between means is ';
datalines;
1 15 Implant 1  100
2 24 Implant 1
3 34 Implant 1
4 44 Implant 1
5 54 Implant 1
6 65 Implant 1
7 73 Implant 1
8 83 Implant 1
9 95.5 Implant 1
10 20 Implant 7
11 31 Implant 7
12 45.5 Implant 7
13 55.5 Implant 7
14 51 Implant 7
15 65.5 Implant 7
16 82 Implant 7
17 82 Implant 7
18 72 Implant 7
19 60 Implant 7
20 20 Implant 7
1 10 Control 1    2000
2 22 Control 1
3 30 Control 1
4 42 Control 1
5 50 Control 1
6 80 Control 1
7 71 Control 1
8 60 Control 1
9 90 Control 1
10 72 Control 1
11 51 Control 7
12 30 Control 7
13 50 Control 7
14 70 Control 7
15 51 Control 7
16 60 Control 7
17 20 Control 7
18 11 Control 7
19 20 Control 7
20 30 Control 7
;
run;


PROC SGPANEL DATA = kccq_data;
PANELBY Treatment / columns = 2 novarname;
VBOX KCCQ_var / category = Timepoint GROUP= Treatment connect=mean;
keylegend / title = "Treatment arm";
ROWAXIS label='KCCQ Overall Summary Score';
inset mean_diff/position=ne;
RUN;
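
If you would rather not type mean_diff by hand, it could also be computed and attached to every row in one PROC SQL step. This is only a sketch under some assumptions: the timepoints are exactly 1 and 7 as in the posted data, the difference is taken as timepoint 1 minus timepoint 7, and the table name kccq_with_diff is made up for this example. The columns are listed explicitly so the query works whether or not kccq_data already contains a hand-typed mean_diff.

```sas
proc sql;
  create table kccq_with_diff as
  select a.id, a.kccq_var, a.treatment, a.timepoint,
         b.mean_diff label='diff between means is ' format=8.2
  from kccq_data as a
  left join
       /* one row per treatment: mean at timepoint 1 minus mean at timepoint 7 */
       (select treatment,
               mean(case when timepoint = 1 then kccq_var end)
             - mean(case when timepoint = 7 then kccq_var end) as mean_diff
        from kccq_data
        group by treatment) as b
  on a.treatment = b.treatment;
quit;

proc sgpanel data=kccq_with_diff;
  panelby treatment / columns=2 novarname;
  vbox kccq_var / category=timepoint group=treatment connect=mean;
  keylegend / title="Treatment arm";
  rowaxis label='KCCQ Overall Summary Score';
  inset mean_diff / position=ne;
run;
```

The CASE expressions return missing for the other timepoint, and MEAN ignores missing values, so each term is the mean of a single timepoint within the treatment.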


 
