Hi folks and @ballardw ,
I'm trying to convert DATA HAVE to DESIRED OUTPUT table shown in the image below.
The major data re-structuring have been resolved previously here:
Now, I'd like to present statistically significant direct or inverse effects using orange (any warm color) and green, using a logic:
IF LCL >1 AND UCL >1 THEN CELL COLOR=ORANGE; ELSE
IF LCL <1 AND UCL <1 THEN CELL COLOR=GREEN;
ELSE CELL COLOR=TRANSPARENT;
It would be great if there's a way to output the final colored table to Excel sheet. Is that possible? My actual data is huge and I've been coloring cells after copy paste to excel. I caught myself making mistakes in the process of painting the cells from eye-balling. Not only was it tiring but was inducing human errors.
I'll highly appreciate your time and suggestions.
DATA HAVE;
LENGTH PARAMETER $10.;
INPUT PARAMETER $ SITES $ Level1 RR LCL UCL;
length col $ 5 row $ 20;
retain col ;
If parameter in ('PM25' 'NO2') then col=parameter;
if level1=. then row='Adjusted RR';
else row= catx(' ','Smoking Level',level1);
CARDS;
PM25 LIVER . 1.02 1.01 1.02
SMOKE_CAT LIVER 2 0.86 0.79 0.93
SMOKE_CAT LIVER 3 1.21 1.08 1.34
SMOKE_CAT LIVER 4 1.32 1.10 1.54
SMOKE_CAT LIVER 5 1.25 1.07 1.43
NO2 LIVER . 1.41 1.14 1.68
SMOKE_CAT LIVER 2 0.81 0.63 0.99
SMOKE_CAT LIVER 3 1.60 1.23 1.97
SMOKE_CAT LIVER 4 0.59 0.63 0.54
SMOKE_CAT LIVER 5 0.98 0.96 0.99
PM25 STOMACH . 1.88 1.35 2.40
SMOKE_CAT STOMACH 2 0.34 0.23 0.45
SMOKE_CAT STOMACH 3 2.06 1.44 2.68
SMOKE_CAT STOMACH 4 0.98 0.96 0.99
SMOKE_CAT STOMACH 5 2.25 1.53 2.97
NO2 STOMACH . 2.34 1.57 3.11
SMOKE_CAT STOMACH 2 2.43 1.61 3.25
SMOKE_CAT STOMACH 3 0.58 0.56 0.59
SMOKE_CAT STOMACH 4 2.62 1.70 3.54
SMOKE_CAT STOMACH 5 2.71 1.74 3.68
;
run;
proc report data=have;
column sites row col,(rr lcl ucl);
define sites /group;
define row /group;
define col /across order=data "";
run;
Hi:
Assuming you wanted all 3 values under each across group to have the same background based on your logic, then this is possible using PROC REPORT and ODS EXCEL:
I like to test all conditions, so I tested both 1 and missing for the values of LCL and UCL to make sure the coloring came out right. This entailed changing the data slightly.
Then the top part of the program was mostly the same:
Except for making a "dummycalc" variable on the report row so I could just have 1 compute block in which to change the cell colors on each report row.
Then the bottom half of the program had the big compute block. I used Absolute Column Numbers and CALL DEFINE statements to change the background:
The helper variable SVARP was used for the style override in the CALL DEFINE for the PM25 related columns and the SVARN helper variable was used for the style override in the CALL DEFINE for the NO2 related columns.
Hope this helps,
Cynthia
Hi:
If you ONLY wanted to do trafficlighting based on each cell's value, then you could do that with a User-defined format. But to do traffic-lighting based on 2 variable values (such as you show in your logic, where both LCL and UCL come into play), will require a CALL DEFINE statement. See this paper: http://support.sas.com/resources/papers/proceedings14/SAS388-2014.pdf starting on pages 10 and 11 with Output 10. That example had a user-defined format AND a CALL DEFINE statement. I think you'll just need a CALL DEFINE.
That should get you started. Since you have LCL and UCL under each ACROSS item, you'll have to use Absolute Column names in your CALL DEFINE, so the explanation of Absolute Columns with PROC REPORT will be useful to you.
What's not clear to me is whether your posted logic applies to all 3 variables. You posted this:
IF LCL >1 AND UCL >1 THEN CELL COLOR=ORANGE; ELSE
IF LCL <1 AND UCL <1 THEN CELL COLOR=GREEN;
ELSE CELL COLOR=TRANSPARENT;
but what do you mean by cell color transparent??? a white background??? Your logic applies to all 3 variables, RR, UCL and LCL under PM25 and under NO2 -- is that correct?
Cynthia
Hi:
Assuming you wanted all 3 values under each across group to have the same background based on your logic, then this is possible using PROC REPORT and ODS EXCEL:
I like to test all conditions, so I tested both 1 and missing for the values of LCL and UCL to make sure the coloring came out right. This entailed changing the data slightly.
Then the top part of the program was mostly the same:
Except for making a "dummycalc" variable on the report row so I could just have 1 compute block in which to change the cell colors on each report row.
Then the bottom half of the program had the big compute block. I used Absolute Column Numbers and CALL DEFINE statements to change the background:
The helper variable SVARP was used for the style override in the CALL DEFINE for the PM25 related columns and the SVARN helper variable was used for the style override in the CALL DEFINE for the NO2 related columns.
Hope this helps,
Cynthia
Thank you very much for your help. This produced very beautiful and neat table! Took my breath away!
My actual data has 8 more items in addition to PM25 and NO2. In other words I have 10 'col' including PM25 and NO2 and 40 'sites'. Since I have to define triple _c_ for each items, I will have _c30_, _31_, _c32_ for the last 'col'. Am I getting it right?
There is absolutely no shortcut to that? Easier just to show, sometimes. Please see the image attached. This shows how my data looked like before I bring in SMOKE_CAT.
I also noticed that tablekept its colors when copy and paste to excel. Nice!
Hi:
Here's an example of a macro program to generate the IF statements you'd need inside a COMPUTE block. It is important to realize that in this instance, the macro program is only "typing" code or generating text. The macro program is not executing anything. If you have 2 ACROSS variable values, then as long as you have generated &NUM correctly, you'll get the IF statements you need for 2 ACROSS values. If you have 9 ACROSS variable values, then again, if &NUM=9 and you specify the parameters correctly, then the macro program would type the IFs for 9 ACROSS values.
Understanding the logic that PROC REPORT needs and understanding how a macro program works will be necessary to make this work.
Hope it helps,
Cynthia
** 1) use same program as posted to get work.have data;
** 2) get unique values for COL (PM25, NO2) into a dataset;
proc sort data=have(keep=col) out=unique nodupkey;
by _all_;
run;
** 3) make a macro variable &NUM to hold the number of unique columns;
** could do this step with SQL too;
DATA _NULL_;
SET unique end=eof;
BY col ;
IF FIRST.col THEN count+1;
if eof then CALL SYMPUT('num',TRIM(LEFT(count)));
RUN;
%put Number of unique values for across variable is &=num;
** 4) Define macro program to generate IF statements;
/* startcol= is the starting absolute number based on the variables on the
COLUMN statement. If COLUMN statement has
column var1 var2 Avar3,(uvar4 uvar5);
Then startcol=3 and under=2 and vars_on_left=2
under= number of variables under ACROSS (what is inside the parentheses)
vars_on_left= number of variables to the left of the ACROSS variable
Also need &num, which is created OUTSIDE the macro program to hold
the number of UNIQUE values for the ACROSS variable.
*/
%macro makeif(startcol=3, under=3, vars_on_left=2);
** this macro code will only work when invoked from inside a PROC REPORT step;
** must have working code before you can generate this IF for;
** multiple ACROSS values;
%DO i=&startcol %TO %EVAL(&num*&under+&vars_on_left) %BY &under;
length svar&i $100;
IF (_C%eval(&i+1)_ >1 AND _C%eval(&i+2)_ >1) THEN do;
svar&i = 'style={background=peachpuff}';
END;
ELSE IF _C%eval(&i+1)_ <1 AND _C%eval(&i+2)_ <1 THEN do;
svar&i = 'style={background=lightgreen}';
END;
ELSE DO;
svar&i = 'style={background=white}';
END;
call define("_C%eval(&i)_",'style',svar&i);
call define("_C%eval(&i+1)_",'style',svar&i);
call define("_C%eval(&i+2)_",'style',svar&i);
%end;
%mend makeif;
** 5) Have main PROC REPORT code which invokes macro program to generate;
** IF statements for the CALL DEFINES;
options mprint nosymbolgen;
ods html path='c:\temp' file='tlight.html';
ods excel file='c:\temp\tlight.xlsx';
proc report data=have;
column sites row col,(rr lcl ucl) dummycalc;
define sites /group;
define row /group;
define col /across order=data "";
define dummycalc / noprint;
define rr / analysis;
define lcl / analysis;
define ucl / analysis;
compute dummycalc;
** want dummy variable at the end of a row in order to be able to;
** "touch" all the columns on the report row in one compute block;
** instead of multiple compute blocks;
dummycalc = 1;
%makeif(startcol=3, under=3, vars_on_left=2);
endcomp;
run;
ods excel close;
ods html close;
Thanks for following up Cynthia. I'll have to learn the macro you're tutoring here. I modified your original approach to my actual data which worked out neatly. See posted code below.
Because I'm adjusting to smoking, gender seems to play an important role. Therefore, now I'd like to bring in gender into my Proc Report to have a final output table look like following: (the data is nonsensical and used to fill the cells and colors too)
Using the rules:
1. Define color choice based on the same rule using RR, UCL and LCL;
2. Use only colored RR
3. Stratify RR by gender where each RR has male and female by exposure groups and cancer types. In the table RR is all gender combined. M=Male and F=female columns are gender specific RRs.
Cynthia, do you think this is doable with proc report? How much tweak work would it require? Doing it manually in excel would introduce a heap of human error for 42 cancer types and 12 exposures now by gender and smoke categories.
proc report data=ACTUAL_DATA;
column CANCER_TYPE row altsite col,(rr lcl ucl) dummycalc;
define CANCER_TYPE/ group noprint;
define row/group;
define col /across order=data "";
define altsite/ 'Cancer Sites' group;
define dummycalc /noprint;
define rr/analysis;
define lcl/analysis;
define ucl/analysis;
compute dummycalc;
length svarp svarn svara svarb svarc svard svare svarf svarg svarh svari svarj $100;
dummycalc=1;
IF (_c5_>1 AND _c6_>1) THEN do; svarp='style={background=peachpuff}'; END;
ELSE IF _c5_<1 AND _c6_<1 THEN do; svarp='style={background=lightgreen}'; END;
ELSE DO; svarp='style={background=white}'; END;
IF (_c8_>1 AND _c9_>1) THEN do; svarn='style={background=peachpuff}'; END;
ELSE IF _c8_<1 AND _c9_<1 THEN do; svarn='style={background=lightgreen}'; END;
ELSE DO; svarn='style={background=white}'; END;
IF (_c11_>1 AND _c12_>1) THEN DO; svara='style={background=peachpuff}'; END;
ELSE IF (_c11_<1 AND _c12_<1) THEN DO; svara='style={background=lightgreen}'; END;
ELSE DO; svara='style={background=white}'; END;
IF (_c14_>1 AND _c15_>1) THEN DO; svarb='style={background=peachpuff}'; END;
ELSE IF (_c14_<1 AND _c15_<1) THEN DO; svarb='style={background=lightgreen}'; END;
ELSE DO; svarb='style={background=white}'; END;
IF (_c17_>1 AND _c18_>1) THEN DO; svarc='style={background=peachpuff}'; END;
ELSE IF (_c17_<1 AND _c18_<1) THEN DO; svarc='style={background=lightgreen}'; END;
ELSE DO; svarc='style={background=white}'; END;
IF (_c20_>1 AND _c21_>1) THEN DO; svard='style={background=peachpuff}'; END;
ELSE IF (_c20_<1 AND _c21_<1) THEN DO; svard='style={background=lightgreen}'; END;
ELSE DO; svard='style={background=white}'; END;
IF (_c23_>1 AND _c24_>1) THEN DO; svare='style={background=peachpuff}'; END;
ELSE IF (_c23_<1 AND _c24_<1) THEN DO; svare='style={background=lightgreen}'; END;
ELSE DO; svare='style={background=white}'; END;
IF (_c26_>1 AND _c27_>1) THEN DO; svarf='style={background=peachpuff}'; END;
ELSE IF (_c26_<1 AND _c27_<1) THEN DO; svarf='style={background=lightgreen}'; END;
ELSE DO; svarf='style={background=white}'; END;
IF (_c29_>1 AND _c30_>1) THEN DO; svarg='style={background=peachpuff}'; END;
ELSE IF (_c29_<1 AND _c30_<1) THEN DO; svarg='style={background=lightgreen}'; END;
ELSE DO; svarg='style={background=white}'; END;
IF (_c32_>1 AND _c33_>1) THEN DO; svarh='style={background=peachpuff}'; END;
ELSE IF (_c32_<1 AND _c33_<1) THEN DO; svarh='style={background=lightgreen}'; END;
ELSE DO; svarh='style={background=white}'; END;
IF (_c35_>1 AND _c36_>1) THEN DO; svari='style={background=peachpuff}'; END;
ELSE IF (_c35_<1 AND _c36_<1) THEN DO; svari='style={background=lightgreen}'; END;
ELSE DO; svari='style={background=white}'; END;
IF (_c38_>1 AND _c39_>1) THEN DO; svarj='style={background=peachpuff}'; END;
ELSE IF (_c38_<1 AND _c39_<1) THEN DO; svarj='style={background=lightgreen}'; END;
ELSE DO; svarj='style={background=white}'; END;
call define('_c4_','style',svarp); call define('_c5_','style',svarp); call define('_c6_','style',svarp);
call define('_c7_','style',svarn); call define('_c8_','style',svarn); call define('_c9_','style',svarn);
call define('_c10_','style',svara); call define('_c11_','style',svara); call define('_c12_','style',svara);
call define('_c13_','style',svarb); call define('_c14_','style',svarb); call define('_c15_','style',svarb);
call define('_c16_','style',svarc); call define('_c17_','style',svarc); call define('_c18_','style',svarc);
call define('_c19_','style',svard); call define('_c20_','style',svard); call define('_c21_','style',svard);
call define('_c22_','style',svare); call define('_c23_','style',svare); call define('_c24_','style',svare);
call define('_c25_','style',svarf); call define('_c26_','style',svarf); call define('_c27_','style',svarf);
call define('_c28_','style',svarg); call define('_c29_','style',svarg); call define('_c30_','style',svarg);
call define('_c31_','style',svarh); call define('_c32_','style',svarh); call define('_c33_','style',svarh);
call define('_c34_','style',svari); call define('_c35_','style',svari); call define('_c36_','style',svari);
call define('_c37_','style',svarj); call define('_c38_','style',svarj); call define('_c39_','style',svarj);
endcomp;
run;
Hi: do-able is good news.
I would not use macro until I get the table right. The code posted appear lengthy but it didn't take me long with notepad+++ and being consistent with the underlying logic.
Good catch. Sorry about the error in the screenshot. It's not even data yet but my imaginary view I created in Excel how I want final output look like.I updated the screenshot simply taking out ozone for now.
My only warning is that you are only showing RR (for all Genders) and RR for each gender in that table, but as I understand your logic, you still need to test the LCL and UCL under each ACROSS value. You MUST have LCL and UCL on the COLUMN statement to use them the way you envision. So it looks to me like you'll have 5 values under every ACROSS:
ACROSS_VAR,(RR_OV LCL UCL RR_Fem RR_Male)
And to use LCL and UCL, but not see them on the report, you can define them as NOPRINT -- but -- they WILL each get an absolute column number under EACH ACROSS group ... even with NOPRINT. So the logic that works for 3 variables under will NOT work for 5 variables under. I would definitely make up some fake data and see how the absolute column numbers are going to change.
The OUT= option of PROC REPORT will show you how the absolute column numbers are assigned, even with NOPRINT.
Cynthia
Here's some code to show you what I mean. Using SASHELP.PRDSALE to make some fake data to start with:
data prdsale;
set sashelp.prdsale;
where country in ('CANADA' 'GERMANY');
rr_ov = round(actual / 173,.01);
rr_f = round(rr_ov * .5,.01);
rr_m = round(rr_ov * .5,.01);
run;
proc report data=prdsale;
title '1) PROC REPORT showing all 5 under the ACROSS';
column country region prodtype,(actual predict rr_ov rr_f rr_m);
define country / group;
define region / group;
define prodtype / across;
define actual / analysis;
define predict / analysis;
define rr_ov / analysis;
define rr_f / analysis;
define rr_m / analysis;
run;
proc report data=prdsale out=see_abs_col;
title '2) PROC REPORT with NOPRINT and OUT= to see abs col nums';
column country region prodtype,(actual predict rr_ov rr_f rr_m);
define country / group;
define region / group;
define prodtype / across;
define actual / analysis noprint;
define predict / analysis noprint;
define rr_ov / analysis;
define rr_f / analysis;
define rr_m / analysis;
run;
proc print data=see_abs_col;
title '3) PROC PRINT shows absolute column numbers';
run;
Let's say that you wanted to use Actual and Predict for each ACROSS value to determine colors for RR_OV, RR_F and RR_M. Then Actual and Predict would need to be on the column statement. Here's the output for the 3 steps. As you can see, I have 5 absolute column numbers for each item under the ACROSS, whether NOPRINT or visible:
Hi @Cynthia_sas ,
Thanks for demo. Is it possible to show your last demo on my mock data? I confused myself when I try to follow your example.
/*CREATE DATA*/
DATA HAVE1;
LENGTH PARAMETER $10.;
INPUT PARAMETER $ SITES $ SEX Level1 RR LCL UCL;
CARDS;
PM25 LIVER 1 . 1.02 1.01 1.02
SMOKE_CAT LIVER 1 2 0.86 0.79 0.93
SMOKE_CAT LIVER 1 3 1.21 1.08 1.34
SMOKE_CAT LIVER 1 4 1.32 1.1 1.54
SMOKE_CAT LIVER 1 5 1.25 1.07 1.43
NO2 LIVER 1 . 1.41 1.14 1.68
SMOKE_CAT LIVER 1 2 0.81 0.63 0.99
SMOKE_CAT LIVER 1 3 1.6 1.23 1.97
SMOKE_CAT LIVER 1 4 0.59 0.63 0.54
SMOKE_CAT LIVER 1 5 0.98 0.96 0.99
PM25 LIVER 2 . 1.02 1.01 1.02
SMOKE_CAT LIVER 2 2 0.86 0.79 0.93
SMOKE_CAT LIVER 2 3 1.21 1.08 1.34
SMOKE_CAT LIVER 2 4 1.32 1.1 1.54
SMOKE_CAT LIVER 2 5 1.25 1.07 1.43
NO2 LIVER 2 . 1.41 1.14 1.68
SMOKE_CAT LIVER 2 2 0.81 0.63 0.99
SMOKE_CAT LIVER 2 3 1.6 1.23 1.97
SMOKE_CAT LIVER 2 4 0.59 0.63 0.54
SMOKE_CAT LIVER 2 5 0.98 0.96 0.99
PM25 STOMACH 1 . 1.88 1.35 2.4
SMOKE_CAT STOMACH 1 2 0.34 0.23 0.45
SMOKE_CAT STOMACH 1 3 2.06 1.44 2.68
SMOKE_CAT STOMACH 1 4 0.98 0.96 0.99
SMOKE_CAT STOMACH 1 5 2.25 1.53 2.97
NO2 STOMACH 1 . 2.34 1.57 3.11
SMOKE_CAT STOMACH 1 2 2.43 1.61 3.25
SMOKE_CAT STOMACH 1 3 0.58 0.56 0.59
SMOKE_CAT STOMACH 1 4 2.62 1.7 3.54
SMOKE_CAT STOMACH 1 5 2.71 1.74 3.68
PM25 STOMACH 2 . 1.88 1.35 2.4
SMOKE_CAT STOMACH 2 2 0.34 0.23 0.45
SMOKE_CAT STOMACH 2 3 2.06 1.44 2.68
SMOKE_CAT STOMACH 2 4 0.98 0.96 0.99
SMOKE_CAT STOMACH 2 5 2.25 1.53 2.97
NO2 STOMACH 2 . 2.34 1.57 3.11
SMOKE_CAT STOMACH 2 2 2.43 1.61 3.25
SMOKE_CAT STOMACH 2 3 0.58 0.56 0.59
SMOKE_CAT STOMACH 2 4 2.62 1.7 3.54
SMOKE_CAT STOMACH 2 5 2.71 1.74 3.68
;
RUN;
Desired output is still:
Hi:
If you notice in my program, I had a numeric variable for RR_OV (RR Overall), another separate numeric variable for RR_F and RR_M. I only see one RR value in your changed data and you have values of 1 and 2 for the SEX variable. How do you know which one is Male and which one is female? How do you envision getting 3 columns into your output -- you only have 1 RR value for Sex=1 and 1 RR value for Sex=2. How do you plan to get the overall RR value that you show in your output?
So in my program, I had these variables: ACTUAL, PREDICT, RR_OV, RR_F and RR_M where the "RR" variables were just fake data. But still to use my program, your data has to be structured differently than you show it now.
I've indicated which variables on my test would be the equivalent of your existing variables. As you can see, LCL and UCL might go where I show ACTUAL and PREDICT. Then your RR might be placed where I have RR_OV (which was fake data). But my values for RR_F and RR_M have no equivalent in your HAVE1 data.
My program could not be used with your HAVE1 data. In your data, if you try to put RR under SEX, as shown in the COLUMN statement, this is what you would get:
Note that there is no RR Overall, as you envision and only numbers 1 and 2 for Sex . And, although you could clean up the headers for 'RR' and 'Sex', and use a format to get values for the 1 and 2, you still would not have an RR overall, as shown below:
You have to get the right data structure for your report BEFORE you jump into trafficlighting with CALL DEFINE. The structure of the variables in the COLUMN statement is going to impact the absolute column numbers. Right now, you only have 4 under each ACROSS value. I don't see how you are going to derive the overall RR from the values you have for Sex=1 and Sex=2, which I assume are Male and Female.
I'd recommend that you work on understanding PROC REPORT and getting your data structured so that the report values are correct before you focus on the color coding. Color coding is possible, IF the data is structured correctly for the report you want. But, right now, your HAVE1 data will not get you what you envision.
Cynthia
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.