BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sasuser_joe
Fluorite | Level 6

I have a survey data which includes these fields: ID, answer, previous_answer.  I need the report highlighting the row (or just the 'ID' column) only if 'answer' is different from 'previous_answer'.  it works if I just use the 'answer' as 'display' as below: 

proc report data=mysurvey missing;
column n=cntid answer previous_answer id;
define cntid/noprint;
define id/display;
define answer/display noprint;
define previous_answer/display noprint;
compute id;
	if answer ^= previous_answer then 
			call define(_col_,'STYLE','STYLE=[backgroundcolor=lightred font_weight=bold]'); 
endcomp;
Compute after / style=[backgroundcolor=lightblue font_weight=bold];
	line "Grand Total: " cntid 8.;
endcomp;
run;

However, as I need add a 'subtotal' before each 'answer' group, I have to use the 'answer' as an 'order' variable, then the highlight doesn't work:

proc report data=mysurvey missing;
column n=cntid answer previous_answer id;
define cntid/noprint;
define id/display;
define answer/order noprint;
define previous_answer/display noprint;
compute id;
	if answer ^= previous_answer then 
			call define(_col_,'STYLE','STYLE=[backgroundcolor=lightred font_weight=bold]'); 
endcomp;
Compute before answer / style=[font_weight=bold];
	if answer=' ' then label_ = '(blank): '; else label_ = strip(answer)||': ';
	line label_ $10. cntid 8.;
endcomp;
Compute after / style=[backgroundcolor=lightblue font_weight=bold];
	line "Grand Total: " cntid 8.;
endcomp;
run;

I attached a dataset I am using for this report. the desired report should only have 'A27' highlighted as it is the only row meets the condition.

Thank you in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
Cynthia_sas
Diamond | Level 26

Hi:

  The reason is clear when you realize the difference between DISPLAY and ORDER. Consider SASHELP.CLASS -- a list of 19 students of various ages. Compare these 2 reports:

Cynthia_sas_0-1608345642623.png

 

Now, look what happens when I use SPANROWS to remove the cell boundaries inside the ORDER variable:

Cynthia_sas_1-1608345754994.png

  So that's why you can't directly compare your ANSWER to the previous answer inside a COMPUTE block once ANSWER becomes used as an ORDER variable.

  There are a few different ways to do it using helper variables. In report #2, the helper variable is created in a data step before the PROC REPORT step -- this is usually my initial method for showing PROC REPORT beginners because it is clear that the new helper variable ANSGRP is just a straight copy of ANSWER -- so I have a separate variable that can be used for ordering. That's because you can't use the same variable with 2 different usages. Then Report #3 is also using a helper variable, but a temporary helper variable whose value is automatically retained and the helper variable does NOT appear on the COLUMN statement. It is a memory-only temporary variable used during the step. Since people can't really "see" the variable value, they have to take on faith that HOLDANS will contain the saved value of ANSWER to be used within the PROC REPORT COMPUTE block.

  Here's are the 2 other approaches based on just a few changes to your original code:


data work.mysurvey2;
  set work.mysurvey;
  ansgrp = answer; /* used for example #2 */
run;
  
proc report data=mysurvey2 missing;
title '1) usage display';
column  answer previous_answer id n;
define answer/display  ;
define previous_answer/display ;
define id/display;
define n / 'Cntid';
compute id;
	if answer ^= previous_answer then 
			call define(_col_,'STYLE','STYLE=[backgroundcolor=lightred font_weight=bold]'); 
endcomp;
Compute after / style=[backgroundcolor=lightblue font_weight=bold];
	line "Grand Total: " n 8.;
endcomp;
run;
  
proc report data=mysurvey2 missing;
title '2) MAKE helper variable ANSGRP to be ORDER for breaking';
column ansgrp answer previous_answer id n;
define ansgrp / order;
define answer/display ;
define previous_answer/display ;
define id / display;
define n /'cntid' ;
Compute before ansgrp / style=[font_weight=bold];
	if ansgrp=' ' then label_ = '(blank): '; 
    else label_ = strip(ansgrp)||': ';
	line label_ $10. n 8.;
endcomp;
compute id;
	if answer ^= previous_answer then 
		call define(_col_,'STYLE','STYLE=[backgroundcolor=lightred font_weight=bold]'); 
endcomp;
Compute after / style=[backgroundcolor=lightblue font_weight=bold];
	line "Grand Total: " n 8.;
endcomp;
run;
  
proc report data=mysurvey missing;
title '3) alternative making temporary helper variable HOLDANS';
column answer previous_answer id n;
define answer/order ;
define previous_answer/display ;
define id / display;
define n /'cntid' ;
Compute before answer / style=[font_weight=bold];
    if upcase(_break_) = 'ANSWER' then holdans = answer;
	if answer=' ' then label_ = '(blank): '; 
    else label_ = strip(answer)||': ';
	line label_ $10. n 8.;
endcomp;
compute id;
	if holdans ^= previous_answer then 
	  call define(_col_,'STYLE','STYLE=[backgroundcolor=lightred font_weight=bold]'); 
endcomp;
Compute after / style=[backgroundcolor=lightblue font_weight=bold];
	line "Grand Total: " n 8.;
endcomp;
run;

  I like to turn off all the NOPRINT until I am sure that everything on the report is the way I want and then I put them back on. So that's why there are not any NOPRINT options in the code. The other thing that makes sense to me is to have the N statistics at the end of report row if it's going to be NOPRINT. It doesn't matter where you're placing it because you only need it on the break line. And it didn't need to be aliased.

  I am also somewhat finicky about the order of the DEFINEs and the order of the COMPUTE blocks. I like my DEFINE order to match my COLUMN order it just makes more sense to me that way. Notice that the temporary helper variable in #3, HOLDANS is what you use instead of ANSWER in the COMPUTE block for ID. Since ANSWER is the ORDER variable in #3, you have to use the HELPER variable in the COMPUTE block.

  But, in #2 because the helper variable ANSGRP exists in the DATA set and since you are breaking on ANSGRP and not ANSWER, then you can use ANSWER to compare in the COMPUTE block.

  Hope this helps clarify about PROC REPORT and how to do what you want.

Cynthia

View solution in original post

6 REPLIES 6
PhilC
Rhodochrosite | Level 12

I admit, this is a cheap solve, since I completely avoided learning the inner workings of PROC REPORT, and why it behaves as described.  Personally, I would like to know why this is.

 

Create a new dataset?  Or a view of the data...

 

Option VALIDVARNAME=ANY;

PROC SQL;
   CREATE view WORK.SQLView_FOR_MYSURVEY AS 
   SELECT id, 
          /* ans ^= prev_ans */
            (answer ^= previous_answer) AS 'ans ^= prev_ans'n, 
          answer
      FROM WORK2.MYSURVEY;
QUIT;

proc report data=&sysLast missing;
  column n=cntid answer 'ans ^= prev_ans'n id;   *< Changed;
  define cntid/noprint;
  define id/display;
  define answer/order noprint;
  define 'ans ^= prev_ans'n/display noprint;     *< Changed;
  Compute before answer / style=[font_weight=bold];
  	if answer=' ' then label_ = '(blank): '; else label_ = strip(answer)||': ';
  	line label_ $10. cntid 8.;
  endcomp;
  compute id;
  	if 'ans ^= prev_ans'n = 1 then              /*< Changed;*/
  			call define(_col_,'STYLE','STYLE=[backgroundcolor=lightred font_weight=bold]'); 
  endcomp;
  Compute after / style=[backgroundcolor=lightblue font_weight=bold];
  	line "Grand Total: " cntid 8.;
  endcomp;
run;

 

 

Cynthia_sas
Diamond | Level 26
Hi:
Actually, using a helper variable, one way or another is the workaround you have to use to do this with PROC REPORT. There are a few different methods that are possible. When I have the chance, I'll post some alternatives.
Cynthia
sasuser_joe
Fluorite | Level 6
Thank you, Cynthia. I believe there is a way to do it with only the report procedure, so looking forward to your solutions. Joe
sasuser_joe
Fluorite | Level 6
Thank you, PhilC. Your solution actually works. Just have the the same wondering as yours - why cannot the condition (comparing 'answer' with 'previous_answer') be put into the compute block of the report procedure?
Cynthia_sas
Diamond | Level 26

Hi:

  The reason is clear when you realize the difference between DISPLAY and ORDER. Consider SASHELP.CLASS -- a list of 19 students of various ages. Compare these 2 reports:

Cynthia_sas_0-1608345642623.png

 

Now, look what happens when I use SPANROWS to remove the cell boundaries inside the ORDER variable:

Cynthia_sas_1-1608345754994.png

  So that's why you can't directly compare your ANSWER to the previous answer inside a COMPUTE block once ANSWER becomes used as an ORDER variable.

  There are a few different ways to do it using helper variables. In report #2, the helper variable is created in a data step before the PROC REPORT step -- this is usually my initial method for showing PROC REPORT beginners because it is clear that the new helper variable ANSGRP is just a straight copy of ANSWER -- so I have a separate variable that can be used for ordering. That's because you can't use the same variable with 2 different usages. Then Report #3 is also using a helper variable, but a temporary helper variable whose value is automatically retained and the helper variable does NOT appear on the COLUMN statement. It is a memory-only temporary variable used during the step. Since people can't really "see" the variable value, they have to take on faith that HOLDANS will contain the saved value of ANSWER to be used within the PROC REPORT COMPUTE block.

  Here's are the 2 other approaches based on just a few changes to your original code:


data work.mysurvey2;
  set work.mysurvey;
  ansgrp = answer; /* used for example #2 */
run;
  
proc report data=mysurvey2 missing;
title '1) usage display';
column  answer previous_answer id n;
define answer/display  ;
define previous_answer/display ;
define id/display;
define n / 'Cntid';
compute id;
	if answer ^= previous_answer then 
			call define(_col_,'STYLE','STYLE=[backgroundcolor=lightred font_weight=bold]'); 
endcomp;
Compute after / style=[backgroundcolor=lightblue font_weight=bold];
	line "Grand Total: " n 8.;
endcomp;
run;
  
proc report data=mysurvey2 missing;
title '2) MAKE helper variable ANSGRP to be ORDER for breaking';
column ansgrp answer previous_answer id n;
define ansgrp / order;
define answer/display ;
define previous_answer/display ;
define id / display;
define n /'cntid' ;
Compute before ansgrp / style=[font_weight=bold];
	if ansgrp=' ' then label_ = '(blank): '; 
    else label_ = strip(ansgrp)||': ';
	line label_ $10. n 8.;
endcomp;
compute id;
	if answer ^= previous_answer then 
		call define(_col_,'STYLE','STYLE=[backgroundcolor=lightred font_weight=bold]'); 
endcomp;
Compute after / style=[backgroundcolor=lightblue font_weight=bold];
	line "Grand Total: " n 8.;
endcomp;
run;
  
proc report data=mysurvey missing;
title '3) alternative making temporary helper variable HOLDANS';
column answer previous_answer id n;
define answer/order ;
define previous_answer/display ;
define id / display;
define n /'cntid' ;
Compute before answer / style=[font_weight=bold];
    if upcase(_break_) = 'ANSWER' then holdans = answer;
	if answer=' ' then label_ = '(blank): '; 
    else label_ = strip(answer)||': ';
	line label_ $10. n 8.;
endcomp;
compute id;
	if holdans ^= previous_answer then 
	  call define(_col_,'STYLE','STYLE=[backgroundcolor=lightred font_weight=bold]'); 
endcomp;
Compute after / style=[backgroundcolor=lightblue font_weight=bold];
	line "Grand Total: " n 8.;
endcomp;
run;

  I like to turn off all the NOPRINT until I am sure that everything on the report is the way I want and then I put them back on. So that's why there are not any NOPRINT options in the code. The other thing that makes sense to me is to have the N statistics at the end of report row if it's going to be NOPRINT. It doesn't matter where you're placing it because you only need it on the break line. And it didn't need to be aliased.

  I am also somewhat finicky about the order of the DEFINEs and the order of the COMPUTE blocks. I like my DEFINE order to match my COLUMN order it just makes more sense to me that way. Notice that the temporary helper variable in #3, HOLDANS is what you use instead of ANSWER in the COMPUTE block for ID. Since ANSWER is the ORDER variable in #3, you have to use the HELPER variable in the COMPUTE block.

  But, in #2 because the helper variable ANSGRP exists in the DATA set and since you are breaking on ANSGRP and not ANSWER, then you can use ANSWER to compare in the COMPUTE block.

  Hope this helps clarify about PROC REPORT and how to do what you want.

Cynthia

sasuser_joe
Fluorite | Level 6

@Cynthia_sas This is fantastic! thank you for solving my question in short period of time.  the #3 report was THE improvement from other approaches, with artful application of '_break_' automatic variable. I don't know much about it, but knowing it's the key to avoid the pitfall I experienced earlier.  Also, appreciate the clear illustration why it's not working when used as 'order' variable comparing with usage of as 'display',  as well as the useful tips of good practice regarding this special SAS procedure.   Thank you!

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 4184 views
  • 2 likes
  • 3 in conversation