BookmarkSubscribeRSS Feed
titania
Calcite | Level 5

Hi everyone,

I'm trying to generate a summary table grouped by certain categories using the list report wizard in Enterprise Guide 4.3 / Office Add-In and I'm puzzled by the way missing values are treated. The output table doesn't show the missing values, instead it contains two lines with the same categories.

Here's a piece of code generated by the wizard as part of a proc report that aroused my suspicion:

define myvar / group 'myvar' missing;

compute GP_Brarisk_CK;

  if myvar ne ' ' then hold2=myvar;

  if myvar eq ' ' then myvar=hold2;

endcomp;

My question: Is there any way to modify the list report so that it treats the missing values as a separate category? Is there an alternative method that does what I want and that's available via the SAS add-in for Excel? Thanks!

4 REPLIES 4
Cynthia_sas
SAS Super FREQ

Hi:

  Before we talk about PROC REPORT. Let's talk about your data -- it becomes important to understand the full COLUMN statement - -which variables are coming from your data and which variables are being computed. And important to understand what values are missing, truly missing, missing numeric variables, missing character variables that you are using for grouping or ordering or ???

  When you talk about "missing values" keep in mind that the MISSING option does not refer to missing values for numeric variables (such as a . for a missing SALARY or missing . for AGE). If MYVAR is a GROUP item, then is it possible that you will have space or missing as a valid value for MYVAR???? Or, are you referring to the places where the report does not display duplicate or repeated values for your data in the final report.

  The code that is generated by the List Report Wizard sometimes makes assumptions about your report or inserts extra code that might not be needed if you were using just regular PROC REPORT and writing your own code...so knowing what choices you picked in the selection panes is important too. Did you say that you wanted a summary report and/or did you click the choice to show the values on every row.

  Let's look at some data:

grade  name   advisor donation donation_num

------ ------ ------- -------- ------------

junior Alan   Jones     25       101

junior Barb             10       102   <----

junior Carl   Wilson    20       103

junior Dave   Jones     10       104

junior Edna              5       105   <----

senior Alfred Wilson    20       106

senior Bob    Wilson    15       107

senior Cathy  Jones     10       108

senior Dana             25       109   <----

senior Edgar  Jones     25       110

senior Fran   Wilson    20       111

You can see that 3 donations have missing values for the ADVISOR column (donation 102, 105 and 109), but that all the other columns have values. The MISSING option in PROC REPORT only says that you want to see the 3 blank observations for ADVISOR, if ADVISOR is listed as a GROUP or ORDER item on your report and you use MISSING on the DEFINE statement for ADVISOR.

So here's a default report using both ADVISO and GRADE as ORDER variables.

  advisor   grade      donation  donation_num

  Jones     junior           25           101

                             10           104

            senior           10           108

                             25           110

  Wilson    junior           20           103

            senior           20           106

                             15           107

                             20           111

Notice in the above report, that you are seeing almost, but not all of the data rows that I initially showed above. Without the MISSING option in my code (or as an option), donations 102, 105 and 109 are NOT on the above report. All the other rows are here. But some of the rows show spaces where "Jones" or "Wilson" or "junior" or "senior" would have been repetitive. The List Report Wizard (and PROC REPORT) make the assumption that in a production quality report, you do not want to see duplicated values repeated on every report row.

Now, if I USE the MISSING option specified on a DEFINE statement for advisor, notice how the donations for 102, 105 and 109 now appear with blanks for the advisor name. These 3 observations are truly missing the ADVISOR name. This is the ONLY thing that the MISSING option should do:

advisor   grade      donation  donation_num

          junior           10           102  ** these 3 rows are from using the

                            5           105  ** MISSING option with the DEFINE

          senior           25           109  ** statement for ADVISOR

Jones     junior           25           101 

                           10           104

          senior           10           108

                           25           110

Wilson    junior           20           103

          senior           20           106

                           15           107

                           20           111

I put ** on the rows with missing values for ADVISOR. All the other blank spaces under advisor and grade rows are NOT missing. PROC REPORT suppresses the repetitious display of duplicate values. So you don't see Jones or Wilson on every report row. You also do not see junior on senior on every report row. The syntax that you show (using MYVAR and using HOLD2) is typically syntax that is sometimes used by the List Report Wizard to "fill" in these normally empty rows. This syntax usually has NOTHING to do with MISSING values or the MISSING option. As you can see, on the row for donation 104, the repetition of Jones is suppressed by default and the repetition of junior is suppressed by default.

You might want to work with Tech Support on this question, because someone needs to look at your data and at the code generated by the List Report Wizard and at the output and help you figure out whether you are getting the right results from the List Report Wizard. If I wanted to "fill in" the rows for ADVISOR and GRADE on my report, I could have produced this:

advisor   grade      donation  donation_num

          junior           10           102

          junior            5           105

          senior           25           109

Jones     junior           25           101

Jones     junior           10           104

Jones     senior           10           108

Jones     senior           25           110

Wilson    junior           20           103

Wilson    senior           20           106

Wilson    senior           15           107

Wilson    senior           20           111

Notice that for the 3  rows with missing values for ADVISOR, there was nothing to "fill in" but for the rows with grade, I could "fill in" junior and senior where appropriate. Tech Support is better able to look at your data and look at the generated code and help you figure out this issue. And, since you say you are using the Office Add-in, that may complicate things a bit in terms of how you fix the issue.

  To open a track with Tech Support, fill out the form at this link:

http://support.sas.com/ctx/supportform/createForm

cynthia

titania
Calcite | Level 5

Thanks a lot for your detailed reply; it is very helpful to understand these ideas. However, I'm still puzzled by the result I get.

I have extracted a simple example data set that shows the same phenomenon:

amount    country myvar

--------  ------- -----

1         BEL     11

100       LUX

10.000    BEL

1.000.000 LUX     56

10        BEL     32

1         BEL     11

100       LUX


What I did: Using the list report wizard and without changing any of the settings, I created a summary report on this data set. "myvar" is a character field that happens to contain numeric values.

Here's what I got:

amount    country myvar

--------  ------- -----

10.000    BEL

2         BEL     11

10        BEL     32

200       LUX     32

1.000.000 LUX     56

Now I'm wondering why I'm getting '32' where I would expect a missing value; I'm also wondering why, of the three nonempty values, it is this value that is displayed in the 4th line.

For completeness, here's the code (generated by Enterprise Guide 4.3):

proc report data=WORK.mydata nowd;

    column Amount, SUM=Amount_SUM Country myvar;

    define Amount / analysis SUM 'Amount' format=COMMAX22.2 missing;

    define Amount_SUM / format=COMMAX22.2;

    define Country / group 'Country' missing;

    compute Country;

        if Country ne ' ' then hold1=Country;

        if Country eq ' ' then Country=hold1;

    endcomp;

    define myvar / group 'myvar' missing;

    compute myvar;

        if myvar ne ' ' then hold2=myvar;

        if myvar eq ' ' then myvar=hold2;

    endcomp;

    run;

quit;

It seems to me that the list report wizard is maybe not what I want; is there anything similar that produces the result I would expect, i.e., something like a pivot table, but with each value shown in each line of the result? In SQL this would be a simple "group by"/"sum" statement, but as I said, I'd like to do this using the Excel add-in. Thanks for your suggestions!

Cynthia_sas
SAS Super FREQ

Hi:

  I'm curious. Are you building a stored process using EG and then trying to run the stored process using the Add-in for Microsoft Office? Or are you using the List Report Wizard from inside the Add-in? Or, are you building the code in EG before using the Wizard inside AMO? I'm just trying to get a full picture of your process.

  You are correct, that you should not be seeing the "32" on the row for LUX. And, you are also correct that the extra code added by EG is responsible for making you get the "32" on the row for LUX (carried forward from the row for BEL). EG generally inserts code based on the choices you make in the Wizard, however, and I am not on a machine where I can test all the point and click choices that would have generated the code you posted.

  However, if you edit the generated code to REMOVE the COMPUTE block (not the DEFINE statement), but just the COMPUTE block for MYVAR, you would get this:

        Amount  Country       myvar

--------------  ------------  -----

        10.000  BEL

             2  BEL              11

            10  BEL              32

           200  LUX

     1.000.000  LUX              56

for your report. If you would open a track with Tech Support, they would help you figure out the correct click-path to generate the code without the COMPUTE block.  To open a track with Tech Support, fill out the form at this link:

http://support.sas.com/ctx/supportform/createForm

cynthia

titania
Calcite | Level 5

Hello and thank you for your reply. I'm using the wizard from inside the Excel Add-in; I ran the same Wizard in Enterprise Guide only to copy the SAS code, there's no stored process involved. Sorry for the confusion. I might try Tech Support next, thanks for your help though.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

Discussion stats
  • 4 replies
  • 1565 views
  • 3 likes
  • 2 in conversation