Help using Base SAS procedures

question about proc report out= option

Reply
New Contributor
Posts: 3

question about proc report out= option

I have a very strange situation here, hope somebody could help.

The following program is to set haha as a computed variable ,it stands for the # of observation in the current page.

lala is a temporary variable (sorry for the weird name) that is initialized to 0 every time a new page is created, helping haha gain increment of 1 for every new row. Listing ods is utilized here.

OK now here comes to my question: the out=options should generate a dataset that have similar data values and structure as the report shows. But somehow, maybe affected by some unknown option settings, the values of haha in the dataset haha is very different from the value of haha in the report, and variable _break_ is set to _page_ only in the last row. It seems like the first compute block was not executed at all.

I am not sure all other sas users will encounter the same situation, or it's just me. I strongly doubt it's related to some options or ods settings.

Program:

proc report data=sashelp.retail list nowd headline headskip out=test ps=16 ;

column _all_ haha;

define date/display;

compute after _page_;

lala=0;

endcomp;

compute haha;

lala+1;

haha=lala;

endcomp;

run;

proc print data=test;run;

run;

SAS Super FREQ
Posts: 8,775

Re: question about proc report out= option

Hi:

  I believe that for what you want (even though I don't entirely understand it), you may have to make your own "fake" page breaking variable. I believe that it's possible your _PAGE_ and PS= might not work the way you envision when you use OUT=.

You said that "out=options should generate a dataset that have similar data values and structure as the report shows" -- where did you get this information??? If you look in the documentation for OUT=

Base SAS(R) 9.3 Procedures Guide, Second Edition

you will see that it says:

OUT=SAS-data-set

-- names the output data set. If this data set does not exist, then PROC REPORT creates it. The data set contains one observation for each report row and one observation for each unique summary line. If you use both customized and default summaries at the same place in the report, then the output data set contains only one observation because the two summaries differ only in how they present the data. Information about customization (underlining, color, text, and so on) is not data and is not saved in the output data set.

  (you can read the rest of the documenation.)

But, OUT= is not an "exact" copy (structure/variables) of what goes to the output window. OUT= is the output dataset that ODS makes from the report rows that PROC REPORT builds. For example, consider the following example using a smaller dataset. If you compare the OUT= dataset (report #2) with the original PROC REPORT output (report #1), you will see that the OUT= dataset does not maintain the original structure or the original information and in fact, even if the BREAK statement is only doing a SKIP, the _BREAK_ row gets written to the output dataset, even if you do not see it on the original report. And if you use a LINE statement, such as I show in my sample, the LINE information is NOT written to the output dataset either.

Perhaps you could work with Tech Support with your real data and your real report need to figure out how to get what you want. However, in my experience, the output dataset from PROC REPORT is used when two passes through the data are needed (see second example that shows how to get report ordered by descending sales within region). I don't see that need in what you posted and you didn't really give a reason why you were focused on the output dataset instead of the original PROC REPORT output. If all you care about is the output dataset, you would not need options like HEADLINE or HEADSKIP or would worry about PS or LS options.

cynthia

** Sample Program to show output dataset from REPORT is not in same structure as report;

ods listing;

proc report data=sashelp.class nowd

  out=classout;

  title '1) PROC REPORT Using Temp Variable and LINE';

  column age sex name height weight;

  define age / order;

  define sex / order 'Gender' f=$6.;

  define name / order;

  define height / display;

  define weight / display;

  break after sex / skip;

  break after age / skip;

  compute before age;

    cntr = 0;

  endcomp;

  compute name;

    if _break_ = ' ' then cntr+1;

  endcomp;

  compute after age;

    line @1 'Total for this age: ' cntr 3.0;

  endcomp;

run;

   

proc print data=classout;

title '2) Dataset Created by PROC REPORT';

run;

** Make 2 passes to get descending summarized sales within region;

ods listing;

proc report data=sashelp.shoes nowd out=sumreg;

  title '1) Order by Region Only';

  column region product sales sales=order_tot;

  define region / group;

  define product / group;

  define sales / sum;

run;

   

proc report data=sumreg nowd;

  title '2) Order by descending sales within region';

  title2 'Using ORDER_TOT variable';

  column region order_tot product sales;

  define region / order;

  define order_tot / order descending /* noprint */;

  define product / order;

  define sales / sum;

  break after region / summarize skip;

run;

New Contributor
Posts: 3

Re: question about proc report out= option

Thanks very much, Cynthia.

I said that "out=options should generate a dataset that have similar data values and structure as the report shows" -- I read this from one SAS report procedure book and I believe it is true in most of the cases. As your example shows, the detail report row should match the corresponding values in the out=option dataset. the differences are mainly those rows generated by compute after/before,break after/before blocks. I use out= option is mainly because I wanna see the innate calculation of proc report and utilize it as a reference aiding me to better use compute block inside proc report.

     If I exclude summay report rows(with _break_ variable with non-missing values) then the dataset in the out= options should match the value in the report output(let's assume noprint option is not used).

let me show you the first page in report here:

   Retail
                                sales in
                                millions
                                    of $  DATE       YEAR      MONTH        DAY       haha
                              ------------------------------------------------------------

                                    $220  80Q1       1980          1          1          1
                                    $257  80Q2       1980          4          1          2
                                    $258  80Q3       1980          7          1          3
                                    $295  80Q4       1980         10          1          4
                                    $247  81Q1       1981          1          1          5
                                    $292  81Q2       1981          4          1          6
                                    $286  81Q3       1981          7          1          7
                                    $323  81Q4       1981         10          1          8

  and then the second page

  Retail
                                sales in
                                millions
                                    of $  DATE       YEAR      MONTH        DAY       haha
                              ------------------------------------------------------------

                                    $284  82Q1       1982          1          1          1
                                    $307  82Q2       1982          4          1          2
                                    $318  82Q3       1982          7          1          3
                                    $343  82Q4       1982         10          1          4
                                    $299  83Q1       1983          1          1          5
                                    $351  83Q2       1983          4          1          6
                                    $359  83Q3       1983          7          1          7
                                    $384  83Q4       1983         10          1          8

there are several more pages....

now you can notice,  haha can show the row number in the current page. so I expect in out= dataset ,haha should be in 1,2,3..8,1,2,3..8

circle with a break row(_break_=_page_) in every 8 rows. However in SAS on my computer, the out= dataset shows:


                          Obs         SALES    DATE    YEAR    MONTH    DAY    haha    _BREAK_

                              ..........

                           41          $606    90Q1    1990       1      1      41
                           42          $674    90Q2    1990       4      1      42
                           43          $705    90Q3    1990       7      1      43
                           44          $749    90Q4    1990      10      1      44
                           45          $703    91Q1    1991       1      1      45
                           46          $709    91Q2    1991       4      1      46
                           47          $728    91Q3    1991       7      1      47
                           48          $807    91Q4    1991      10      1      48
                           49          $692    92Q1    1992       1      1      49
                           50          $797    92Q2    1992       4      1      50
                           51          $826    92Q3    1992       7      1      51
                           52          $889    92Q4    1992      10      1      52
                           53          $758    93Q1    1993       1      1      53
                           54          $909    93Q2    1993       4      1      54
                           55          $920    93Q3    1993       7      1      55
                           56          $991    93Q4    1993      10      1      56
                           57          $876    94Q1    1994       1      1      57
                           58          $998    94Q2    1994       4      1      58
                           59          $998    94Q2    1994       4      1             _PAGE_

only one _break_=_page_ in the end.

haha does not match what it is in the report.

It seems like the first compute after _page_ block is completely ingored in running out= dataset, and I have no idea why this is happpening.

I hope my broken English can make you understand my confusion here. And I will also try to consult with Tech support.

Thanks again for your reply.

SAS Super FREQ
Posts: 8,775

Re: question about proc report out= option

Hi:

  I would recommend that you work with Tech Support on this. I do not believe that page-breaking behavior is captured as you expect in the output dataset. At the very least, you will need a BREAK statement that explicitly has the PAGE option....you cannot count on PAGESIZE alone. Possibly if you computed your own "page break" variable, it would get captured, but _PAGE_ was designed for LISTING behavior back in the early days of PROC REPORT and when they rewrote the output dataset creation to work with ODS OUTPUT, I do not believe _PAGE_ is completely integrated. By that I mean -- it works in LISTING, but is not surfaced as you want.

  However, there is a workaround...see the program below. In the program below, I make a "fake" variable called PGCNTR that I will use to control page breaking, that way, I do not count on PAGESIZE to do page breaking for me. I use PGCNTR. I find it is more reliable across all destinations.

cynthia

data pageit;

  set sashelp.shoes(obs=100);

  retain pgcntr 1;

  output;

  if mod(_n_,10) = 0 then pgcntr +1;

run;

 

options nodate nonumber orientation=portrait;

  title; footnote;

   

ods listing;

proc report data=pageit nowd

     out=fakeout;

  title '1) using fake page break variable';

  column pgcntr region subsidiary product sales;

  define pgcntr / order  /* NOPRINT */;

  define region / display;

  define subsidiary / display;

  define product / display;

  define sales / sum;

  compute before _page_ ;

     line 'value of pgcntr= ' pgcntr 3.;

  endcomp;

  break before pgcntr / page ;

  rbreak after / summarize;

  compute after;

    region='All Regions';

  endcomp;

run;

proc print data=fakeout;

  title '2) dataset from proc report';

run;

title;

New Contributor
Posts: 3

Re: question about proc report out= option

Thank you. I think it's better to do it as you instructed. Count the row number within each page beforehand, and  then use proc report. I was expecting no matter how I change ps= options, the counting variable would count for row number within each page. But maybe the method I create is rarely useful.

Ask a Question
Discussion stats
  • 4 replies
  • 252 views
  • 8 likes
  • 2 in conversation