01-12-2013 10:06 PM
I have a very strange situation here, hope somebody could help.
The following program is to set haha as a computed variable ,it stands for the # of observation in the current page.
lala is a temporary variable (sorry for the weird name) that is initialized to 0 every time a new page is created, helping haha gain increment of 1 for every new row. Listing ods is utilized here.
OK now here comes to my question: the out=options should generate a dataset that have similar data values and structure as the report shows. But somehow, maybe affected by some unknown option settings, the values of haha in the dataset haha is very different from the value of haha in the report, and variable _break_ is set to _page_ only in the last row. It seems like the first compute block was not executed at all.
I am not sure all other sas users will encounter the same situation, or it's just me. I strongly doubt it's related to some options or ods settings.
column _all_ haha;
compute after _page_;
proc print data=test;run;
01-13-2013 01:56 PM
I believe that for what you want (even though I don't entirely understand it), you may have to make your own "fake" page breaking variable. I believe that it's possible your _PAGE_ and PS= might not work the way you envision when you use OUT=.
You said that "out=options should generate a dataset that have similar data values and structure as the report shows" -- where did you get this information??? If you look in the documentation for OUT=
you will see that it says:
-- names the output data set. If this data set does not exist, then PROC REPORT creates it. The data set contains one observation for each report row and one observation for each unique summary line. If you use both customized and default summaries at the same place in the report, then the output data set contains only one observation because the two summaries differ only in how they present the data. Information about customization (underlining, color, text, and so on) is not data and is not saved in the output data set.
(you can read the rest of the documenation.)
But, OUT= is not an "exact" copy (structure/variables) of what goes to the output window. OUT= is the output dataset that ODS makes from the report rows that PROC REPORT builds. For example, consider the following example using a smaller dataset. If you compare the OUT= dataset (report #2) with the original PROC REPORT output (report #1), you will see that the OUT= dataset does not maintain the original structure or the original information and in fact, even if the BREAK statement is only doing a SKIP, the _BREAK_ row gets written to the output dataset, even if you do not see it on the original report. And if you use a LINE statement, such as I show in my sample, the LINE information is NOT written to the output dataset either.
Perhaps you could work with Tech Support with your real data and your real report need to figure out how to get what you want. However, in my experience, the output dataset from PROC REPORT is used when two passes through the data are needed (see second example that shows how to get report ordered by descending sales within region). I don't see that need in what you posted and you didn't really give a reason why you were focused on the output dataset instead of the original PROC REPORT output. If all you care about is the output dataset, you would not need options like HEADLINE or HEADSKIP or would worry about PS or LS options.
** Sample Program to show output dataset from REPORT is not in same structure as report;
proc report data=sashelp.class nowd
title '1) PROC REPORT Using Temp Variable and LINE';
column age sex name height weight;
define age / order;
define sex / order 'Gender' f=$6.;
define name / order;
define height / display;
define weight / display;
break after sex / skip;
break after age / skip;
compute before age;
cntr = 0;
if _break_ = ' ' then cntr+1;
compute after age;
line @1 'Total for this age: ' cntr 3.0;
proc print data=classout;
title '2) Dataset Created by PROC REPORT';
** Make 2 passes to get descending summarized sales within region;
proc report data=sashelp.shoes nowd out=sumreg;
title '1) Order by Region Only';
column region product sales sales=order_tot;
define region / group;
define product / group;
define sales / sum;
proc report data=sumreg nowd;
title '2) Order by descending sales within region';
title2 'Using ORDER_TOT variable';
column region order_tot product sales;
define region / order;
define order_tot / order descending /* noprint */;
define product / order;
define sales / sum;
break after region / summarize skip;
01-14-2013 12:01 AM
Thanks very much, Cynthia.
I said that "out=options should generate a dataset that have similar data values and structure as the report shows" -- I read this from one SAS report procedure book and I believe it is true in most of the cases. As your example shows, the detail report row should match the corresponding values in the out=option dataset. the differences are mainly those rows generated by compute after/before,break after/before blocks. I use out= option is mainly because I wanna see the innate calculation of proc report and utilize it as a reference aiding me to better use compute block inside proc report.
If I exclude summay report rows(with _break_ variable with non-missing values) then the dataset in the out= options should match the value in the report output(let's assume noprint option is not used).
let me show you the first page in report here:
of $ DATE YEAR MONTH DAY haha
$220 80Q1 1980 1 1 1
$257 80Q2 1980 4 1 2
$258 80Q3 1980 7 1 3
$295 80Q4 1980 10 1 4
$247 81Q1 1981 1 1 5
$292 81Q2 1981 4 1 6
$286 81Q3 1981 7 1 7
$323 81Q4 1981 10 1 8
and then the second page
of $ DATE YEAR MONTH DAY haha
$284 82Q1 1982 1 1 1
$307 82Q2 1982 4 1 2
$318 82Q3 1982 7 1 3
$343 82Q4 1982 10 1 4
$299 83Q1 1983 1 1 5
$351 83Q2 1983 4 1 6
$359 83Q3 1983 7 1 7
$384 83Q4 1983 10 1 8
there are several more pages....
now you can notice, haha can show the row number in the current page. so I expect in out= dataset ,haha should be in 1,2,3..8,1,2,3..8
circle with a break row(_break_=_page_) in every 8 rows. However in SAS on my computer, the out= dataset shows:
Obs SALES DATE YEAR MONTH DAY haha _BREAK_
41 $606 90Q1 1990 1 1 41
42 $674 90Q2 1990 4 1 42
43 $705 90Q3 1990 7 1 43
44 $749 90Q4 1990 10 1 44
45 $703 91Q1 1991 1 1 45
46 $709 91Q2 1991 4 1 46
47 $728 91Q3 1991 7 1 47
48 $807 91Q4 1991 10 1 48
49 $692 92Q1 1992 1 1 49
50 $797 92Q2 1992 4 1 50
51 $826 92Q3 1992 7 1 51
52 $889 92Q4 1992 10 1 52
53 $758 93Q1 1993 1 1 53
54 $909 93Q2 1993 4 1 54
55 $920 93Q3 1993 7 1 55
56 $991 93Q4 1993 10 1 56
57 $876 94Q1 1994 1 1 57
58 $998 94Q2 1994 4 1 58
59 $998 94Q2 1994 4 1 _PAGE_
only one _break_=_page_ in the end.
haha does not match what it is in the report.
It seems like the first compute after _page_ block is completely ingored in running out= dataset, and I have no idea why this is happpening.
I hope my broken English can make you understand my confusion here. And I will also try to consult with Tech support.
Thanks again for your reply.
01-14-2013 09:29 AM
I would recommend that you work with Tech Support on this. I do not believe that page-breaking behavior is captured as you expect in the output dataset. At the very least, you will need a BREAK statement that explicitly has the PAGE option....you cannot count on PAGESIZE alone. Possibly if you computed your own "page break" variable, it would get captured, but _PAGE_ was designed for LISTING behavior back in the early days of PROC REPORT and when they rewrote the output dataset creation to work with ODS OUTPUT, I do not believe _PAGE_ is completely integrated. By that I mean -- it works in LISTING, but is not surfaced as you want.
However, there is a workaround...see the program below. In the program below, I make a "fake" variable called PGCNTR that I will use to control page breaking, that way, I do not count on PAGESIZE to do page breaking for me. I use PGCNTR. I find it is more reliable across all destinations.
retain pgcntr 1;
if mod(_n_,10) = 0 then pgcntr +1;
options nodate nonumber orientation=portrait;
proc report data=pageit nowd
title '1) using fake page break variable';
column pgcntr region subsidiary product sales;
define pgcntr / order /* NOPRINT */;
define region / display;
define subsidiary / display;
define product / display;
define sales / sum;
compute before _page_ ;
line 'value of pgcntr= ' pgcntr 3.;
break before pgcntr / page ;
rbreak after / summarize;
proc print data=fakeout;
title '2) dataset from proc report';
01-14-2013 10:18 AM
Thank you. I think it's better to do it as you instructed. Count the row number within each page beforehand, and then use proc report. I was expecting no matter how I change ps= options, the counting variable would count for row number within each page. But maybe the method I create is rarely useful.