Re: Variable showing in 1 column only (PROC TABULATE)

deleted_user · Posted 08-21-2009 12:47 PM

Hi,

I have come accross yet another problem. I need to get a variable in a table, but only in one column. As you will see in the code, I'm creating a table with the column AMORT as CLASS (and this class has 4 subclasses : <25 , 25-30 , 30-35 , 35-40 . Basically, I would like the AR25 variable to only appear in the <25 subclass, the AR30 to only appear in the 25-30 subclass etc.

Is there a way to do this? Thanks in advance!

Here's my code:

PROC TABULATE
DATA=WORK.SUMMARYTABLE2;

VAR AR25 AR30;

CLASS PRODUCT ;
CLASS AMORT ;
CLASS BEACONR ;
CLASS PERIOD ;

TABLE

PRODUCT,

PERIOD*(
BEACONR ALL),

AMORT*(
AR25*
Sum
AR30*
Sum)

PRINTMISS MISSTEXT='' INDENT=0 ;

RUN;

Cynthia_sas · Posted 08-21-2009 04:52 PM

Hi:
I'm confused. When you have this for the COLUMN dimension in your TABLE statement:
[pre]
AMORT*(AR25*Sum AR30*Sum)
[/pre]

You are telling TABULATE to explicitly put two columns, AR25 and AR30, underneath EACH level of AMORT (the purpose of the * is to do nesting or stacking).

Have you tried it without AMORT in the TABLE statement??? That might work if your variables already correspond to the categories.
[pre]
(AR25='<25'*Sum AR30='25-30'*Sum AR35='30-35'*sum AR40='35-40'*sum)
[/pre]

How you do this sort of depends on the structure of your data. Normally, I would expect to see the nesting syntax AMORT*(val*sum) if your data looked like this:
[pre]
amort product beaconr period val
<25 AAA XXX Period 1 2525
<25 AAA XXX Period 1 2525
25-30 AAA XXX Period 1 3030
25-30 AAA XXX Period 1 3030
30-35 AAA XXX Period 1 3535
30-35 AAA XXX Period 1 3535
35-40 AAA XXX Period 1 4040
35-40 AAA XXX Period 1 4040
<25 AAA XXX Period 2 2525
<25 AAA XXX Period 2 2525
25-30 AAA XXX Period 2 3030
25-30 AAA XXX Period 2 3030
30-35 AAA XXX Period 2 3535
30-35 AAA XXX Period 2 3535
[/pre]

where AMORT is truly setting a category for a numeric variable, then you would use your nesting technique with AMORT in the COLUMN dimension. But if every observation has AR25, AR30, etc, I don't see that you need AMORT at all, because the variables themselves distinguish the category.

cynthia

deleted_user · Posted 08-24-2009 09:23 AM

Hi,

Sadly, the AMORT CLASS must be dividing the table. There is a total of 11 variables plus 5 computed columns (derived from those 11 variables) plus 5 columns that I don't know how to produce.

My last post was an attempt at resolving my last issue for those 5 columns.

The problem is, I need to compute columns based on the AMORT CLASS. Let me give you an example of what I mean.

Ex : I want to calculate the column "AR30" which is equal to the ARREAR RATE of AMORT 25-30 divided by ARREAR RATE of AMORT 25. This column would appear under the AMORT 25-30 subclass. As you can see, I need that specific column to only appear in one subclass. The "AR35" would be equal to ARREAR RATE of AMORT 30-35 divided by ARREAR RATE of AMORT 25... Each of the subclass have their own "Index XXX"...

So is there a way I can use the variable "ARXX" to only appear in the right subclass?

I hope the new info I gave you is relevant and that it can help you help me. Thanks again for your time and petience.

Samuel

Cynthia_sas · Posted 08-24-2009 10:02 AM

Hi:
I'm not sure that PROC TABULATE is the right procedure for this report. Although you can specify a denominator definition in order to calculate you own percents, you can't really compute columns with TABULATE -- most folks who need real conditional processing or calculation of columns use PROC REPORT.

Or else, they calculate the new variables they need in a DATA step program before PROC TABULATE.

It's hard to visualize the report you want, particularly, since it's hard to visualize what the input data looks like. I thought you already HAD AR25 and AR30 -- not that you needed to calculate them. Having
[pre]
AMORT*(AR25*SUM AR30*SUM AR35*SUM)
[/pre]
will only work with TABULATE if the AR25, AR30 and AR35 variables/columns already exist. And, given this piece of TABLE statement, the AR25, AR30 and AR35 columns would be nested under EACH value for AMORT -- which, I gather, is not your desired outcome.

You can use PCTSUM and <denom> to place a column on a tabulate report, but without knowing what your desired INPUT data looks like and your desired OUTPUT report looks like, it's hard to tell whether TABULATE is the right approach.

cynthia

deleted_user · Posted 08-24-2009 11:03 AM

I'm sorry I'm not being clear enough.

As to the input and output I have/want, my initial dataset has PRODUCT, PERIOD, AMORT, BEACONR and 11 other variables. From that dataset, I create a new one using DATA, to get 5 new columns : ARREAR RATE (#), ARREAR RATE ($), etc. the 11 variables plus the 5 new columns (16 columns) are compiled in a PROC TABULATE. About 90% of the job is done so far, but that 10% is a pain.

As for the final output, I am unable to draw it here on the forum, but here is what a column might look like :

AMORT -> " 25-30 " -> 11 variables + 5 computed + AR 30 (no AR25)

etc...

If you think this case is not lost, please enlight me on the solution as I am completely clueless what to do.

Thank you.

Cynthia_sas · Posted 08-25-2009 12:20 AM

Hi:
Your question, and answer, really are data-dependent. For example...compare these two possibilities for data structure:
[pre]
Option 1

Obs amort product beaconr period val +plus more variables
1 <25 AAA XXX Period 1 2525
2 25-30 AAA XXX Period 1 3030
3 30-35 AAA XXX Period 1 3535
4 <25 AAA XXX Period 2 2525
5 25-30 AAA XXX Period 2 3030
6 30-35 AAA XXX Period 2 3535
[/pre]

versus
[pre]
Option 2

Obs product beaconr period amort1 ar25 amort2 ar30 amort3 ar35 + more variables
1 AAA XXX Period 1 <25 2525 25-30 3030 30-35 3535
2 AAA XXX Period 2 <25 2525 25-30 3030 30-35 3535
[/pre]

In option1 version of the data, there is a single variable/column named AMORT, and each observation has a different value for AMORT. Compare this to the structure of the data in Option 2...there are 3 related AMORT variables, AMORT1, AMORT2, AMORT3 and each of these AMORT column variables "goes with" the matching AR25, AR30 and AR35 variables.

Another option for how the data is structured is related to Option 1 something like this, where only 1 of the 3 variables (in this case AR25, AR30, AR35) has a value based on what the observation's value for AMORT is:
[pre]
Option 3

Obs amort product beaconr period ar25 ar30 ar35 +plus more variables
1 <25 AAA XXX Period 1 2525 . .
2 25-30 AAA XXX Period 1 . 3030 .
3 30-35 AAA XXX Period 1 . . 3535
4 <25 AAA XXX Period 2 2525 . .
5 25-30 AAA XXX Period 2 . 3030 .
6 30-35 AAA XXX Period 2 . . 3535

[/pre]

The TABULATE syntax that you use, if TABULATE is even the right procedure, will be different for Option 1 structure than it is for Option 2 structure. This is why showing the structure of the input data, if possible, will help.

You might need to open a track with Tech Support. They could look at a copy of your input data and you could send them a sketch or mockup of what the final report should be and then they could recommend whether TABULATE or REPORT or even a DATA Step report would be better. Another alternative might be to look at PROC TABULATE examples in the documentation and try to compare the structure of the data they used in their programs (these will either be SASHELP datasets or the programs will have a sample of the data) with the structure of your data.

cynthia

deleted_user · Posted 08-25-2009 11:20 AM

Hi Cynthia,

My Data structure corresponds to your option #1... And I'm sorry I wasn't clear enough on my goals and my data structure...

Good news though, I found out a way to achieve my goal, even though it was not the way I expected (with some help) :

I created a "key" containing the AMORT, PRODUCT and BEACONR by concatenating all of them. I then created a new variable based on AMORT only and I finally merged (using the "key" variable) the last data set with a complete one.

So I didn't change the way my tabulate was made, I just changed the way my data was structured.

Thanks a lot for your time and I hope this thread will help some people in the future.

Cynthia_sas · Posted 08-25-2009 11:13 PM

Hi:
I'm glad it worked out. As you discovered, the input structure of your data really does play a role in how your TABULATE program is coded.

That's why, when asking a question, it's useful to show a bit of actual data (possibly with changed codes/names, etc) -- or use a SAS dataset that mimics your data structure...and then show the entire code that you tried -- it gives people a way to visualize both the input structure and the output report that you want. I was about to suggest that you take a look at SASHELP.PRDSALE, the data has REGION, COUNTRY, PRODUCT, PRODTYPE and DIVISION, all of which make good class variables, along with MONTH, QUARTER and YEAR -- which can also be CLASS variables. Then, ACTUAL and PREDICT can be in a VAR statement.

If you can mockup a report using SASHELP.PRDSALE and TABULATE, it's easier to post the code you've tried and describe how your desired report differs from what you can generate with the code you've tried. Since everybody has SASHELP.PRDSALE (or SASHELP.CLASS or SASHELP.SHOES), more people can cut and paste the code and see whether their ideas about what -might- work actually -does- work.

Just some ideas for future postings. I am glad you got what you wanted. Congrats!

cynthia

Register Today!

SAS Training: Just a Click Away