BookmarkSubscribeRSS Feed
Priyamvada07
Obsidian | Level 7

I have the following data

DATA HAVE;
input year dz $8. area;
cards;
2000 stroke 08
2000 stroke 06
2000 stroke 06
2001 stroke 08
2001 stroke 06
2001 stroke 06
2002 stroke 08
2002 stroke 06
2002 stroke 06
;
run;

 

I want to create a table  for year*area with a row for total. I have used the following code  for it

proc tabulate data=have;
class year area dz;
tables (year=' ' all='Total'), ((area=' ')*(n pctn)
(all='Total')*(n pctn)) /box=year;
Title j=center height=16pt "Table year by spa for dz";
OPTION LINESIZE= 256 NODATE;
run;

 

Which gave me the following output

 

 

Priyamvada07_0-1617696543617.png

In this output I want all the exact rows and columns except the row for year '2000'. Please note I want the same values in the total row and column. Is there a way to do it?

 

Thank you!

11 REPLIES 11
Kurt_Bremser
Super User

I saved the TABULATE result to a dataset, modified it, and used REPORT to create the output:

data have;
input year dz $8. area;
cards;
2000 stroke 08
2000 stroke 06
2000 stroke 06
2001 stroke 08
2001 stroke 06
2001 stroke 06
2002 stroke 08
2002 stroke 06
2002 stroke 06
;

proc tabulate
  data=have
  out=want (where=(year ne 2000))
;
class year area dz;
tables (year=' ' all='Total'), ((area=' ')*(n pctn) 
(all='Total')*(n pctn)) /box=year;
Title j=center height=16pt "Table year by spa for dz";
OPTION LINESIZE= 256 NODATE;
run;

data want2;
set want;
if year = .
then _year = "Total";
else _year = put(year,z4.);
if area = .
then _area = "Total";
else _area = put(area,2.);
run;

proc report data=want2;
column _year _area,(n pctn_00);
define _year / "Year" group;
define n / analysis sum;
define pctn_00 / "PctN" analysis sum format=6.2;
define _area / "" across;
run;
Priyamvada07
Obsidian | Level 7

So this code is working perfectly. However, in my actual dataset I am using a WEIGHT variable and I am suspecting because of it I am not getting the desired output from the PROC REPORT step.

I have created a new data with a dummy WEIGHT variable 'dz' (which is actually not weighted).
DATA HAVE;
input year dz $8. area;
cards;
2000 1 08
2000 1 06
2000 2 06
2001 2 08
2001 2 06
2001 3 06
2002 3 08
2002 3 06
2002 1 06
;
run;

I am using similar code in my original dataset
proc format;
picture count (round)
0-4 = ' <5' (NOEDIT);
picture pcnt (round)
0 = ' - '
other = '009.9%';


proc tabulate
data=have
out=want (where=(year ne 2000))
;
class year area;
tables (year=' ' all='Total'), ((area=' ')*(dz="No."*sum*f=count. dz='%'*rowpctsum*f=pcnt.)
(all='Total')*(dz="No."*sum*f=count. dz='%'*rowpctsum*f=pcnt.)) /box=year;
var dz;
run;

This code doesn't work for this data as 'dz' is actually not weighted. However, I am sharing it to discuss the issue with my original dataset.

Now lets assume the above proc step actually works then I do the below from your code

data want2;
set want;
if year = .
then _year = "Total";
else _year = put(year,z4.);
if area = .
then _area = "Total";
else _area = put(area,2.);
run;

When I print the 'want2' data I would get somethin like below

Priyamvada07_0-1617749312499.png

 

Based on the above columns I would lastly do the proc report step

proc report data=want2;
column _year _area,(dz_wt_Sum dz_wt_PctSum_10);
define _year / "Year" group;
define dz_wt_Sum / analysis sum;
define dz_wt_PctSum_10 / "PctN" analysis sum format=6.2;
define _area / "" across;
run;

 

All of the steps until PROC REPORT work fine but after the PROC REPORT I get a warning that says "dz_wt_PctSum_10" and "dz_wt_Sum" are not in the report definition. And of course I don't get the desired report. Can you let me know where I am going wrong.

 

I know it is hard to understand the issue from my example code which actually doesn't work. But this has worked in the original dataset until the PROC REPORT step.

 

Thank you!

 

 

Kurt_Bremser
Super User

Look at the variables in want2. You have

  • dz_sum
  • dz_pctsum_10_dz
  • dz_pctsum_00_dz

So your complete code should be

data have;
input year dz area;
cards;
2000 1 08
2000 1 06
2000 2 06
2001 2 08
2001 2 06
2001 3 06
2002 3 08
2002 3 06
2002 1 06
;

proc format;
picture count (round)
  0-4 = ' <5' (NOEDIT)
;
picture pcnt (round)
  0 = ' - '
  other = '009.9%'
;
run;

proc tabulate
  data=have
  out=want (where=(year ne 2000))
;
class year area;
tables (year=' ' all='Total'), ((area=' ')*(dz="No."*sum*f=count. dz='%'*rowpctsum*f=pcnt.) 
(all='Total')*(dz="No."*sum*f=count. dz='%'*rowpctsum*f=pcnt.)) /box=year;
var dz;
run;

data want2;
set want;
if year = .
then _year = "Total";
else _year = put(year,z4.);
if area = .
then _area = "Total";
else _area = put(area,2.);
run;

proc report data=want2;
column _year _area,(dz_Sum dz_PctSum_10_dz);
define _year / "Year" group;
define dz_Sum / analysis sum;
define dz_PctSum_10_dz / "PctN" analysis sum format=6.2;
define _area / "" across;
run;

Note: use the indicated button for posting logs or other fixed-width text:

Bildschirmfoto 2020-04-07 um 08.32.59.jpg

and the "little running man" right next to it for SAS code.

Priyamvada07
Obsidian | Level 7
Thank you for the reply. However, the variables in the want2 are:
dz_wt_sum
dz_wt_pctsum_10
dz_wt_pctsum_00
Sorry for the poor image with small font!

I have used the above variables in my code and now it's working. I am getting the right statistics but the 'total' row is missing. I am clueless as to why this is happening, any ideas??
Kurt_Bremser
Super User

@Priyamvada07 wrote:
Thank you for the reply. However, the variables in the want2 are:
dz_wt_sum
dz_wt_pctsum_10
dz_wt_pctsum_00

No, they're not. Proof:

data have;
input year dz area;
cards;
2000 1 08
2000 1 06
2000 2 06
2001 2 08
2001 2 06
2001 3 06
2002 3 08
2002 3 06
2002 1 06
;

proc format;
picture count (round)
  0-4 = ' <5' (NOEDIT)
;
picture pcnt (round)
  0 = ' - '
  other = '009.9%'
;
run;

proc tabulate
  data=have
  out=want (where=(year ne 2000))
;
class year area;
tables (year=' ' all='Total'), ((area=' ')*(dz="No."*sum*f=count. dz='%'*rowpctsum*f=pcnt.) 
(all='Total')*(dz="No."*sum*f=count. dz='%'*rowpctsum*f=pcnt.)) /box=year;
var dz;
run;

data want2;
set want;
if year = .
then _year = "Total";
else _year = put(year,z4.);
if area = .
then _area = "Total";
else _area = put(area,2.);
run;

proc contents data=want2;
run;

(Partial) result:

Alphabetische Liste der Variablen und Attribute
#	Variable	Typ	Länge	Etikett
4	_PAGE_	Num	8	Seite für Beobachtung
5	_TABLE_	Num	8	Tabelle für Beobachtung
3	_TYPE_	Char	2	Beobachtungstyp
10	_area	Char	5	 
9	_year	Char	5	 
2	area	Num	8	 
8	dz_PctSum_00_dz	Num	8	 
7	dz_PctSum_10_dz	Num	8	 
6	dz_Sum	Num	8	 
1	year	Num	8	 

 

 

 

Priyamvada07
Obsidian | Level 7

I see! You were able to run the example program, I couldn't. So, basically I had adapted the variables to match the variables from my original dataset (equivalent to want2 here). Any ideas why I am not getting the 'total' row? 

My original program has following PROC FORMAT code, could this have any bearing on why I am not getting a 'total' row?

 

PROC FORMAT;
value $yrfmt    '9999'='    '
                   '88'  ='Cumulative Total ';

 value $yrfmtc  '0000-9999'='yr'
                   'ZZZZ'='All';

value yrfmt	9999='Cumulative Total';

 

Kurt_Bremser
Super User

To adapt the program to your original data, I need at least a short example of that data, and the exact TABULATE code you run, so I can adapt the DATA and REPORT steps to the actual dataset structures.

Priyamvada07
Obsidian | Level 7

Actually now when I ran the code I got the following for data=want2

data have;
input year dz area;
cards;
2000 1 08
2000 1 06
2000 2 06
2001 2 08
2001 2 06
2001 3 06
2002 3 08
2002 3 06
2002 1 06
;

proc format;
picture count (round)
  0-4 = ' <5' (NOEDIT)
;
picture pcnt (round)
  0 = ' - '
  other = '009.9%'
;
run;

proc tabulate
  data=have
  out=want (where=(year ne 2000))
;
class year area;
tables (year=' ' all='Total'), ((area=' ')*(dz="No."*sum*f=count. dz='%'*rowpctsum*f=pcnt.) 
(all='Total')*(dz="No."*sum*f=count. dz='%'*rowpctsum*f=pcnt.)) /box=year;
var dz;
run;

data want2;
set want;
if year = .
then _year = "Total";
else _year = put(year,z4.);
if area = .
then _area = "Total";
else _area = put(area,2.);
run;

proc contents data=want2; run;

The variables are dz_PctSum_00, dz_PctSum_10 and dz_Sum.

 

Not sure why you got , very strange...the same code is giving different variables.

dz_PctSum_00_dz		 
dz_PctSum_10_dz

Priyamvada07_0-1617868266431.png

Although in this example the 'total' row is being generated, the 'pctn' for 'total' row is missing. 

Priyamvada07_1-1617871947070.png

 

 

Oligolas
Barite | Level 11

Hi,

by suppressing the year2000 row you will be reporting mistakenly since total pctn will still show 100%

I guess you'll have to modify your data afterwards:

 

ods select none;
proc tabulate data=have out=want;
class year area dz;
tables (year=' ' all='Total'), ((area=' ')*(n pctn)
(all='Total')*(n pctn)) /box=year;
Title j=center height=16pt "Table year by spa for dz";
OPTION LINESIZE= 256 NODATE;
run;
ods select all;

PROC FORMAT;
   value f_total
   9999='Total'
   ;
RUN;

data want;
   set want;
   if missing(year) then year=9999;
   if missing(area) then area=9999;
   format year area f_total.;
   PctN_00=PctN_00/100;
run;

PROC REPORT data=want;
   where year ne 2000;
   column year area,(N PctN_00);
   define year    / group  order=internal f=f_total. 'Year';
   define area    / across order=internal f=f_total. '';
   define N       / sum 'N';
   define PctN_00 / 'PctN' f=percent8.2;
RUN;

 

________________________

- Cheers -

ballardw
Super User

@Priyamvada07 wrote:

I have the following data

DATA HAVE;
input year dz $8. area;
cards;
2000 stroke 08
2000 stroke 06
2000 stroke 06
2001 stroke 08
2001 stroke 06
2001 stroke 06
2002 stroke 08
2002 stroke 06
2002 stroke 06
;
run;

 

I want to create a table  for year*area with a row for total. I have used the following code  for it

proc tabulate data=have;
class year area dz;
tables (year=' ' all='Total'), ((area=' ')*(n pctn)
(all='Total')*(n pctn)) /box=year;
Title j=center height=16pt "Table year by spa for dz";
OPTION LINESIZE= 256 NODATE;
run;

 

Which gave me the following output

 

 

Priyamvada07_0-1617696543617.png

In this output I want all the exact rows and columns except the row for year '2000'. Please note I want the same values in the total row and column. Is there a way to do it?

 

Thank you!


Please see:

DATA HAVE;
input year dz $8. area;
cards;
2000 stroke 08
2000 stroke 06
2000 stroke 06
2001 stroke 08
2001 stroke 06
2001 stroke 06
2002 stroke 08
2002 stroke 06
2002 stroke 06
;
run;

data classdataset;
  input year dz $8. area;
datalines;
2001 stroke 08
2001 stroke 06
2002 stroke 08
2002 stroke 06
;
title;
proc tabulate data=have classdata=classdataset;
class year area dz;
tables (year=' ' all='Total'), ((area=' ')*(n pctn)
(all='Total')*(n pctn)) /box=year;

run;

The classdata option uses a data set with the levels of the class data variables that you want to display in the table.

 

I removed the option and title statements as they didn't have any bearing on the actual issue.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1157 views
  • 3 likes
  • 4 in conversation