BookmarkSubscribeRSS Feed
sam_sas2
Obsidian | Level 7

Hi so here's the example of a code I'm currently working on, 

proc tabulate data= sashelp.baseball;
class Div Team;
var nHits;
table (Team * (N))
(nHits * (N MEAN MEDIAN MIN MAX)),
ALL Div;
run;

And this is how the current output looks like: 

  League and Division
  AllAEAWNENW
Team at the End of 1986 11...11
AtlantaN     
BaltimoreN1515...
BostonN1010...
CaliforniaN13.13..
ChicagoN24.1311.
CincinnatiN12...12
ClevelandN1212...
DetroitN1212...
HoustonN11...11
Kansas CityN14.14..
Los AngelesN14...14
MilwaukeeN1414...
MinneapolisN13.13..
MontrealN14..14.
New YorkN2411.13.
 

Is it possible to remove the column "NW" without a where condition?

(This is just for an example. I cannot use a "where" clause at all since I need to add a few more groups that are not mutually exclusive)
I will be displaying the output in HTML.

Any sort of help/guidance would be appreciated. Thanks in advance! 🙂

10 REPLIES 10
ballardw
Super User

I might query as to why a "where" is not allowed.

Do you expect the Teams in the NW to still appear in the data (or equivalent for your process)?

 

It may be more productive to provide data like yours and demonstrate the "not mutually exclusive" nature. Dummy data is fine as long as it behaves the same. Then show what you expect the result for that data to be. Sometimes you have to bite the bullet and use a data step or other method to prepare the data for reporting.

 

You might try using a CLASSDATA data set and the option Exclusive. This data set would have all of the combinations of the class variables that you want in the data. But if you want want the teams from the DIV to appear but not the Div that's going to be very problematic.

sam_sas2
Obsidian | Level 7

Okay, let's assume I have this below data

patientidcancertypeage
ALung18
BKidney23
CKidney65
DKidney34
ELung65
FKidney22

 

So after tabulation, I want something like this, but right now if I put a "where" condition for the 3rd column, then I would lose Patient A, but I don't want that.

lungkidneyage>18
245
ballardw
Super User

Are you asking for anything besides N for that "age" behavior?

Are any of the other variables that need this behavior character?

 

If the answer is yes to either of those two questions then you likely need to look at another approach than proc tabulate, likely involving different summary procedures and combining outputs.

 

If the variables in question are numeric and only need n statistic then you could preprocess data to set them to numeric and use as VAR variables:

data have;
   input patientid $	cancertype $	age;
datalines;
A	Lung	18
B	Kidney	23
C	Kidney	65
D	Kidney	34
E	Lung	65
F	Kidney	22
;

data need;
   set have;
   if age le 18 then age=.;
run;

proc tabulate data=need;
   class cancertype;
   var age;
   table cancertype *n=' ' age*n=' ';
run;

Likely that "need" data set would only be used for this purpose. But this will not work with character variables as they would need to be class and "missing" either removes the record or displays the missing value with the /missing option. Whether other statistics would "work" depends on what "n" would be used for things like percentages, stddev, variance in such calculations. And order statistics like median, p10, Q1 and such may not match your need either.

sam_sas2
Obsidian | Level 7
Yup! They wouldn't! So that was why I was wondering if there was a way to "cut" the unnecessary parts of proc tabulate using ODS HTML maybe?
Ksharp
Super User

Using proc sql instead ?

 

data have;
   input patientid $	cancertype $	age;
datalines;
A	Lung	18
B	Kidney	23
C	Kidney	65
D	Kidney	34
E	Lung	65
F	Kidney	22
;

proc sql;
create table want as
select cancertype,count(distinct patientid) as n
 from have
  group by cancertype
union all
select 'age>18',count(distinct patientid) as n
 from have
  where age>18
;
quit;
Astounding
PROC Star

If WHERE is not permitted, there could be ways around it.  For example, a DATA step (or view) could contain the statement:

if DIV="NW" then delete;

Is that allowed?  And whether it is allowed or not, the ALL column would change depending on whether or not it is supposed to include NW.  So should the ALL column include NW or not?

 

If ALL is supposed to include NW,  you might use PROC TABULATE to create an output data set, and subsequently apply the DELETE statement.  Then create a new PROC TABULATE to process the remaining observations from that output data set.

sam_sas2
Obsidian | Level 7
Hi!! Yes! The ALL should include "NW". I am unable to find any relevant material on how to use an output of proc tabulate to create a new proc tabulate to process the remaining. Could you please help me out?
Astounding
PROC Star

There is nothing specifically on that topic.  Create an output data set from your current PROC TABULATE.  Then delete the NW observations and look at the remaining data to become familiar with it.  Figure out what should go into a new PROC TABULATE to get the table that you want.  You will probably change all of the statistics to be "SUM" because you are just looking to print out the values that were inserted into the output data set by the first PROC TABULATE. 

sam_sas2
Obsidian | Level 7
Could you do an example on this? That would mean a lot!
ballardw
Super User

@sam_sas2 wrote:
Could you do an example on this? That would mean a lot!

You have to provide data and actual expected results.

 

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the <> icon or attached as text to show exactly what you have and that we can test code against.

 

The output data set from proc tabulate is not simple to work with and a "generic" small example is likely not to provide enough details for anything complex.

Add to the Proc tabulate statement that you are currently using this bit :  out=work.tabulate (or what ever you want to name it). Then look very closely at the the output. There may be many places that things have to be "deleted" and the code to display the result will be more work because there are now variables with statistics as part of the names.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1298 views
  • 2 likes
  • 4 in conversation