BookmarkSubscribeRSS Feed
wcpatton
Calcite | Level 5

Suppose my dataset is about fruits, it has two values for fruit: Apples and Oranges.  Being so different, I've decided that I want to measure Oranges by Weight and Apples by Height.  My data would then have 3 columns, plus a 4th for vendor.  I'm looking for a way to present this data in a proc tabulate side by side.  Something like:

OrangesApples
VendorWeightHeight
Jim156
Sally142

Any thoughts about the possibility of changing the vars used based on the class value?

8 REPLIES 8
mkeintz
PROC Star

Basically you have to create a new analysis variable.  The below is untested, but I think does what you want:


data vtemp / view=vtemp;

  set have;

  if fruit='oranges' then measure=weight;

  else if fruit='apples' then measure=height;

run;

** And let's include the measure description in the fruit value **;

proc format ;

value $frt

    'oranges' = 'Oranges - Weight'

    'apples'='Apples - Height';

run;

proc tabulate data=vtemp;

  class fruit vendor;  var measure;

  tables vendor,

     fruit * mean='Mean' * measure= ' 'l

  format fruit $frt. ;

run;

Of course you could also make a new class variable to identify the measure being used, and forget the format in the above example.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
wcpatton
Calcite | Level 5

I see what you're going for there, it works pretty well for the 2x3 table... In actuality, my data is more like 12x12, so I have to figure out if it's worth coding all those and hope it never changes.  I'm looking into doing two tabulates, outputting the data as a dataset then merging those back together.

mkeintz
PROC Star

Then generalize using macrovars, as in:

  %let fruitlist= oranges apples;

  %let vars    = weight  height;

data vtemp / view=vtemp;

  array meas {*} &vars;

  W=  findw("&fruitlist",trim(fruit),' ','E'); 

  measure=  meas{W};

  descrip=scan("&vars",W);

run;

proc tabulate data=vtemp;

  class vendor fruit descrip;

  var measure;

  tables vendor,

      fruit=' ' * descrip *   mean*measure=' ';

run;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Astounding
PROC Star

All good ideas.  Here are a few minor ones to consider ...

Unless there are many different measures, it might be easier to construct macro variables along these lines:

%let height_list = apples bananas pears;

%let weight_list = oranges blueberries;

In practice, it will be necessary to pay attention to capitalization.

If it becomes necessary to control the order of the columns, that can be done but might be more involved.  You would need a macro variable to indicate the proper order, and might start using two formats.  Something like:

value $order 'oranges'='01'

                'apples'='02';

value $fruit '01'='oranges'

                '02'='apples';

Both could be constructed from that new macro variable that indicates the order of the columns.  Other than specifying the macro variables, no maintenance would be necessary.

Good luck.

mkeintz
PROC Star

I think, at this point, it's mostly a matter of style.  But if you stick with the notion of putting all the fruits in FRUITLIST, then just by listing them in desired order, there will be no order problem to solve.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Cynthia_sas
SAS Super FREQ

Sorry, maybe I'm dense, but I don't see 4 columns. I see one column for Oranges, one column for Apples and a 3rd column for vendor. Where is the 4th column???

Also, do you ALWAYS want the MEAN as the statistic for your measure (height or weight) -- or could you want the MAX for one value of FRUIT and MIN for another value of FRUIT and MEAN for a different value of FRUIT?

cynthia

wcpatton
Calcite | Level 5

Cynthia,

The table has three columns, but the data has 4 columns (vendor, fruittype, height, weight).  This is all imaginary data to simplify an bigger, more complicated data structure.  The measures could certainly change, I have some variables that I need to sum, others are mean, etc.

mkeintz
PROC Star

If you want different stats for different fruits, then you might be better off runing proc means to generate all the stats for all the variables, CLASS-ified by fruit and vendor.  Then you can select the combination of fruits, measures, and stats wanted.  PROC TABULATE then simply become a display tool, since each cell comes from a single var in a single observations in the proc means output.  That's why the example below can use the SUM= expression regardless of which stat is being reported.

Below is a sample using sashelp.class (use sex for fruit, and agegroup for vendor).  I also set it up so that you can get multiple measures and multiple stats for a single sex (fruit in the op example).  Fill the macrovars in the order you want the columns filled.

Of course, by the time you've run the PROC MEANS, there might be better ways to display results than proc tabulate, but since that's how we started ...


Let's say you want, in this order, mean height for women, min weight for men, then max weight for men, by agegroup:

%let sexlist=  F      M      M;
%let vars=     height weight weight;
%let stats =   MEAN   MIN    MAX;

data have;
  set sashelp.class;
  agegroup=1+(age>14);
  output;   output;   output;
run;

proc means data=have noprint nway &stats;
  class sex agegroup;
  var &vars;
  output out=need;
run;

data vtemp (KEEP=SEX AGEGROUP MEASURE DESCRIP W)/ view=vtemp;
  set need;
  array meas {*} &vars;
  do W=1 to dim(meas);
    sexvalue=scan("&sexlist",w);
if sex=sexvalue then do;
   S=  scan("&stats",W);
      if _stat_=S then do;
     measure=meas{W};
  descrip=catx('-',scan("&vars",W),_stat_);
  output;
   end;
end;
end;
run;

proc sort data=vtemp out=temp;
  by w;
run;

proc tabulate data=temp noseps  order=data ;
  class agegroup sex descrip;
  var measure;
  tables agegroup,
     sex * descrip=' ' * sum=' '*measure=' ';
run;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1148 views
  • 9 likes
  • 4 in conversation