BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
TimPrice
Fluorite | Level 6

 So the following comes from http://support.sas.com/kb/43/975.html.

 

What if we add data by year and I want to show MSRP and Invoice as rows and show by year as the column where a user can scan across to see how MSRP has changed across years or one could compute a col of differences  between years?

 

How would the following code change or would it be so overwhelming it would be easier to do with tabulate writing it out to a dataset?

 

Multiple Rows Summarizing Type Variable



Origin  Type              MSRP       Invoice
Asia    Hybrid         $59,760       $55,288
        SUV           $739,225      $672,912
        Sedan       $2,139,813    $1,954,101
        Sports        $552,681      $503,556
        Truck         $163,069      $150,412
        Wagon         $254,581      $234,875
Europe  SUV           $483,460      $442,913
        Sedan       $3,353,380    $3,096,312
        Sports      $1,655,970    $1,501,327
        Wagon         $454,215      $420,043
USA     SUV           $864,730      $781,696
        Sedan       $2,307,495    $2,125,714
        Sports        $407,315      $370,302
        Truck         $435,524      $392,390
        Wagon         $156,420      $144,451
TOTALS  Hybrid         $59,760       $55,288
        SUV         $2,087,415    $1,897,521
        Sedan       $7,800,688    $7,176,127
        Sports      $2,615,966    $2,375,185
        Truck         $598,593      $542,802
        Wagon         $865,216      $799,369

 

data cars1;
   set sashelp.cars;
   dummy1 = 1;
   dummy2 = 1;
   dummy3 = 1;
   dummy4 = 1;
   dummy5 = 1;
   dummy6 = 1;
run;

proc report data=cars1 nowd;
   column dummy1 dummy2 dummy3 dummy4 dummy5 dummy6
          origin type msrp invoice;
   define dummy1 / group noprint;
   define dummy2 / group noprint;
   define dummy3 / group noprint;
   define dummy4 / group noprint;
   define dummy5 / group noprint;
   define dummy6 / group noprint;
   define origin / group;
   define type / group;
   define msrp / format=dollar12.;
   define invoice / format=dollar12.;

   break after dummy1 / summarize;
   break after dummy2 / summarize;
   break after dummy3 / summarize;
   break after dummy4 / summarize;
   break after dummy5 / summarize;
   break after dummy6 / summarize;

  /* Calculate subtotals */
   compute invoice;  *Use the last variable on the COLUMN statement;
      if upcase(type) = "HYBRID" then do;
         msrptyp6 + msrp.sum;
         invotyp6 + invoice.sum;
      end;
      else if upcase(type) = "SUV" then do;
         msrptyp5 + msrp.sum;
         invotyp5 + invoice.sum;
      end;
      else if upcase(type) = "SEDAN" then do;
         msrptyp4 + msrp.sum;
         invotyp4 + invoice.sum;
      end;
      else if upcase(type) = "SPORTS" then do;
         msrptyp3 + msrp.sum;
         invotyp3 + invoice.sum;
      end;
      else if upcase(type) = "TRUCK" then do;
         msrptyp2 + msrp.sum;
         invotyp2 + invoice.sum;
      end;
      else if upcase(type) = "WAGON" then do;
         msrptyp1 + msrp.sum;
         invotyp1 + invoice.sum;
      end;
   endcomp;

  /* Reassign to subtotal values */
   compute after dummy6;
      origin = "TOTALS";
      type = "Hybrid";
      msrp.sum = msrptyp6;
      invoice.sum = invotyp6;
   endcomp;
   compute after dummy5;
      type = "SUV";
      msrp.sum = msrptyp5;
      invoice.sum = invotyp5;
   endcomp;
   compute after dummy4;
      type = "Sedan";
      msrp.sum = msrptyp4;
      invoice.sum = invotyp4;
   endcomp;
   compute after dummy3;
      type = "Sports";
      msrp.sum = msrptyp3;
      invoice.sum = invotyp3;
   endcomp;
   compute after dummy2;
      type = "Truck";
      msrp.sum = msrptyp2;
      invoice.sum = invotyp2;
   endcomp;
   compute after dummy1;
      type = "Wagon";
      msrp.sum = msrptyp1;
      invoice.sum = invotyp1;
   endcomp;
run;

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

If the difference could be as simple as the largest -smallest for a given group then the RANGE statistic is a likely candidate. But if actual order is important then I suspect preprocessing may be a better way to go in general. The headaches about trying to get the correct columns into a calculation in proc report when you want to use different columns if data is missing is not a trivial exercise.

 

A common issue being a "year to date" with monthly columns, similar to what I think you describe. Adjusting to get the column sums(means, ranges or what ever) shifting as the year goes on..

 

Depending on how you need to summarize the data I often find proc summary helpful as I can get multiple groups of summaries using the class statement and then watching the _type_ variable for which levels.

And often the report procedure will do a pretty good job of certain forms of transposition.

Good luck.

 

Not fan of spending the effort to place formulae into Excel cells as part of a solution because as soon as I do I know someone will ask for the output in PDF, or RTF, or make a chart to go along with a table in HTML, PDF or RTF output. Or use it to feed another procedure...

View solution in original post

8 REPLIES 8
ballardw
Super User

I don't think you are providing a very clear description of what you want.


@TimPrice wrote:

 So the following comes from http://support.sas.com/kb/43/975.html.

 

What if we add data by year and I want to show MSRP and Invoice as rows and show by year as the column where a user can scan across to see how MSRP has changed across years or one could compute a col of differences  between years?

 


You example should include something related to year in the data since that matches your description.

 

I suspect you might be looking for something like a YEAR variable used in an across role spanning other columns.

 

If you could make a small table that shows a desired appearance then we might be able to work toward it.

 

For example using the same data set (without all those dummy variables I am not quite sure why they were needed)

Proc report data=sashelp.cars;
   column origin type, msrp,(max min mean) ;
   define origin /group;
   define type/across;


run;

If the data set had a YEAR variable you could replace TYPE above with the YEAR and the result would have block of statistics for MSRP for each year.
 

 

TimPrice
Fluorite | Level 6

To be clear here is a partial sample of what the report might look like:

 

Multiple Rows Summarizing Type Variable

 
      year  
Origin Type   2017 2018 Difference
Asia Hybrid MSRP $59,760 $60,760 $1,000
    Invoice $55,288 $56,288 $1,000
  Sedan MSRP $2,139,813 $2,140,900 $1,087
    Invoice $1,954,101 $1,956,800 $2,699
ballardw
Super User

@TimPrice wrote:

To be clear here is a partial sample of what the report might look like:

 

Multiple Rows Summarizing Type Variable

 
      year  
Origin Type   2017 2018 Difference
Asia Hybrid MSRP $59,760 $60,760 $1,000
    Invoice $55,288 $56,288 $1,000
  Sedan MSRP $2,139,813 $2,140,900 $1,087
    Invoice $1,954,101 $1,956,800 $2,699

Since you are showing what would have to be a calculated column you are calling difference what would that actual contain if you have data from three years? Largest - smallest? Last year - First year? (which gets a bit problematic inside a report if all years are not present for all records).

 

For my preferences to get something similar to that appearance I would likely use proc tabulate instead of report.

Here is an example of creating a small data set simulating 3 years of data based on the SASHELP.CARS data set and then a summary. I am not claiming this to be a good simulation just something that generates 3 different values for each year.

Since you have not shown what the statistic requested might be I am following with the sum of msrp and invoice for the years and a range calculation for the period of data.

data cars2015;
   set sashelp.cars; (keep=origin type msrp invoice);
   where origin='Asia' and type in ('Sedan' 'SUV') and make in ('Acura' 'Kia');
   year=2015;
run;
data cars2016;
   set cars2015 ;
   msrp = msrp*(1+rand('uniform'));
   invoice = invoice*(1+rand('uniform'));
   year=2016;
run;
data cars2017;
   set cars2016 ;
   msrp = msrp*(1+rand('uniform'));
   invoice = invoice*(1+rand('uniform'));
   year=2017;
run;

data newcarset;
   set cars2015 cars2016 cars2017;
run;

proc tabulate data=newcarset format=dollar12.;
   class origin type year;
   var msrp invoice;
   table origin*type*(msrp invoice),
         year * sum=' ' All='Difference'*range=' '
   ;
run;
TimPrice
Fluorite | Level 6

Ballard,

First, thank you so much for taking the time to respond.   

 

I agree the straight forward solution would be to use tabulate.   And we could use whatever statistic, as we could always substitute in whatever we wanted.  I was using this data as a simple example, my exact application would be to show last month and current month or qtr over qtr change.  Then my ultimate goal was to use proc report to write a formula into the last cell involving the differences into an excel spreadsheet.  But the more I think about it, I'm thinking I might have to preprocess my data and transpose to use proc report.

 

ballardw
Super User

If the difference could be as simple as the largest -smallest for a given group then the RANGE statistic is a likely candidate. But if actual order is important then I suspect preprocessing may be a better way to go in general. The headaches about trying to get the correct columns into a calculation in proc report when you want to use different columns if data is missing is not a trivial exercise.

 

A common issue being a "year to date" with monthly columns, similar to what I think you describe. Adjusting to get the column sums(means, ranges or what ever) shifting as the year goes on..

 

Depending on how you need to summarize the data I often find proc summary helpful as I can get multiple groups of summaries using the class statement and then watching the _type_ variable for which levels.

And often the report procedure will do a pretty good job of certain forms of transposition.

Good luck.

 

Not fan of spending the effort to place formulae into Excel cells as part of a solution because as soon as I do I know someone will ask for the output in PDF, or RTF, or make a chart to go along with a table in HTML, PDF or RTF output. Or use it to feed another procedure...

Cynthia_sas
SAS Super FREQ

Hi:

  Speaking up for PROC REPORT, here. With a different structure to the data, PROC REPORT can do that same/similar report as TABULATE using YEAR as an ACROSS item. The data does have to be structured differently. PROC REPORT uses the "NEW" data in this program and TABULATE uses the "CARS" data in this program:


data new2015(keep=origin type make year varname varvalue)
     cars2015;
   set sashelp.cars(keep=origin type make msrp invoice);
   where origin='Asia' and 
         type in ('Sedan' 'SUV') and 
         Make in ('Acura' 'Kia');
   year=2015;
   output cars2015;
   varname = 'MSRP   ';
   varvalue = msrp;
   output new2015;
   varname = 'Invoice';
   varvalue = invoice;
   output new2015;
run;
data new2016(keep=origin type make year varname varvalue)
     cars2016;
   set cars2015 ;
   msrp = msrp*(1+rand('uniform'));
   invoice = invoice*(1+rand('uniform'));
   year=2016;
   output cars2016;
   varname = 'MSRP   ';
   varvalue = msrp;
   output new2016;
   varname = 'Invoice';
   varvalue = invoice;
   output new2016;
run;
data new2017(keep=origin type make year varname varvalue)
     cars2017;
   set cars2016 ;
   msrp = msrp*(1+rand('uniform'));
   invoice = invoice*(1+rand('uniform'));
   year=2017;
   output cars2017;
   varname = 'MSRP   ';
   varvalue = msrp;
   output new2017;
   varname = 'Invoice';
   varvalue = invoice;
   output new2017;
run;

data newcarset;
   set cars2015 cars2016 cars2017;
run;

data altcarset;
   set new2015 new2016 new2017;
run;

Then, here are the results from both TABULATE and REPORT (I added some more statistics to REPORT):

report_like_tab.png

and here's the REPORT code using the newly structured data (TABULATE code is the same as already posted):

proc report data=altcarset spanrows;
  title 'REPORT';
  column origin type varname varvalue,year 
         varvalue=vdiff varvalue=vmean varvalue=vmin varvalue=vmax;
  define origin / group style(column)=Header;
  define type / group style(column)=Header;
  define varname / group ' ' order=data style(column)=Header;
  define year / across 'Year';
  define varvalue/ sum f=dollar12. ' ';
  define vdiff / 'Difference' range f=dollar12.;
  define vmean / mean 'Average' f=dollar12.;
  define vmin / min 'Min' f=dollar12.;
  define vmax / max 'Max' f=dollar12.;
run;

Depends on the structure of the data and whether you need any of the other PROC REPORT features.

 

Cynthia

 

 

TimPrice
Fluorite | Level 6

Cynthia,

 

I have followed you on several other posts over the last year or so and have told others about your solutions.  Thank you for coming to the rescue!  I like this solution and I think this is what I will use as my template.

TimPrice
Fluorite | Level 6
And I accidently accepted the wrong solution. Thank you again Cynthia!!!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1525 views
  • 0 likes
  • 3 in conversation