BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
GregG
Quartz | Level 8

I'm trying to output the growth percentage from one year to the next.  For ease of showing, I've condensed the data to this:

data new;

  infile datalines dlm=',' dsd;

  input year $ gr_count ug_count @@;

datalines;

"2008",5045,7420

"2009",5500,7837

"2010",5696,8481

"2011",5708,9010

"2012",5725,9443

;

run;


What I would like the report to look like (as close as possible) is this:


----     9.02     3.56     0.21     0.30     3.27

2008     2009     2010     2011     2012     avg

(gr_count percent change from previous year and overall average of all years)


( change = ( (2009-gr_count-# - 2008-gr_count-#) / (2008-gr_count-#) )* 100 )

( 9.02 = ( (5500-5045)/5045 ) * 100 ) rounded to two decimal places )

( avg = percentages summed and divided by number of percentages )


The actual data has year and class, and has 12,465 total records for year 2008 (with 5,045 where class = 'gr' and 7,420 where class = 'ug', 13,337 total records for 2009, etc., and there are other fields that for which I will need to calculate the growth (in different outputs).


I'd love to program SAS to do this calculation for me, but I've been working with SAS for about 2 months now, and I've been teaching myself as I go - so I have never encountered anything like this and I can't find a resource that will tell me how to do exactly this.


Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
damanaulakh88
Obsidian | Level 7

Hi Greg,

Please find below the code you require :-

================================================================

data new;

infile datalines dlm=',' dsd;

input year $ gr_count ug_count;

datalines;

"2008",5045,7420

"2009",5500,7837

"2010",5696,8481

"2011",5708,9010

"2012",5725,9443

;

run;

data percent;

set new;

a=lag(gr_count);

change=((gr_count-a)/a)*100;

mean=mean(change);

run;

proc transpose data=percent out=new prefix=Year;

id year;

var change;

run;

data final(drop= _NAME_);

set new;

Average=mean(of Year2008-Year2012);

run;

=======================================================================

Output:-

=============================================================================

                             Obs    Year2008    Year2009    Year2010    Year2011    Year2012    Average

                              1         .        9.01883     3.56364     0.21067     0.29783    3.27274

==============================================================================

/Daman

View solution in original post

6 REPLIES 6
damanaulakh88
Obsidian | Level 7

Hi Greg,

Please find below the code you require :-

================================================================

data new;

infile datalines dlm=',' dsd;

input year $ gr_count ug_count;

datalines;

"2008",5045,7420

"2009",5500,7837

"2010",5696,8481

"2011",5708,9010

"2012",5725,9443

;

run;

data percent;

set new;

a=lag(gr_count);

change=((gr_count-a)/a)*100;

mean=mean(change);

run;

proc transpose data=percent out=new prefix=Year;

id year;

var change;

run;

data final(drop= _NAME_);

set new;

Average=mean(of Year2008-Year2012);

run;

=======================================================================

Output:-

=============================================================================

                             Obs    Year2008    Year2009    Year2010    Year2011    Year2012    Average

                              1         .        9.01883     3.56364     0.21067     0.29783    3.27274

==============================================================================

/Daman

Cynthia_sas
SAS Super FREQ

Hi:

  As an alternative, that does not need the final PROC TRANSPOSE or the last DATA step, you could use either PROC REPORT or PROC TABULATE directly on the PERCENT dataset. The hardest part is doing the LAG and once you have that, there's no need to transpose in a second pass through the data or take the average in a third pass through the data. See attached screenshot produced with the code below. Notice that in my data, I do not get the MEAN variable in the PERCENT dataset, because both REPORT and TABULATE can calculate the MEAN statistic. PROC REPORT does need a "row" variable when you use ACROSS, which I called ROWVAR. If you don't want to see it, you could always put NOPRINT as an option on the DEFINE statement. But, TABULATE does not need a row item. It can make a table entirely with COLUMNS.

  Also, in my PERCENT data, I make an "ALT_CHANGE" variable that is not multiplied by 100. If you want to use the SAS PERCENT format for display of a "%", it will do an automatic multiply by 100.

cynthia

data new;

infile datalines dlm=',' dsd;

input year $ gr_count ug_count;

datalines;
"2008",5045,7420
"2009",5500,7837
"2010",5696,8481
"2011",5708,9010
"2012",5725,9443
;
run;


data percent;
set new;
a=lag(gr_count);
change=((gr_count-a)/a)*100;
alt_change=((gr_count-a)/a);
rowvar = 'Percent Change';
put _all_;
run;

     

options missing = '-';
ods listing close;
ods html file='use proc report.html';

    
proc report data=percent nowd;
title '1) Using PROC REPORT instead of TRANSPOSE';
  column rowvar change,year change=cmean;
  define rowvar / group 'Type';
  define year / across;
  define change / mean ' ' f=9.2;
  define cmean / mean 'Overall Average' f=9.2;
run;
         
proc report data=percent nowd;
title '2) Getting % with PROC REPORT and NO TRANSPOSE';
  column rowvar alt_change,year alt_change=cmean;
  define rowvar / group 'Type';
  define year / across;
  define alt_change / mean ' ' f=percent9.2;
  define cmean / mean 'Overall Average' f=percent9.2;
run;
      
proc tabulate data=percent f=9.2;
  title '3 Using PROC TABULATE and NO TRANSPOSE';
  class year;
  var change;
  table year*change=' '*mean=' ' all*change=' '*mean=' ';
run;
 
ods html close;
 
options missing='.';

GregG
Quartz | Level 8

Thank you both so much!

I ended up using a little from both - the key was the lag function!  Once I was aware of that, it was MUCH easier.

In my original "code sample", I included the ( * 100 ) bit, but I took that out and opted for the PERCENT8.2 format instead.

Thank you again!

art297
Opal | Level 21

Greg,

You already have two useful suggestions but, since you are learning SAS, I decided to throw in one more that does everything in one simple datastep.  Is there any reason why you enter year as a character variable?  In the following, I enter the years as numbers:

data new (keep=year2008-year2012 Avg);

  infile datalines dlm=',' dsd eof=last;

  array gr_array(2008:2012);

  array ug_array(2008:2012);

  array year_array(2008:2012) year2008-year2012;

  format year2008-year2012 Avg percent8.2;

  retain gr_array ug_array;

  input year gr_count ug_count;

  gr_array(year)=gr_count;

  ug_array(year)=ug_count;

  year_array(year)=year;

  return;

  last:

    do year=2009 to 2012;

      year_array(year)=gr_array(year)/gr_array(year-1)-1;

    end;

    Avg=mean(of Year2008-Year2012);

    output;

  datalines;

2008,5045,7420

2009,5500,7837

2010,5696,8481

2011,5708,9010

2012,5725,9443

;

run;

art297
Opal | Level 21

Greg,

Also, if you like the approach I suggested, you could wrap it in a macro and build all of your output files in one step.  e.g.:

data have;

  infile datalines dlm=',' dsd;

  input year gr_count ug_count;

  datalines;

2008,5045,7420

2009,5500,7837

2010,5696,8481

2011,5708,9010

2012,5725,9443

;

run;

%macro dothemall;

  proc sql noprint;

    select name

      into:names separated by " "

        from dictionary.columns

          where libname="WORK" and

                memname="HAVE" and

                lowcase(name) ne "year"

    ;

  quit;

  %let i=1;

  %do %while (%scan(&names,&i.," ") ne );

    data %scan(&names,&i.," ") (keep=year2008-year2012 Avg);

      set have (keep=year %scan(&names,&i.," ")) end=last;

      array var_array(2008:2012);

      array year_array(2008:2012) year2008-year2012;

      format year2008-year2012 Avg percent8.2;

      retain var_array;

      var_array(year)=%scan(&names,&i.," ");

      year_array(year)=year;

      if last then do;

        do year=2009 to 2012;

          year_array(year)=var_array(year)/var_array(year-1)-1;

        end;

     Avg=mean(of Year2008-Year2012);

        output;

      end;

    run;

    %let i=%eval(&i.+1);

  %end;

%mend dothemall;

%dothemall

GregG
Quartz | Level 8

Arthur,

Thank you so much for your additional methods.  It may take me a while to fully unpack and adapt your method, but I definitely love learning different approaches.

I am using year as a string because I am using it as the vbar in a gchart. When I tried this with numerical year, it would create classes (2002.5 2007.5, etc), and I need things reported by instances per year.

There may be another way, but I was handed this as a kind of last minute thing, so much of what I did was "brute force" and not particularly elegant, even by the standards of what little knowledge I have.

I have completed the report, which used nearly 400,000 records split across 11 years and was reported differently on 27 pages, each with a gchart and a table that reported percentage growth of different variables per year.  Now that it is done, I plan on exploring various method and seeing if I can tighten it up a bit.  I've used macros in the past before, but due to the rushed nature of this, and how I (at least initially) felt that each page was different enough to make the macro not my best option.

The lag function, of which I was not previously aware, made a huge difference - but I also like elegant solutions when I understand them.

Thank you all again!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 5774 views
  • 6 likes
  • 4 in conversation