Compare and calculate numeric values based on character variables

Reply
Occasional Contributor
Posts: 16

Compare and calculate numeric values based on character variables

Hi,

I have the following dataset

Date       Company      Sales

Jan05     Coca-Cola      256    

Jan05     Lowes            70

Jan05     Apple             90    

Feb05     McDonald      456

Feb05     Apple            89

Feb05     Microsoft      876

March05

April05

.

.

.

Jan06     Apple            678

Jan06     Lowes           87

Jan06     Intel              56 

Feb06    McDonald      67

Feb06    Apple            281

Feb06    Barclays       876

March06

April06

.

.

.

Jan07    Apple              567

Jan07    Lowes             76

Jan07    Intel                87

Feb07    TJmax           678

Feb07    McDonald      765

Feb07   Astraz            67

March07   

April06

.

.

.

May15

As you can see, the data set starts from January 2005 and continues all the way up to May 2015. Each month of the year has hundreds of Companies with different Sales number

I am trying to calculate a 12 month sales growth rate for each Company that are obviously 12 months apart. What is the best way to do this considering the following issues:

I obviously can only calculate growth rate for companies that are present 12 months apart. For example to calculate growth rate for Jan05 to Jan06 I can only calculate growth rate for Lowes and Apple. For Jan06 to Jan07 I can calculate growth rate for all three companies (Apple, Intel and Lowes) since they are available on both dates.  How can I accomplish such a task considering that there are hundreds of companies in each for each month and year and it's impossible to mention each company individually.

Grand Advisor
Posts: 10,210

Re: Compare and calculate numeric values based on character variables

First thing is to have a SAS date valued variable so you can do time interval determinations:

data temp;

     set have;

    newdate= input(catt(substr(date,1,3), substr(date,(anydigit(date)))),monyy5.);

     format newdate monyy5.;

run;

proc sql;

     create table want as

     select a.*, b.newdate as basedate, b.sales as basesales

     from temp as a join temp.b

          on a.company = b.company and a.date = intnx('month',b.newdate,12);

quit;

/* you should be able to calculate your rate with a.sales (the later value) and b.sales (earlier0

but I don't know what calculation you may want.

Occasional Contributor
Posts: 16

Re: Compare and calculate numeric values based on character variables

I actually wanted to calculate a 12 month median growth rate. Can you tell me what does code exactly does, specially since I am not that familiar with proc sql.  Thanks!

Grand Advisor
Posts: 10,210

Re: Compare and calculate numeric values based on character variables

You will need to define what a median 12 month growth rate would be. I don't know what that is.

Since you said you needed to compare 2 values that looked to be 12 months apart I tried to get the 2 values together to allow that.

Are you talking median within company? Median across companies within some specific interval such as all companies with values for Jan05 and Jan06? Something else?

Overall change in sales would be the later minus the earlier value. I suspect that with more information we can likely get something out of Proc Means.


Occasional Contributor
Posts: 16

Re: Compare and calculate numeric values based on character variables

A 12 month median growth rate for each company 12 months apart. So for example, the median growth rate for Jan06 for Apple will be (assuming the numbers I have are median numbers is:

(Jan06 Sales Number/Jan05 Sales Number)-1 *100. So it will be (678/90)-1 *100. Note how I cannot calculate a growth rate for Coca-Cola cause it doesn't exists in both data sets that are 12 months apart.

This should then continue for Feb, March, etc etc.

Grand Advisor
Posts: 10,210

Re: Compare and calculate numeric values based on character variables

The example code below maintains the dates and sales numbers for verification, remove from the SELECT cause as needed.

proc sql;

     create table want as

     select a.*, b.newdate as basedate, b.sales as basesales, ( (a.sales/ b.sales) -1 *100) as GrowthRate

     from temp as a join temp.b

          on a.company = b.company and a.date = intnx('month',b.newdate,12);

quit;

I have no understanding how that is considered a median anything as median is usually the value that have one-half the values less than and one-half greater. Usually if I have values 1 2 3 5 198, then 3 would be the median value.

Ask a Question
Discussion stats
  • 5 replies
  • 369 views
  • 1 like
  • 2 in conversation