DATA Step, Macro, Functions and more

Compare observations and extract deltas

Frequent Contributor
Posts: 127


Dear experts,

Here is what I have:

data have;
source="y_2016"; segment="a"; revenue=100; output;
source="y_2016"; segment="b"; revenue=50; output;
source="y_2015"; segment="a"; revenue=90; output;
source="y_2015"; segment="b"; revenue=30; output;
run;

And what I want:

data want;
source="y_2016"; segment="a"; revenue=100; output;
source="y_2016"; segment="b"; revenue=50; output;
source="y_2015"; segment="a"; revenue=90; output;
source="y_2015"; segment="b"; revenue=30; output;
source="delta"; segment="a"; revenue=10; output;
source="delta"; segment="b"; revenue=20; output;
source="delta%"; segment="a"; revenue=10/90; output;
source="delta%"; segment="b"; revenue=20/30; output;
run;

 

How can I specify the calculation I want?

Super User
Posts: 5,081

Re: Compare observations and extract deltas

You would need to sort the data first to get those results (though in a different order). For example:

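proc sort data=have;
   by segment source;
run;

data want;
   set have;
   by segment;
   retain starting_value;
   drop starting_value;   /* keep the helper variable out of WANT */
   /* After the sort, the y_2015 row comes first within each segment */
   if first.segment then starting_value = revenue;
   output;
   /* Continue only on the last row of a segment that has more than one row */
   if last.segment and (first.segment=0);
   source = 'delta';
   revenue = revenue - starting_value;
   output;
   /* revenue now holds the delta, so this is delta / starting value */
   source = 'delta%';
   revenue = revenue / starting_value;
   output;
run;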
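
As a quick check, assuming the DATA step above ran against the four-row HAVE from the question, printing WANT should show the rows grouped by segment rather than by source. This is only a sketch: the VAR list is my choice, and the expected values in the comment were worked out by hand from the sample data, not copied from a log.

```sas
proc print data=want noobs;
   var segment source revenue;
run;

/* Rows expected for the sample data, in this order:
   a  y_2015   90
   a  y_2016  100
   a  delta    10
   a  delta%   10/90 (about 0.1111)
   b  y_2015   30
   b  y_2016   50
   b  delta    20
   b  delta%   20/30 (about 0.6667)
*/
```

Note that source inherits its length of 6 from HAVE, so 'delta%' just fits. A longer label would be truncated unless a LENGTH statement for source comes before the SET statement.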

Frequent Contributor
Posts: 127

Re: Compare observations and extract deltas

Dear @Astounding,

Thanks a lot. It fits the case I provided, but my real case is a bit more complicated: I have not just revenue but several variables, and I cannot write out the calculation for each of them. Is there any way to automate it? Thanks again, SH.

Super User
Posts: 5,081

Re: Compare observations and extract deltas

At this point, there's too much that I would need to guess.  You'll need to show what the inputs and outputs look like when you have an additional variable.

Frequent Contributor
Posts: 127

Re: Compare observations and extract deltas


Well, let's assume the input is the following, with revenue variables running from revenue1 to revenue100.

data have;
source="y_2016"; segment="a"; revenue1=100; revenue2=70; output;
source="y_2016"; segment="b"; revenue1=50; revenue2=60; output;
source="y_2015"; segment="a"; revenue1=90; revenue2=70; output;
source="y_2015"; segment="b"; revenue1=30; revenue2=700; output;
run;

 

The output should be built the same way, repeating the calculation for each variable. Basically, I am trying to calculate the differences between two tables from different years (2015 and 2016) that are aggregated at the same granularity (all the variables by segment).

Super User
Posts: 5,081

Re: Compare observations and extract deltas

For that sort of problem, arrays are the usual tool for handling many variables. Here is one approach:

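proc sort data=have;
   by segment source;
run;

data want;
   set have;
   by segment;
   array revs {100} revenue1-revenue100;
   array starts {100} starting_value1-starting_value100;
   retain starting_value1-starting_value100;
   drop starting_value1-starting_value100;   /* keep the helpers out of WANT */
   /* Save the first (y_2015) row of each segment as the baseline.
      _n_ is reused as the loop counter, so no extra variable lands in WANT. */
   if first.segment then do _n_=1 to 100;
      starts{_n_} = revs{_n_};
   end;
   output;
   /* Continue only on the last row of a segment that has more than one row */
   if last.segment and (first.segment=0);
   source='delta';
   do _n_=1 to 100;
      revs{_n_} = revs{_n_} - starts{_n_};
   end;
   output;
   /* revs now holds the deltas, so this is delta / baseline */
   source='delta%';
   do _n_=1 to 100;
      revs{_n_} = revs{_n_} / starts{_n_};
   end;
   output;
run;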

 

As you add variables, there can be silly things that happen with the data that you might want to account for.  For example, can your % be negative instead of positive?  Can the denominator be zero?  Can there be missing values in your data?  In any case, arrays let you process many variables in the same fashion without adding too much to the code.  The only burdensome part might be adding all the names of the variables to the ARRAY statement.
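
To make the last two points concrete, here is a hedged variation on the step above, not a replacement for it. It assumes every variable to compare has a name starting with "revenue", so the name prefix list `revenue:` can stand in for the explicit names. It also assumes there are no more than 1000 such variables (an arbitrary bound chosen for this sketch) and that a missing or zero baseline should yield a missing percentage. The counter `i` and the size of `starts` are illustrative choices. If the variables share no prefix, `_numeric_` could be used in place of `revenue:`, at the cost of also differencing any numeric ID columns.

```sas
proc sort data=have;
   by segment source;
run;

data want;
   set have;
   by segment;
   array revs {*} revenue:;            /* every variable whose name starts with "revenue" */
   array starts {1000} _temporary_;    /* assumed upper bound; retained automatically, never written out */
   drop i;

   /* Guard: stop if the assumed bound is too small */
   if dim(revs) > dim(starts) then do;
      put "ERROR: enlarge the STARTS array.";
      stop;
   end;

   /* Save the first row of each segment as the baseline */
   if first.segment then do i = 1 to dim(revs);
      starts{i} = revs{i};
   end;
   output;

   /* Continue only on the last row of a segment that has more than one row */
   if last.segment and (first.segment = 0);

   source = 'delta';
   do i = 1 to dim(revs);
      revs{i} = revs{i} - starts{i};   /* missing if either value is missing */
   end;
   output;

   source = 'delta%';
   do i = 1 to dim(revs);
      /* divide only when the baseline is present and non-zero */
      if not missing(starts{i}) and starts{i} ne 0 then revs{i} = revs{i} / starts{i};
      else revs{i} = .;
   end;
   output;
run;
```

Because the `starts` array is temporary, no RETAIN or DROP statement is needed for it, and `dim(revs)` keeps the loops in step with however many revenue variables HAVE actually contains.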
