Subtracting columns to get difference

Accepted Solution Solved
Reply
Highlighted
New Contributor
Posts: 3
Accepted Solution

Subtracting columns to get difference

I am new to programming and I don't really understand Arrays. I am trying to calculate the day on day variance of the below dataset

 

CountrySegmentMar_30Apr_2Apr_3Apr_4
SingaporeCorp                 126,196                 126,196                 126,196                 126,196
malaysiaRet             70,521,201             70,562,347             70,455,005             70,556,299

 

so the expected output is to have the same number of records but have day on day variance i.e for malaysia the Apr2 balance would be Apr2 balance minus Mar 30 balance , something like this,

 

CountrySegmentApr_2Apr_3Apr_4
SingaporeCorp                          -                            -                            -  
malaysiaRet                   41,145                (107,342)                 101,294

 

This is the code I am tusing but not getting expected output.I am using base SAS 9.3

 

data check_op;
set check_op_e;
array nums[*] _numeric_;
array aday[*] _numeric_;
do i=2 to dim(nums);
aday(i)=nums(i)-nums(i-1);
end;
run;

 

I hope someone can help me on this problem. Thanks in advance.


Accepted Solutions
Solution
Thursday
Super User
Posts: 9,548

Re: Subtracting columns to get difference

Since you are overwriting member 2 in the first iteration, your calculation in the second iteration will be done with the new value, causing unexpected results.

Once again, an example why wide datasets are a PITA to work with:

data check_op_e;
input Country :$15. Segment $ (Mar_30 Apr_2 Apr_3 Apr_4) (:comma12.);
format Mar_30 Apr_2 Apr_3 Apr_4 comma12.;
cards;
Singapore Corp 126,196 126,196 126,196 126,196
malaysia Ret 70,521,201 70,562,347 70,455,005 70,556,299
;
run;

proc transpose data=check_op_e out=check;
by country segment;
var _numeric_;
run;

data want;
set check;
by country notsorted segment notsorted;
diff = col1 - lag(col1);
if first.segment then diff = .;
run;

proc transpose data=want out=want_wide (drop=_name_);
by country segment;
var diff;
id _name_;
run;

See how simple the subtraction step is?

 

Also note how I presented your example data in a form that makes it easy to use (datastep with datalines). Also see my footnotes.

 

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code

View solution in original post


All Replies
Super User
Super User
Posts: 9,193

Re: Subtracting columns to get difference

The problem is you have fallen into the common "Excel way of thinking" trap.  Having columns with dates, or any actual "data" is really not a good idea in any sense.  Your array is failing likely because the system does not recognise that Mar_30 occurs before Apr_02 - these date concepts are not logical and the computer only understands logic.  

Now let me point one thing out before you say you "can't change anything" in that what you do internally to program this, does need to be what is output!

So, I would normalise the data, so you would have something like:
COUNTRY  SEGMENT   PERIOD        RESULT

Singapore   Corp             30MAR2018  126,196

Singapore   Corp             02APR2018  126,196

Singapore   Corp             03APR2018  126,196

...

 

With this data structure it is very simple to get differences, there is even a function created for it:

data want;
  set have;
  by country;
  if first.segment then deff_res=.;
  else diff_res=dif(result);
run;

http://support.sas.com/documentation/cdl/en/etsug/66100/HTML/default/viewer.htm#etsug_tsdata_sect057...

 

There are many ways to do this with the structure I give, and if you need the output transposed, do this at the end when you output the data with a proc transpose, best of both worlds.

New Contributor
Posts: 3

Re: Subtracting columns to get difference

Thanks for your input. Yes, I was trapped into the excel way of thinking but now I know I need to ignore my instincts to follow excel methods.
Solution
Thursday
Super User
Posts: 9,548

Re: Subtracting columns to get difference

Since you are overwriting member 2 in the first iteration, your calculation in the second iteration will be done with the new value, causing unexpected results.

Once again, an example why wide datasets are a PITA to work with:

data check_op_e;
input Country :$15. Segment $ (Mar_30 Apr_2 Apr_3 Apr_4) (:comma12.);
format Mar_30 Apr_2 Apr_3 Apr_4 comma12.;
cards;
Singapore Corp 126,196 126,196 126,196 126,196
malaysia Ret 70,521,201 70,562,347 70,455,005 70,556,299
;
run;

proc transpose data=check_op_e out=check;
by country segment;
var _numeric_;
run;

data want;
set check;
by country notsorted segment notsorted;
diff = col1 - lag(col1);
if first.segment then diff = .;
run;

proc transpose data=want out=want_wide (drop=_name_);
by country segment;
var diff;
id _name_;
run;

See how simple the subtraction step is?

 

Also note how I presented your example data in a form that makes it easy to use (datastep with datalines). Also see my footnotes.

 

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
New Contributor
Posts: 3

Re: Subtracting columns to get difference

Posted in reply to KurtBremser

Thanks a lot , Kurt. This is exactly what I was looking for and now I see what I was doing wrong. Will try to avoid having wide data as much as possible. 

PROC Star
Posts: 1,289

Re: Subtracting columns to get difference

[ Edited ]

@Din4 Your array required a minor adjustment with a self imposed interleave, should you wanna have fun playing with it

 

data check_op_e;
input Country :$15. Segment $ (Mar_30 Apr_2 Apr_3 Apr_4) (:comma12.);
format Mar_30 Apr_2 Apr_3 Apr_4 comma12.;
cards;
Singapore Corp 126,196 126,196 126,196 126,196
malaysia Ret 70,521,201 70,562,347 70,455,005 70,556,299
;
run;

data check_op;
set check_op_e check_op_e;
by country Segment;
array nums[*] _numeric_;
do i=2 to dim(nums);
_k=lag(nums(i-1));
if last.country then nums(i)=nums(i)-_k;
end;
if last.country;
drop _k i;
run;

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 133 views
  • 0 likes
  • 4 in conversation