I want to subtract one record from the other where the years are the same without hard coding the years. I think I'm complicating it more than it has to be so hope someone has an easier solution.
What I have:
Data sample_data;
input Fiscal_Yr$ ValueX Category$ ;
datalines;
FY17 7 A
FY17 21 B
FY18 5 A
FY18 15 B
;
Run;
I need an end result that has the difference between category B minus A bsed on fiscal year with a new category C.
FY17 14 C
FY18 10 C
Any help is appreciated.
Two approaches: either use RETAIN to remember values between observations, do the math, and logic for output your WANT observation.
Or, use SQL and join the table with itself (use alias for a filtering sub-query where you select A to be joined with B observations).
select a.Fiscal_Yr, b.ValueX - a.ValueX as diff, 'C' as Category
from (select * from sampla_data where Category = 'A') as a
inner join (select * from sampla_data where Category = 'B') as b
on a.Fiscal_Yr = b.Fiscal_Yr
;
Two approaches: either use RETAIN to remember values between observations, do the math, and logic for output your WANT observation.
Or, use SQL and join the table with itself (use alias for a filtering sub-query where you select A to be joined with B observations).
select a.Fiscal_Yr, b.ValueX - a.ValueX as diff, 'C' as Category
from (select * from sampla_data where Category = 'A') as a
inner join (select * from sampla_data where Category = 'B') as b
on a.Fiscal_Yr = b.Fiscal_Yr
;
Thank you for the reply.
I've changed your first data step to reflect the variable names I think you intended otherwise this should give you what you want
Data sample_data;
input Fiscal_Yr $ ValueX Category $ ;
datalines;
FY17 7 A
FY17 21 B
FY18 5 A
FY18 15 B
;
Run;
proc sort data=sample_data;
by fiscal_yr category;
run;
data want(drop=pre_val);
set sample_data;
by fiscal_yr category;
pre_val=lag(valuex);
if last.fiscal_yr then do;
category="C";
valuex=valuex-pre_val;
output;
end;
run;
Thank you for the reply!
If the Fiscal_Yr are sets of two, you can skip by group processing:
Data sample_data;
input Fiscal_Yr$ ValueX Category$ ;
datalines;
FY17 7 A
FY17 21 B
FY18 5 A
FY18 15 B
;
Run;
data want(drop=t);
set sample_data;
t=lag(ValueX);
if mod(_n_,2)=0 then do;
ValueX=ValueX-t;
Category='C';
output;
end;
run;
That's true @novinosrin but I'd still be inclined to sort the file "just in case" because you know that if you don't then one day it'll be out of order 🙂
@ChrisBrooks Absolutely right, Can't avoid a sort. I took the sample as it was already sorted. I guess the only way to avoid a sort is proc sql or hashes. Linus gave the SQL solution, you gave the datastep and I had some fun. lol. And, I am not so keen to give a hash solution for this req. 🙂
Using self merge.
Data sample_data;
input Fiscal_Yr$ ValueX Category$;
datalines;
FY17 7 A
FY17 21 B
FY18 5 A
FY18 15 B
;
Run;
data tmp;
merge sample_data sample_data(firstobs=2 rename = (Valuex= _Valuex Category= _Category));
Valuex = _Valuex - Valuex;
Category = "C";
run;
data want (drop = _:);
do _n_ = 1 to n by 2;
set tmp point = _n_ nobs= n;
output;
end;
stop;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.