I want to subtract one record from the other where the years are the same without hard coding the years. I think I'm complicating it more than it has to be so hope someone has an easier solution.
What I have:
Data sample_data;
input Fiscal_Yr$ ValueX Category$ ;
datalines;
FY17 7 A
FY17 21 B
FY18 5 A
FY18 15 B
;
Run;
I need an end result that has the difference between category B minus A bsed on fiscal year with a new category C.
FY17 14 C
FY18 10 C
Any help is appreciated.
Two approaches: either use RETAIN to remember values between observations, do the math, and logic for output your WANT observation.
Or, use SQL and join the table with itself (use alias for a filtering sub-query where you select A to be joined with B observations).
select a.Fiscal_Yr, b.ValueX - a.ValueX as diff, 'C' as Category
from (select * from sampla_data where Category = 'A') as a
inner join (select * from sampla_data where Category = 'B') as b
on a.Fiscal_Yr = b.Fiscal_Yr
;
Two approaches: either use RETAIN to remember values between observations, do the math, and logic for output your WANT observation.
Or, use SQL and join the table with itself (use alias for a filtering sub-query where you select A to be joined with B observations).
select a.Fiscal_Yr, b.ValueX - a.ValueX as diff, 'C' as Category
from (select * from sampla_data where Category = 'A') as a
inner join (select * from sampla_data where Category = 'B') as b
on a.Fiscal_Yr = b.Fiscal_Yr
;
Thank you for the reply.
I've changed your first data step to reflect the variable names I think you intended otherwise this should give you what you want
Data sample_data;
input Fiscal_Yr $ ValueX Category $ ;
datalines;
FY17 7 A
FY17 21 B
FY18 5 A
FY18 15 B
;
Run;
proc sort data=sample_data;
by fiscal_yr category;
run;
data want(drop=pre_val);
set sample_data;
by fiscal_yr category;
pre_val=lag(valuex);
if last.fiscal_yr then do;
category="C";
valuex=valuex-pre_val;
output;
end;
run;
Thank you for the reply!
If the Fiscal_Yr are sets of two, you can skip by group processing:
Data sample_data;
input Fiscal_Yr$ ValueX Category$ ;
datalines;
FY17 7 A
FY17 21 B
FY18 5 A
FY18 15 B
;
Run;
data want(drop=t);
set sample_data;
t=lag(ValueX);
if mod(_n_,2)=0 then do;
ValueX=ValueX-t;
Category='C';
output;
end;
run;
That's true @novinosrin but I'd still be inclined to sort the file "just in case" because you know that if you don't then one day it'll be out of order 🙂
@ChrisBrooks Absolutely right, Can't avoid a sort. I took the sample as it was already sorted. I guess the only way to avoid a sort is proc sql or hashes. Linus gave the SQL solution, you gave the datastep and I had some fun. lol. And, I am not so keen to give a hash solution for this req. 🙂
Using self merge.
Data sample_data;
input Fiscal_Yr$ ValueX Category$;
datalines;
FY17 7 A
FY17 21 B
FY18 5 A
FY18 15 B
;
Run;
data tmp;
merge sample_data sample_data(firstobs=2 rename = (Valuex= _Valuex Category= _Category));
Valuex = _Valuex - Valuex;
Category = "C";
run;
data want (drop = _:);
do _n_ = 1 to n by 2;
set tmp point = _n_ nobs= n;
output;
end;
stop;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.