Solved: Re: Subtracting from different rows meeting certain criteria

CLE · Posted 09-15-2017 09:21 AM

I want to subtract one record from the other where the years are the same without hard coding the years. I think I'm complicating it more than it has to be so hope someone has an easier solution.

What I have:

Data sample_data;

input Fiscal_Yr$ ValueX Category$ ;

datalines;

FY17 7 A

FY17 21 B

FY18 5 A

FY18 15 B

;

Run;

I need an end result that has the difference between category B minus A bsed on fiscal year with a new category C.

FY17 14 C

FY18 10 C

Any help is appreciated.

LinusH · Posted 09-15-2017 09:42 AM

Two approaches: either use RETAIN to remember values between observations, do the math, and logic for output your WANT observation.

Or, use SQL and join the table with itself (use alias for a filtering sub-query where you select A to be joined with B observations).

select a.Fiscal_Yr, b.ValueX - a.ValueX as diff, 'C' as Category
from (select * from sampla_data where Category = 'A') as a
inner join (select * from sampla_data where Category = 'B') as b
on a.Fiscal_Yr = b.Fiscal_Yr
;

Data never sleeps

View solution in original post

LinusH · Posted 09-15-2017 09:42 AM

Two approaches: either use RETAIN to remember values between observations, do the math, and logic for output your WANT observation.

Or, use SQL and join the table with itself (use alias for a filtering sub-query where you select A to be joined with B observations).

select a.Fiscal_Yr, b.ValueX - a.ValueX as diff, 'C' as Category
from (select * from sampla_data where Category = 'A') as a
inner join (select * from sampla_data where Category = 'B') as b
on a.Fiscal_Yr = b.Fiscal_Yr
;

Data never sleeps

CLE · Posted 09-15-2017 10:15 AM

Thank you for the reply.

ChrisBrooks · Posted 09-15-2017 09:55 AM

I've changed your first data step to reflect the variable names I think you intended otherwise this should give you what you want

Data sample_data;
input Fiscal_Yr $ ValueX Category $ ;
datalines;
FY17  7 A
FY17 21 B
FY18 5 A
FY18 15 B
;
Run;

proc sort data=sample_data;
	by fiscal_yr category;
run;

data want(drop=pre_val);
	set sample_data;
	by fiscal_yr category;
	pre_val=lag(valuex);
	if last.fiscal_yr then do;
		category="C";
		valuex=valuex-pre_val;
		output;
	end;
run;

CLE · Posted 09-15-2017 10:17 AM

Thank you for the reply!

novinosrin · Posted 09-15-2017 01:57 PM

If the Fiscal_Yr are sets of two, you can skip by group processing:

Data sample_data;

input Fiscal_Yr$ ValueX Category$ ;

datalines;

FY17 7 A

FY17 21 B

FY18 5 A

FY18 15 B

;

Run;

data want(drop=t);

set sample_data;

t=lag(ValueX);

if mod(_n_,2)=0 then do;

ValueX=ValueX-t;

Category='C';

output;

end;

run;

ChrisBrooks · Posted 09-15-2017 02:18 PM

That's true @novinosrin but I'd still be inclined to sort the file "just in case" because you know that if you don't then one day it'll be out of order 🙂

novinosrin · Posted 09-15-2017 02:22 PM

@ChrisBrooks Absolutely right, Can't avoid a sort. I took the sample as it was already sorted. I guess the only way to avoid a sort is proc sql or hashes. Linus gave the SQL solution, you gave the datastep and I had some fun. lol. And, I am not so keen to give a hash solution for this req. 🙂

SAS_inquisitive · Posted 09-15-2017 02:45 PM

Using self merge.

Data sample_data;
	input Fiscal_Yr$ ValueX Category$;
	datalines;
FY17  7 A
FY17 21 B
FY18 5 A
FY18 15 B
;
Run;

data tmp;
	merge sample_data sample_data(firstobs=2 rename = (Valuex= _Valuex Category= _Category));
	Valuex = _Valuex - Valuex;
	Category = "C";
run;

data want (drop = _:);
	do _n_ = 1 to n by 2;
		set tmp point = _n_ nobs= n;
		output;
	end;

	stop;
run;

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away