I am trying to calculate the percentage change between the values of first and last updated column.
For eg. For the first row, it should be (89-85)/85. Second row - (74-79)/79 and so on.
To solve it, i have transposed the data set and generate serial number against IDs and then pick variables based on the highest and smallest serial numbers to calculate % change. It seems i make it complex. There should be a simple way to calculate it.
data temp;
input ID x1-x4;
cards;
1 85 89 . .
2 79 75 74 .
3 80 82 86 85
;
run;
proc transpose data = temp out = temp2;
by ID;
var x1-x4;
run;
data temp3;
set temp2 (where = (COL1 ^= .));
retain Serial 0;
If first.ID then Serial = 1;
else Serial = Serial + 1;
by ID;
run;
proc sql noprint;
create table t2 as
select a.ID, a.COL1, b.COL1 as COL2, (a.COL1-b.COL1)/b.COL1 *100 as change
from temp3 a left join
(select ID, COL1 from temp3
group by ID
having serial = min(serial)) b
on a.ID = b.ID
group by a.ID
having serial = max(a.serial);
create table t3 as
select * from temp a left join t2 b
on a.ID = b.ID;
quit;
Or still a bit shorter:
data want;
set temp;
pct_chng=coalesce(x4,x3,x2,x1)/x1-1;
run;
However, your PROC SQL approach would produce different results for observations with, say, x1=. and x2>., namely the change from the first non-missing to the last non-missing value in (x1, x2, x3, x4). If this is what you want, you could define
pct_chng=coalesce(x4,x3,x2,x1)/coalesce(of x1-x4)-1;
Either way, of course, you should be sure that the denominator cannot be zero or add code to handle this case.
The simple change you mention in the text is this:
data temp;
input ID x1-x4;
pct_chng = (x2-x1)/x1;/* or multiply by 100 if you really need to but a Percent format may be more appropriate*/
cards;
1 85 89 . .
2 79 75 74 .
3 80 82 86 85
;
run;
I'm not sure, but were you looking to find the smallest of changes if using x2, x3 and x4?
I think this sort of simplification is entirely the right approach. But I think your problem is marginally more complex:
lastval=x2;
if x3 > . then lastval=x3;
if x4 > . then lastval=x4;
pct_change = (lastval - x1) / x1;
It's still a one-step process though.
Or still a bit shorter:
data want;
set temp;
pct_chng=coalesce(x4,x3,x2,x1)/x1-1;
run;
However, your PROC SQL approach would produce different results for observations with, say, x1=. and x2>., namely the change from the first non-missing to the last non-missing value in (x1, x2, x3, x4). If this is what you want, you could define
pct_chng=coalesce(x4,x3,x2,x1)/coalesce(of x1-x4)-1;
Either way, of course, you should be sure that the denominator cannot be zero or add code to handle this case.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.