I have a dataset that looks like that following. I am trying to find the best way to calculate the increase across days (DIFDay2-Day1, DIFDay3-Day2, DIFDay4-Day3) My data going forward will have more days included so I'm hoping I don't have to list out each day. Would an array be best here?
Group | Day1 | Day2 | Day3 | Day4 | DIFDay2 - Day1 | DIFDay3 - Day2 | DIFDay4 - Day3 |
G1 | 2639 | 2639 | 2645 | 2671 | |||
G2 | 3079 | 3503 | 3554 | 3620 | |||
G3 | 285 | 291 | 358 | 379 | |||
G4 | 9434 | 9510 | 9570 | 9690 | |||
G5 | 13182 | 13525 | 13548 | 13902 |
Here's a good start:
data have;
input Group:$2 Day1 Day2 Day3 Day4;
datalines;
G1 2639 2639 2645 2671
G2 3079 3503 3554 3620
G3 285 291 358 379
G4 9434 9510 9570 9690
G5 13182 13525 13548 13902
;
proc sql noprint;
select count(*) into :dim
from dictionary.columns
where libname="WORK" and memname="HAVE"
and name like "Day%"
;
data want;
set have;
array Day[*] Day:;
array DiffDay[&dim];
do i=2 to dim(DiffDay);
DiffDay[i]=Day[i]-Day[i-1];
end;
drop i DiffDay1;
run;
Result:
Group | Day1 | Day2 | Day3 | Day4 | DiffDay2 | DiffDay3 | DiffDay4 |
---|---|---|---|---|---|---|---|
1 | 2639 | 2639 | 2645 | 2671 | 0 | 6 | 26 |
2 | 3079 | 3503 | 3554 | 3620 | 424 | 51 | 66 |
3 | 285 | 291 | 358 | 379 | 6 | 67 | 21 |
4 | 9434 | 9510 | 9570 | 9690 | 76 | 60 | 120 |
5 | 13182 | 13525 | 13548 | 13902 | 343 | 23 | 354 |
Here's a good start:
data have;
input Group:$2 Day1 Day2 Day3 Day4;
datalines;
G1 2639 2639 2645 2671
G2 3079 3503 3554 3620
G3 285 291 358 379
G4 9434 9510 9570 9690
G5 13182 13525 13548 13902
;
proc sql noprint;
select count(*) into :dim
from dictionary.columns
where libname="WORK" and memname="HAVE"
and name like "Day%"
;
data want;
set have;
array Day[*] Day:;
array DiffDay[&dim];
do i=2 to dim(DiffDay);
DiffDay[i]=Day[i]-Day[i-1];
end;
drop i DiffDay1;
run;
Result:
Group | Day1 | Day2 | Day3 | Day4 | DiffDay2 | DiffDay3 | DiffDay4 |
---|---|---|---|---|---|---|---|
1 | 2639 | 2639 | 2645 | 2671 | 0 | 6 | 26 |
2 | 3079 | 3503 | 3554 | 3620 | 424 | 51 | 66 |
3 | 285 | 291 | 358 | 379 | 6 | 67 | 21 |
4 | 9434 | 9510 | 9570 | 9690 | 76 | 60 | 120 |
5 | 13182 | 13525 | 13548 | 13902 | 343 | 23 | 354 |
I'm on a PROC SCORE kick. You need a custom SCORE data set like this.
Which you can generate with iterative DO.
data have;
infile cards expandtabs;
input Group:$2. Day1-Day4;
datalines;
G1 2639 2639 2645 2671
G2 3079 3503 3554 3620
G3 285 291 358 379
G4 9434 9510 9570 9690
G5 13182 13525 13548 13902
;;;;
run;
data score;
if 0 then set have(keep=Day:);
array v[*] _numeric_;
retain _TYPE_ 'SCORE';
length _name_ $32;
do i = 1 to dim(v);
call missing(of v[*]);
v[i]=-1;
do j = i+1 to dim(v);
_name_ = catx('_',vname(v[i]),vname(v[j]));
v[j] = 1;
output;
v[j] = .;
end;
end;
drop i j;
stop;
run;
proc print;
run;
proc score data=have score=score out=diffs;
id group day:;
var day:;
run;
proc print;
run;
And SCORE using PROC SCORE.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.