I have a dataset that looks like that following. I am trying to find the best way to calculate the increase across days (DIFDay2-Day1, DIFDay3-Day2, DIFDay4-Day3) My data going forward will have more days included so I'm hoping I don't have to list out each day. Would an array be best here?
Group | Day1 | Day2 | Day3 | Day4 | DIFDay2 - Day1 | DIFDay3 - Day2 | DIFDay4 - Day3 |
G1 | 2639 | 2639 | 2645 | 2671 | |||
G2 | 3079 | 3503 | 3554 | 3620 | |||
G3 | 285 | 291 | 358 | 379 | |||
G4 | 9434 | 9510 | 9570 | 9690 | |||
G5 | 13182 | 13525 | 13548 | 13902 |
Here's a good start:
data have;
input Group:$2 Day1 Day2 Day3 Day4;
datalines;
G1 2639 2639 2645 2671
G2 3079 3503 3554 3620
G3 285 291 358 379
G4 9434 9510 9570 9690
G5 13182 13525 13548 13902
;
proc sql noprint;
select count(*) into :dim
from dictionary.columns
where libname="WORK" and memname="HAVE"
and name like "Day%"
;
data want;
set have;
array Day[*] Day:;
array DiffDay[&dim];
do i=2 to dim(DiffDay);
DiffDay[i]=Day[i]-Day[i-1];
end;
drop i DiffDay1;
run;
Result:
Group | Day1 | Day2 | Day3 | Day4 | DiffDay2 | DiffDay3 | DiffDay4 |
---|---|---|---|---|---|---|---|
1 | 2639 | 2639 | 2645 | 2671 | 0 | 6 | 26 |
2 | 3079 | 3503 | 3554 | 3620 | 424 | 51 | 66 |
3 | 285 | 291 | 358 | 379 | 6 | 67 | 21 |
4 | 9434 | 9510 | 9570 | 9690 | 76 | 60 | 120 |
5 | 13182 | 13525 | 13548 | 13902 | 343 | 23 | 354 |
Here's a good start:
data have;
input Group:$2 Day1 Day2 Day3 Day4;
datalines;
G1 2639 2639 2645 2671
G2 3079 3503 3554 3620
G3 285 291 358 379
G4 9434 9510 9570 9690
G5 13182 13525 13548 13902
;
proc sql noprint;
select count(*) into :dim
from dictionary.columns
where libname="WORK" and memname="HAVE"
and name like "Day%"
;
data want;
set have;
array Day[*] Day:;
array DiffDay[&dim];
do i=2 to dim(DiffDay);
DiffDay[i]=Day[i]-Day[i-1];
end;
drop i DiffDay1;
run;
Result:
Group | Day1 | Day2 | Day3 | Day4 | DiffDay2 | DiffDay3 | DiffDay4 |
---|---|---|---|---|---|---|---|
1 | 2639 | 2639 | 2645 | 2671 | 0 | 6 | 26 |
2 | 3079 | 3503 | 3554 | 3620 | 424 | 51 | 66 |
3 | 285 | 291 | 358 | 379 | 6 | 67 | 21 |
4 | 9434 | 9510 | 9570 | 9690 | 76 | 60 | 120 |
5 | 13182 | 13525 | 13548 | 13902 | 343 | 23 | 354 |
I'm on a PROC SCORE kick. You need a custom SCORE data set like this.
Which you can generate with iterative DO.
data have;
infile cards expandtabs;
input Group:$2. Day1-Day4;
datalines;
G1 2639 2639 2645 2671
G2 3079 3503 3554 3620
G3 285 291 358 379
G4 9434 9510 9570 9690
G5 13182 13525 13548 13902
;;;;
run;
data score;
if 0 then set have(keep=Day:);
array v[*] _numeric_;
retain _TYPE_ 'SCORE';
length _name_ $32;
do i = 1 to dim(v);
call missing(of v[*]);
v[i]=-1;
do j = i+1 to dim(v);
_name_ = catx('_',vname(v[i]),vname(v[j]));
v[j] = 1;
output;
v[j] = .;
end;
end;
drop i j;
stop;
run;
proc print;
run;
proc score data=have score=score out=diffs;
id group day:;
var day:;
run;
proc print;
run;
And SCORE using PROC SCORE.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.