I have a weekly report I am running where I need to take an unknown number of columns that I need to convert to a running total.
Please try to keep in mind I created this very simple example, the data including the headers are much more complicated, Also I am trying to automate this so I never would have to actually look at the the file to run my program. Variable names will never be the same from day to day
Below is the code I have started with
please let me know what I am doing wrong, Macros are not my specialty so I am very open if I am going the wrong way about this.
DATA HAVE;
infile datalines delimiter=',';
INPUT
SALES_ID $ MONDAY TUESDAY WEDNESDAY THURSDAY FRIDAY SATURDAY;
DATALINES;
A1,45,61,60,14,74,35
A2,65,72,68,79,70,30
A3,24,17,16,69,29,18
A4,81,84,46,82,49,34
A5,81,38,67,40,88,11
A6,52,89,47,48,68,70
A7,62,10,41,73,49,42
;
RUN;
PROC CONTENTS DATA = have OUT = COL_STEP1;
RUN;
/*The varnum greater than one will always be know, however the total number of columns will can not be hard coded*/
PROC SQL;
CREATE TABLE COLS AS
SELECT DISTINCT
NAME AS COL,
VARNUM AS COL_NUM
FROM COL_STEP1
WHERE VARNUM > 1
ORDER BY VARNUM;
QUIT;
PROC SQL NOPRINT;
SELECT DISTINCT MAX(COL_NUM)-2 INTO :N
FROM COLS;
QUIT;
PROC SQL NOPRINT;
SELECT DISTINCT col INTO :col1
FROM COLS
having col_num = min(col_num);
QUIT;
DATA IFILE (KEEP = COL);
SET COLS;
WHERE COL_NUM > 1;
IF &N = 1 THEN STOP;
RUN;
%PUT NOTE: &N columns to aggregate &col1;
PROC SQL;
CREATE TABLE ID_TABLE AS
SELECT
SALES_ID,
&COL1,
&COL1 AS TOTAL
FROM HAVE;
QUIT;
%MACRO TEST;
%DO X = 1 %TO &N;
DATA _NULL_;
SET IFILE (FIRSTOBS=&X OBS=&X);
CALL SYMPUT ('COL', STRIP(COL));
RUN;
PROC SQL;
CREATE TABLE ID_TABLE AS
SELECT A.SALES_ID,
B.&COL + A.TOTAL AS &COL,
B.&COL + A.TOTAL AS TOTAL
FROM ID_TABLE AS A
LEFT JOIN HAVE AS B
ON A.SALES_ID = B.SALES_ID;
QUIT;
%END;
%MEND TEST;
%TEST;
a simple array operation should do i think
DATA HAVE;
infile datalines delimiter=',';
INPUT
SALES_ID $ MONDAY TUESDAY WEDNESDAY THURSDAY FRIDAY SATURDAY;
DATALINES;
A1,45,61,60,14,74,35
A2,65,72,68,79,70,30
A3,24,17,16,69,29,18
A4,81,84,46,82,49,34
A5,81,38,67,40,88,11
A6,52,89,47,48,68,70
A7,62,10,41,73,49,42
;
RUN;
data want;
set have;
array t MONDAY--SATURDAY;
do _i_=2 to dim(t);
t(_i_)=t(_i_)+t(_i_-1);
end;
run;
Btw, I haven't looked into your code. I just saw the HAVE and WANT
a simple array operation should do i think
DATA HAVE;
infile datalines delimiter=',';
INPUT
SALES_ID $ MONDAY TUESDAY WEDNESDAY THURSDAY FRIDAY SATURDAY;
DATALINES;
A1,45,61,60,14,74,35
A2,65,72,68,79,70,30
A3,24,17,16,69,29,18
A4,81,84,46,82,49,34
A5,81,38,67,40,88,11
A6,52,89,47,48,68,70
A7,62,10,41,73,49,42
;
RUN;
data want;
set have;
array t MONDAY--SATURDAY;
do _i_=2 to dim(t);
t(_i_)=t(_i_)+t(_i_-1);
end;
run;
Btw, I haven't looked into your code. I just saw the HAVE and WANT
Thank you so much for all the help!!
You could reasonably use the macro part of your code to determine the variable names, which you say will change from day to day.
As far as the running totals across the rows, DO NOT USE MACROS for this. That would be completely unnecessary and unpleasant and difficult to program. Arrays work fine here to compute running totals, and should run faster than any macro.
So after you have determined the variable names from PROC CONTENTS and placed them into a macro variable which I will call &varnames, something like this ought to work
data want;
set have;
array x &varnames;
do i=2 to dim(x);
x(i)=x(i)+x(i-1);
end;
drop i;
run;
Once again, a typical example why putting data (weekdays) into structure (variable names) is a VERY BAD IDEA.
De-transpose your data, and it's all a matter of simple processing.
@dane77221 wrote:
I have a weekly report I am running where I need to take an unknown number of columns that I need to convert to a running total.
That is a spread sheet way of thinking and causes a vast number of the issues involved with bad macro or other code.
If you had the data such that it looked like:
id dayofweek (numeric 1 to 7) or DATE and value
The code might look like:
DATA HAVE; infile datalines delimiter=','; INPUT SALES_ID $ @; do day= 1 to 6; input value @; output; end; input; DATALINES; A1,45,61,60,14,74,35 A2,65,72,68,79,70,30 A3,24,17,16,69,29,18 A4,81,84,46,82,49,34 A5,81,38,67,40,88,11 A6,52,89,47,48,68,70 A7,62,10,41,73,49,42 ; RUN; Proc sort data=have; by sales_id day; data want; set have; by sales_id; if first.sales_id then runningtot=value; else runningtot+value; run; proc report data=want; columns sales_id day,runningtot; define sales_id/group; define day/ across; run;
If you wanted a report to show those values across in column form then that is relatively simple with either proc report or tabulate.
How do you expect use to "know" what the "unknown number of columns" might be?
If this data is coming from an external file then take control and make sense of the data before going any further. Reading data properly will say many headaches later.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.