I want to perform linear regression after every 7 days upto the total days for different serial id's. What I have done is, I used the macro function and iterated upto total days by 7... and preformed the regression... but this process is taking lot of time. Is there any better way to do it?
%MACRO LR_WEEK(TABLE);
PROC DATASETS LIBRARY=WORK NODETAILS NOLIST;
DELETE LR_PARTS;
RUN;
PROC SQL NOPRINT;
SELECT DISTINCT SERIAL_id
INTO :SERIAL_ARRAY SEPARATED BY ' '
FROM &TABLE
QUIT;
PROC SQL NOPRINT;
SELECT COUNT (DISTINCT SERIAL_id) AS TOT_DISTINCT_SERIAL
INTO :TOT_DISTINCT_SERIAL_1
FROM &TABLE
QUIT;
%DO i = 1 %TO &TOT_DISTINCT_SERIAL_1.;
PROC SQL NOPRINT;
SELECT DISTINCT CYCLE_COUNTER
INTO :CYCLE_COUNTER_ARRAY SEPARATED BY ' '
FROM &TABLE
WHERE SERIAL_NR_N = %SCAN(&SERIAL_ARRAY.,&i.);
QUIT;
PROC SQL NOPRINT;
SELECT COUNT (DISTINCT CYCLE_COUNTER) AS TOT_DISTINCT_CYCLE_COUNTER
INTO :TOT_DISTINCT_CYCLE_COUNTER_1
FROM &TABLE
WHERE SERIAL_id = %SCAN(&SERIAL_ARRAY.,&i.);
QUIT;
%DO j = 1 %TO &TOT_DISTINCT_CYCLE_COUNTER_1.;
PROC SQL NOPRINT;
SELECT MAX(TOTAL_DAYS)
INTO :TOTAL_DAYS
FROM &TABLE
WHERE SERIAL_id = %SCAN(&SERIAL_ARRAY.,&i.) AND CYCLE_COUNTER = %SCAN(&CYCLE_COUNTER_ARRAY.,&j.) ;
QUIT;
%IF &TOTAL_DAYS. > 30 %THEN %DO;
%DO k=7 %TO &TOTAL_DAYS. %BY 7;
DATA LR_INP;
SET &TABLE;
BY MATERIAL_NR_N SERIAL_id CYCLE_COUNTER;
WHERE SERIAL_id=%SCAN(&SERIAL_ARRAY.,&i.) AND CYCLE_COUNTER = %SCAN(&CYCLE_COUNTER_ARRAY.,&j.) AND DAYS_ELAPSED<&k AND TIME > 5;
RUN;
PROC REG DATA=LR_INP OUTEST=SIO_LR_VAR NOPRINT;
BY MATERIAL_NR_N SERIAL_id CYCLE_COUNTER;
MODEL SIO_MA_5 = DAYS_ELAPSED;
RUN;
DATA LR_ADD;
SET SIO_LR_VAR;
BY MATERIAL_NR_N SERIAL_id CYCLE_COUNTER;
DAYS=&k;
RUN;
PROC APPEND BASE=WORK.LR_PARTS DATA=LR_ADD;
RUN;
PROC SORT DATA=LR_PARTS;
BY MATERIAL_NR_N SERIAL_id CYCLE_COUNTER;
RUN;
PROC SQL;
CREATE TABLE WORK.LR_JOIN AS
SELECT A.*,B.INTERCEPT,B.DAYS_ELAPSED AS LR_SLOPE
FROM &TABLE A LEFT JOIN LR_PARTS B
ON A.SERIAL_id=B.SERIALid AND A.CYCLE_COUNTER=B.CYCLE_COUNTER AND A.DAYS_ELAPSED=B.DAYS
ORDER BY SERIAL_id,CYCLE_COUNTER,DATE_N;
QUIT;
%END;
%END;
%END;
%END;
%MEND;
SAS does not need shouting, so using lowercase is possible and preferred.
Use the "little running man" icon for posting code, so that the formatting and code is preserved as-is.
But if that is really the way you write code, see Maxim 12.
You have
%DO k=7 %TO &TOTAL_DAYS. %BY 7;
You can probably eliminate the inner loop which iteratively makes a series dataset and runs a regression on each one. Instead you can simultaneously make all the datasets, concatenate them into a single data set view (with new variable DAYS=macrovar &K, and run a regression adding the new by-variable DAYS.
That is you can replace this:
%IF &TOTAL_DAYS. > 30 %THEN %DO;
%DO k=7 %TO &TOTAL_DAYS. %BY 7;
DATA LR_INP;
SET &TABLE;
BY MATERIAL_NR_N SERIAL_id CYCLE_COUNTER;
WHERE SERIAL_id=%SCAN(&SERIAL_ARRAY.,&i.) AND CYCLE_COUNTER = %SCAN(&CYCLE_COUNTER_ARRAY.,&j.) AND DAYS_ELAPSED<&k AND TIME > 5;
RUN;
PROC REG DATA=LR_INP OUTEST=SIO_LR_VAR NOPRINT;
BY MATERIAL_NR_N SERIAL_id CYCLE_COUNTER;
MODEL SIO_MA_5 = DAYS_ELAPSED;
RUN;
DATA LR_ADD;
SET SIO_LR_VAR;
BY MATERIAL_NR_N SERIAL_id CYCLE_COUNTER;
DAYS=&k;
RUN;
PROC APPEND BASE=WORK.LR_PARTS DATA=LR_ADD;
RUN;
PROC SORT DATA=LR_PARTS;
BY MATERIAL_NR_N SERIAL_id CYCLE_COUNTER;
RUN;
PROC SQL;
CREATE TABLE WORK.LR_JOIN AS
SELECT A.*,B.INTERCEPT,B.DAYS_ELAPSED AS LR_SLOPE
FROM &TABLE A LEFT JOIN LR_PARTS B
ON A.SERIAL_id=B.SERIALid AND A.CYCLE_COUNTER=B.CYCLE_COUNTER AND A.DAYS_ELAPSED=B.DAYS
ORDER BY SERIAL_id,CYCLE_COUNTER,DATE_N;
QUIT;
%END;
%END;
with this:
%if &total_days. > 30 %then %do;
data
%do k=7 %to %total_days. %by 7;
lr_inp&k (where=(days_elapsed<&k))
%end; ;
set &table;
by material_nr_n serial_id cycle_counter;
where serial_id=%scan(&serial_array.,&i)
and cycle_counter = %scan(&cycle_counter_array.,&j.)
and time > 5 ;
run;
data vtemp / view=vtemp;
set
%do k=7 %to %total_days. %by 7;
lr_inp&k (in=in&k)
%end; ;
%do k=7 %to %total_days. %by 7;
if in&k then days=&k; else
%end; ;
run;
proc reg data=vtemp outest=lr_add NOPRINT;
by days material_nr_n serial_id cycle_counter;
model sio_ma_5 = days_elapsed;
run;
proc append base=work.lr_parts data=lr_add;
run;
proc sort data=lr_parts;
by material_nr_n serial_id cycle_counter days;
run;
proc sql;
create table work.lr_join as
select a.*,b.intercept,b.days_elapsed as lr_slope
from &table a left join lr_parts b
on a.serial_id=b.serialid and a.cycle_counter=b.cycle_counter and a.days_elapsed=b.days
order by serial_id,cycle_counter,date_n;
quit;
%end;
Note I have a number of loop statements
%do k=7 %to %total_days. %by 7;
But they are not looping over proc steps and data steps. Instead they just do things like
replace DATA LR_INP
with data lr_inp7 (where=(days_elapsed<7)) lr_inp14 (where=days_elapsed<14)) ...
So the loop produces much more efficient code.
You can also replace
PROC SQL NOPRINT;
SELECT DISTINCT SERIAL_id
INTO :SERIAL_ARRAY SEPARATED BY ' '
FROM &TABLE
QUIT;
PROC SQL NOPRINT;
SELECT COUNT (DISTINCT SERIAL_id) AS TOT_DISTINCT_SERIAL
INTO :TOT_DISTINCT_SERIAL_1
FROM &TABLE
QUIT;
with (no need to run the second proc sql, just use the automatic variable &sqlobs to get the count of rows generated.
PROC SQL NOPRINT;
SELECT DISTINCT SERIAL_id
INTO :SERIAL_ARRAY SEPARATED BY ' '
FROM &TABLE
QUIT;
%let tot_distinct_serial=&sqlobs;
INTO
ERIAL_ARRAY SEPARATED BY ' '
Please, click on the running man icon and paste your code into the box that appears. Not only will this format the code properly, but it will avoid these unintended smiley faces.
If it is taking a "lot of time", can you give us some idea of what a "lot of time" means? Is it 10 minutes? 48.3 minutes?
You are performing lots and lots of iterations, can you tell us how many iterations the code goes through?
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.