BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
hellohere
Pyrite | Level 9
data temp(drop=i j);
do i=1 to 3;
	do j=0 to 100;
    	byvar=i;
        ind=j;
        x1=sin(j/10)+i;
        y1=cos(j/10)+i;
       	output;
  	end;
end;
run;quit;

/*proc sgplot data=temp;
	by byvar;
	series x=ind y=x1/lineattrs=(color=red thickness=2 pattern=solid);  
	series x=ind y=y1/y2axis lineattrs=(color=blue thickness=2 pattern=solid);  
run;quit;*/

proc sql;
create table temp_new as
SELECT byvar, ind, x1, y1,
       MAX(x1) OVER(ORDER BY byvar ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
       AS x1_running_max
FROM temp
group by byvar;
quit;

I need get the max from the first row to CURRENT ROW. Tried the code above. But it complains. Anyway to do it in SQL?! Surely in dataset with byvar-retain it is doable. I prefer in SAS/SQL.

1 ACCEPTED SOLUTION

Accepted Solutions
quickbluefish
Barite | Level 11

PROC SQL doesn't allow these types of 'windowing' functions, unfortunately (not sure about FEDSQL).  In a DATA step, you can do something like the following:

proc sort data=have; by id; run;

data want;
set have;
by id;
RETAIN x1_running_max y1_running_max;
if first.id then call missing(x1_running_max, y1_running_max);
x1_running_max=max(x1, x1_running_max);
y1_running_max=max(y1, y1_running_max);
run;

View solution in original post

9 REPLIES 9
quickbluefish
Barite | Level 11

PROC SQL doesn't allow these types of 'windowing' functions, unfortunately (not sure about FEDSQL).  In a DATA step, you can do something like the following:

proc sort data=have; by id; run;

data want;
set have;
by id;
RETAIN x1_running_max y1_running_max;
if first.id then call missing(x1_running_max, y1_running_max);
x1_running_max=max(x1, x1_running_max);
y1_running_max=max(y1, y1_running_max);
run;
quickbluefish
Barite | Level 11
just edited - forgot you were trying to do this within each level of ID.
LinusH
Tourmaline | Level 20

Unfortunately SAS SQL does not support window functions.

And I can't see how to solve this in ANSI 1992 SQL, at least easily.

The data step is probably your best option here.

Data never sleeps
FreelanceReinh
Jade | Level 19

Hi @hellohere,


@hellohere wrote:

I prefer in SAS/SQL.


Then try this:

 

proc sql;
create table want as
select a.*, max(b.x1) as x1_running_max
from temp a, temp b
where a.byvar=b.byvar & b.ind<=a.ind
group by 1,2,3,4;
quit;

 

 

Tom
Super User Tom
Super User

Why would it matter if you used PROC SQL or something else?

 

You should check if you have SAS/ETS licensed.

 

https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/etsug/etsug_expand_overview.htm

 

Kurt_Bremser
Super User
data want;
set have;
by byvar;
retain x1_running_max;
if first.byvar
then x1_running_max = x1;
else x1_running_max = max(x1_running_max,x1);
run;

Straightforward, easy to understand, and less time- and resource-consuming than anything you might do in SQL.

Maxim 14!!!

 

Just for illustration, a comparison of the DATA step method with the one SQL suggestion:

 69         data have (drop=i j);
 70         do i = 1 to 1000;
 71           do j = 0 to 100;
 72             byvar=i;
 73             ind=j;
 74             x1=sin(j/10)+i;
 75             y1=cos(j/10)+i;
 76             output;
 77           end;
 78         end;
 79         run;
 
 NOTE: The data set WORK.HAVE has 101000 observations and 4 variables.
 NOTE:  Verwendet wurde: DATA statement - (Gesamtverarbeitungszeit):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              1931.21k
       OS Memory           22180.00k
       Timestamp           19.09.2025 03:00:09 nachm.
       Step Count                        62  Switch Count  2
       Page Faults                       0
       Page Reclaims                     165
       Page Swaps                        0
       Voluntary Context Switches        9
       Involuntary Context Switches      2
       Block Input Operations            0
       Block Output Operations           6408
       
 
 80         
 81         data want;
 82         set have;
 83         by byvar;
 84         retain x1_running_max;
 85         if first.byvar
 86         then x1_running_max = x1;
 87         else x1_running_max = max(x1_running_max,x1);
 88         run;
 
 NOTE: There were 101000 observations read from the data set WORK.HAVE.
 NOTE: The data set WORK.WANT has 101000 observations and 5 variables.
 NOTE:  Verwendet wurde: DATA statement - (Gesamtverarbeitungszeit):
       real time           0.01 seconds
       user cpu time       0.02 seconds
       system cpu time     0.00 seconds
       memory              3625.37k
       OS Memory           24488.00k
       Timestamp           19.09.2025 03:00:09 nachm.
       Step Count                        63  Switch Count  2
       Page Faults                       0
       Page Reclaims                     519
       Page Swaps                        0
       Voluntary Context Switches        13
       Involuntary Context Switches      1
       Block Input Operations            0
       Block Output Operations           7944
       
 
 89         
 90         proc sql;
 91         create table want as
 92         select a.*, max(b.x1) as x1_running_max
 93         from have a, have b
 94         where a.byvar=b.byvar & b.ind<=a.ind
 95         group by 1,2,3,4;
 NOTE: Table WORK.WANT created, with 101000 rows and 5 columns.
 
 96         quit;
 NOTE:  Verwendet wurde: PROZEDUR SQL - (Gesamtverarbeitungszeit):
       real time           1.50 seconds
       user cpu time       3.00 seconds
       system cpu time     0.27 seconds
       memory              459314.90k
       OS Memory           481956.00k
       Timestamp           19.09.2025 03:00:11 nachm.

0.01 seconds vs. 1.5 seconds. 'Nuff said.

hellohere
Pyrite | Level 9

Thanks all. 

 

The dataset is just for example. I prefer SQL because DO-With-Dataset is kind of tedious in term of coding, may

I say this. 

Tom
Super User Tom
Super User

Everyone has their preferences.  SQL syntax is very good for somethings.  But in general it is extremely verbose and much more cumbersome that DATA step code.

 

Take your example:

create table temp_new as
SELECT byvar, ind, x1, y1,
       MAX(x1) OVER(ORDER BY byvar ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
       AS x1_running_max
FROM temp
group by byvar;

In a data step that is just:

data want2;
  set temp;
  by byvar;
  if first.byvar then x1_running_max=x1;
  else x1_running_max=max(x1_running_max,x1);
  retain x1_running_max;
run;

So instead of having the say "create table...as...from" you just need to say "data...set...".  Instead of "group by" you just need "by".  Plus you don't have to put COMMAS between the names in the list if more than one BY variable is need.  Instead you can just use spaces.

 

And SQL does not support variable lists.

 

Ksharp
Super User

PROC SQL can not handle BIG table for your question.

 

And @FreelanceReinh  already gave you the Cartesian Product Solution. Here I want to present sub-query solution.

 

data temp(drop=i j);
do i=1 to 3;
do j=0 to 100;
byvar=i;
ind=j;
x1=sin(j/10)+i;
y1=cos(j/10)+i;
output;
end;
end;
stop;
run;

proc sql;
create table want as
select a.*,
(select max(x1) from temp where byvar=a.byvar and ind <= a.ind) as x1_running_max,
(select max(y1) from temp where byvar=a.byvar and ind <= a.ind) as y1_running_max
 from temp as a;
quit;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1056 views
  • 3 likes
  • 7 in conversation