Quartz | Level 8

## Re: getting max and min value by group using PROC IML

Ksharp = SAS Super Guru. Thank you.

Is there anything you cannot solve in SAS? 🙂

5 REPLIES 5
Quartz | Level 8

## getting max and min value by group using PROC IML

Hello,

I am trying to obtain the min, max values within a group (in this instance, "SITE") for a dataset.

I am new to proc iml, so please forgive me for my mistakes.

I have provided the datalines below for your perusal, along with my code (which does not work) and end solution:

``````data pottery;
input Site X Fe Mg Ca Na;
datalines;
1 14.4 7.00 4.30 0.15 0.51
1 13.8 7.08 3.43 0.12 0.17
1 14.6 7.09 3.88 0.13 0.20
1 11.5 6.37 5.64 0.16 0.14
1 13.8 7.06 5.34 0.20 0.20
1 10.9 6.26 3.47 0.17 0.22
1 10.1 4.26 4.26 0.20 0.18
1 11.6 5.78 5.91 0.18 0.16
1 11.1 5.49 4.52 0.29 0.30
1 13.4 6.92 7.23 0.28 0.20
1 12.4 6.13 5.69 0.22 0.54
1 13.1 6.64 5.51 0.31 0.24
1 12.7 6.69 4.45 0.20 0.22
1 12.5 6.44 3.94 0.22 0.23
2 11.8 5.44 3.94 0.30 0.04
2 11.6 5.39 3.77 0.29 0.06
3 18.3 1.28 0.67 0.03 0.03
3 15.8 2.39 0.63 0.01 0.04
3 18.0 1.50 0.67 0.01 0.06
3 18.0 1.88 0.68 0.01 0.04
3 20.8 1.51 0.72 0.07 0.10
4 17.7 1.12 0.56 0.06 0.06
4 18.3 1.14 0.67 0.06 0.05
4 16.7 0.92 0.53 0.01 0.05
4 14.8 2.74 0.67 0.03 0.05
4 19.1 1.64 0.60 0.10 0.03
;

proc iml;
use pottery;
read all var{site x Fe} into a;
call sortndx(ndx,a,1);
unique_rows=uniqueby(a,1,1:nrow(a));
unique_vals=a[ndx[unique_rows],1:3];
minC=a[><,];
maxC=a[<>,];
print unique_rows, unique_vals, minC, maxC;``````

The end solution needs to look somewhat like this:

 site Min_X Min_Fe Min_Mg Min_Ca Min_Na 1 10,1 4,26 3,43 0,12 0,14 2 11,6 5,39 3,77 0,29 0,04 3 15,8 1,28 0,63 0,01 0,03 4 14,8 0,92 0,53 0,01 0,03

 site Max_X Max_Fe Max_Mg Max_Ca Max_Na 1 14,6 7,09 7,23 0,31 0,54 2 11,8 5,44 3,94 0,3 0,06 3 20,8 2,39 0,72 0,07 0,1 4 19,1 2,74 0,67 0,1 0,06

I want to avoid using the datastep and Proc Sql option.

many thanks.

Diamond | Level 26

## Re: getting max and min value by group using PROC IML

The question really, is why do you want to avoid using datastep, proc, and SQL?  Proc means/summary/freq are coded specifically for these types of process, they are optimised for it, and so not using them doesn't make much sense?

Super User

## Re: getting max and min value by group using PROC IML

It is indeed more suitable for IML.

``````data pottery;
input Site X Fe Mg Ca Na;
datalines;
1 14.4 7.00 4.30 0.15 0.51
1 13.8 7.08 3.43 0.12 0.17
1 14.6 7.09 3.88 0.13 0.20
1 11.5 6.37 5.64 0.16 0.14
1 13.8 7.06 5.34 0.20 0.20
1 10.9 6.26 3.47 0.17 0.22
1 10.1 4.26 4.26 0.20 0.18
1 11.6 5.78 5.91 0.18 0.16
1 11.1 5.49 4.52 0.29 0.30
1 13.4 6.92 7.23 0.28 0.20
1 12.4 6.13 5.69 0.22 0.54
1 13.1 6.64 5.51 0.31 0.24
1 12.7 6.69 4.45 0.20 0.22
1 12.5 6.44 3.94 0.22 0.23
2 11.8 5.44 3.94 0.30 0.04
2 11.6 5.39 3.77 0.29 0.06
3 18.3 1.28 0.67 0.03 0.03
3 15.8 2.39 0.63 0.01 0.04
3 18.0 1.50 0.67 0.01 0.06
3 18.0 1.88 0.68 0.01 0.04
3 20.8 1.51 0.72 0.07 0.10
4 17.7 1.12 0.56 0.06 0.06
4 18.3 1.14 0.67 0.06 0.05
4 16.7 0.92 0.53 0.01 0.05
4 14.8 2.74 0.67 0.03 0.05
4 19.1 1.64 0.60 0.10 0.03
;
run;
proc iml;
use pottery nobs nobs;
read all var {X Fe Mg Ca Na} into x[c=vnames];
close;

start_end=t(loc(t(Site)^={.}||remove(Site,nobs)))||
t(loc(t(Site)^=remove(Site,1)||{.}));

nrow=nrow(start_end);
ncol=ncol(x);
sites=site[start_end[,1]];
minC=j(nrow,ncol,.);
maxC=j(nrow,ncol,.);

do i=1 to nrow;
minC[i,]=x[start_end[i,1]:start_end[i,2],][><,];
maxC[i,]=x[start_end[i,1]:start_end[i,2],][<>,];
end;
print sites minC[c=vnames],sites maxC[c=vnames];
quit;
``````
Diamond | Level 26

## Re: getting max and min value by group using PROC IML

Hi,

Just wondering why IML is the better option than:

```proc summary data=work.pottery;
class site;
var x fe mg ca na;
output out=work.results min=min_x min_fe min_mg min_ca min_na max=max_x max_fe max_mg max_ca max_na;
run;```

Seems to be simpler coding to do the above, and only requires Base SAS?

Quartz | Level 8

## Re: getting max and min value by group using PROC IML

Hello RW9,

i was insisting on proc iml, because i wanted to dig deeper into statistical programming. It looks fun.

On another note, i wanted to see the difference myself, and found the following. Proc Iml is much efficient in resource usage and slightly quicker.

Spoiler

PROCEDURE IML used (Total process time):

real time 0.07 seconds

user cpu time 0.04 seconds

system cpu time 0.03 seconds

Memory 822k

OS Memory 11816k

Timestamp 02.06.2016 16:37:09

NOTE: There were 26 observations read from the data set WORK.POTTERY.

NOTE: The data set WORK.RESULTS has 5 observations and 13 variables.

NOTE: PROCEDURE SUMMARY used (Total process time):

real time 0.04 seconds

user cpu time 0.00 seconds

system cpu time 0.04 seconds

Memory 7980k

OS Memory 20044k

Timestamp 02.06.2016 16:37:09

THank you for your help RW9. Much appreciated. You guys are just SAS Super-Gurus 🙂

From The DO Loop