Statistical programming, matrix languages, and more

Re: getting max and min value by group using PROC IML

Reply
Contributor
Posts: 44

Re: getting max and min value by group using PROC IML

Ksharp = SAS Super Guru. Thank you.

Is there anything you cannot solve in SAS? Smiley Happy

Contributor
Posts: 44

getting max and min value by group using PROC IML

Hello,

I am trying to obtain the min, max values within a group (in this instance, "SITE") for a dataset.

I am new to proc iml, so please forgive me for my mistakes.

I have provided the datalines below for your perusal, along with my code (which does not work) and end solution:

 

data pottery;
        input Site X Fe Mg Ca Na;
        datalines;
      1 14.4 7.00 4.30 0.15 0.51
      1 13.8 7.08 3.43 0.12 0.17
      1 14.6 7.09 3.88 0.13 0.20
      1 11.5 6.37 5.64 0.16 0.14
      1 13.8 7.06 5.34 0.20 0.20
      1 10.9 6.26 3.47 0.17 0.22
      1 10.1 4.26 4.26 0.20 0.18
      1 11.6 5.78 5.91 0.18 0.16
      1 11.1 5.49 4.52 0.29 0.30
      1 13.4 6.92 7.23 0.28 0.20
      1 12.4 6.13 5.69 0.22 0.54
      1 13.1 6.64 5.51 0.31 0.24
      1 12.7 6.69 4.45 0.20 0.22
      1 12.5 6.44 3.94 0.22 0.23
      2 11.8 5.44 3.94 0.30 0.04
      2 11.6 5.39 3.77 0.29 0.06
      3 18.3 1.28 0.67 0.03 0.03
      3 15.8 2.39 0.63 0.01 0.04
      3 18.0 1.50 0.67 0.01 0.06
      3 18.0 1.88 0.68 0.01 0.04
      3 20.8 1.51 0.72 0.07 0.10
      4 17.7 1.12 0.56 0.06 0.06
      4 18.3 1.14 0.67 0.06 0.05
      4 16.7 0.92 0.53 0.01 0.05
      4 14.8 2.74 0.67 0.03 0.05
      4 19.1 1.64 0.60 0.10 0.03
      ;


	  proc iml;
        use pottery;
        read all var{site x Fe} into a;
		call sortndx(ndx,a,1);
       unique_rows=uniqueby(a,1,1:nrow(a)); 
	   unique_vals=a[ndx[unique_rows],1:3];
	   minC=a[><,];
	   maxC=a[<>,];
print unique_rows, unique_vals, minC, maxC;

 

 

The end solution needs to look somewhat like this:

site Min_X Min_Fe Min_Mg Min_Ca Min_Na
1 10,1 4,26 3,43 0,12 0,14
2 11,6 5,39 3,77 0,29 0,04
3 15,8 1,28 0,63 0,01 0,03
4 14,8 0,92 0,53 0,01 0,03

 

site Max_X Max_Fe Max_Mg Max_Ca Max_Na
1 14,6 7,09 7,23 0,31 0,54
2 11,8 5,44 3,94 0,3 0,06
3 20,8 2,39 0,72 0,07 0,1
4 19,1 2,74 0,67 0,1 0,06

 

I want to avoid using the datastep and Proc Sql option.

Please do let me know.

many thanks.

Esteemed Advisor
Esteemed Advisor
Posts: 6,693

Re: getting max and min value by group using PROC IML

The question really, is why do you want to avoid using datastep, proc, and SQL?  Proc means/summary/freq are coded specifically for these types of process, they are optimised for it, and so not using them doesn't make much sense?

Grand Advisor
Posts: 9,320

Re: getting max and min value by group using PROC IML

It is indeed more suitable for IML.

 

data pottery;
        input Site X Fe Mg Ca Na;
        datalines;
      1 14.4 7.00 4.30 0.15 0.51
      1 13.8 7.08 3.43 0.12 0.17
      1 14.6 7.09 3.88 0.13 0.20
      1 11.5 6.37 5.64 0.16 0.14
      1 13.8 7.06 5.34 0.20 0.20
      1 10.9 6.26 3.47 0.17 0.22
      1 10.1 4.26 4.26 0.20 0.18
      1 11.6 5.78 5.91 0.18 0.16
      1 11.1 5.49 4.52 0.29 0.30
      1 13.4 6.92 7.23 0.28 0.20
      1 12.4 6.13 5.69 0.22 0.54
      1 13.1 6.64 5.51 0.31 0.24
      1 12.7 6.69 4.45 0.20 0.22
      1 12.5 6.44 3.94 0.22 0.23
      2 11.8 5.44 3.94 0.30 0.04
      2 11.6 5.39 3.77 0.29 0.06
      3 18.3 1.28 0.67 0.03 0.03
      3 15.8 2.39 0.63 0.01 0.04
      3 18.0 1.50 0.67 0.01 0.06
      3 18.0 1.88 0.68 0.01 0.04
      3 20.8 1.51 0.72 0.07 0.10
      4 17.7 1.12 0.56 0.06 0.06
      4 18.3 1.14 0.67 0.06 0.05
      4 16.7 0.92 0.53 0.01 0.05
      4 14.8 2.74 0.67 0.03 0.05
      4 19.1 1.64 0.60 0.10 0.03
      ;
run;
proc iml;
use pottery nobs nobs;
read all var {Site};
read all var {X Fe Mg Ca Na} into x[c=vnames];
close;

start_end=t(loc(t(Site)^={.}||remove(Site,nobs)))||
          t(loc(t(Site)^=remove(Site,1)||{.}));
          
nrow=nrow(start_end);  
ncol=ncol(x);  
sites=site[start_end[,1]];  
minC=j(nrow,ncol,.);
maxC=j(nrow,ncol,.);

do i=1 to nrow;
 minC[i,]=x[start_end[i,1]:start_end[i,2],][><,];
 maxC[i,]=x[start_end[i,1]:start_end[i,2],][<>,];
end;
print sites minC[c=vnames],sites maxC[c=vnames];
quit;
          
Esteemed Advisor
Esteemed Advisor
Posts: 6,693

Re: getting max and min value by group using PROC IML

Hi,

 

Just wondering why IML is the better option than:

proc summary data=work.pottery;
  class site;
  var x fe mg ca na;
  output out=work.results min=min_x min_fe min_mg min_ca min_na max=max_x max_fe max_mg max_ca max_na;
run;

Seems to be simpler coding to do the above, and only requires Base SAS?

Contributor
Posts: 44

Re: getting max and min value by group using PROC IML

Hello RW9,

i was insisting on proc iml, because i wanted to dig deeper into statistical programming. It looks fun.

On another note, i wanted to see the difference myself, and found the following. Proc Iml is much efficient in resource usage and slightly quicker.

Spoiler

PROCEDURE IML used (Total process time):

real time 0.07 seconds

user cpu time 0.04 seconds

system cpu time 0.03 seconds

Memory 822k

OS Memory 11816k

Timestamp 02.06.2016 16:37:09




NOTE: There were 26 observations read from the data set WORK.POTTERY.

NOTE: The data set WORK.RESULTS has 5 observations and 13 variables.

NOTE: PROCEDURE SUMMARY used (Total process time):

real time 0.04 seconds

user cpu time 0.00 seconds

system cpu time 0.04 seconds

Memory 7980k

OS Memory 20044k

Timestamp 02.06.2016 16:37:09

THank you for your help RW9. Much appreciated. You guys are just SAS Super-Gurus Smiley Happy 

Post a Question
Discussion Stats
  • 5 replies
  • 393 views
  • 2 likes
  • 3 in conversation