Statistical programming, matrix languages, and more

Re: getting max and min value by group using PROC IML

Reply
Contributor
Posts: 44

Re: getting max and min value by group using PROC IML

Ksharp = SAS Super Guru. Thank you.

Is there anything you cannot solve in SAS? Smiley Happy

Contributor
Posts: 44

getting max and min value by group using PROC IML

Posted in reply to sebster24

Hello,

I am trying to obtain the min, max values within a group (in this instance, "SITE") for a dataset.

I am new to proc iml, so please forgive me for my mistakes.

I have provided the datalines below for your perusal, along with my code (which does not work) and end solution:

 

data pottery;
        input Site X Fe Mg Ca Na;
        datalines;
      1 14.4 7.00 4.30 0.15 0.51
      1 13.8 7.08 3.43 0.12 0.17
      1 14.6 7.09 3.88 0.13 0.20
      1 11.5 6.37 5.64 0.16 0.14
      1 13.8 7.06 5.34 0.20 0.20
      1 10.9 6.26 3.47 0.17 0.22
      1 10.1 4.26 4.26 0.20 0.18
      1 11.6 5.78 5.91 0.18 0.16
      1 11.1 5.49 4.52 0.29 0.30
      1 13.4 6.92 7.23 0.28 0.20
      1 12.4 6.13 5.69 0.22 0.54
      1 13.1 6.64 5.51 0.31 0.24
      1 12.7 6.69 4.45 0.20 0.22
      1 12.5 6.44 3.94 0.22 0.23
      2 11.8 5.44 3.94 0.30 0.04
      2 11.6 5.39 3.77 0.29 0.06
      3 18.3 1.28 0.67 0.03 0.03
      3 15.8 2.39 0.63 0.01 0.04
      3 18.0 1.50 0.67 0.01 0.06
      3 18.0 1.88 0.68 0.01 0.04
      3 20.8 1.51 0.72 0.07 0.10
      4 17.7 1.12 0.56 0.06 0.06
      4 18.3 1.14 0.67 0.06 0.05
      4 16.7 0.92 0.53 0.01 0.05
      4 14.8 2.74 0.67 0.03 0.05
      4 19.1 1.64 0.60 0.10 0.03
      ;


	  proc iml;
        use pottery;
        read all var{site x Fe} into a;
		call sortndx(ndx,a,1);
       unique_rows=uniqueby(a,1,1:nrow(a)); 
	   unique_vals=a[ndx[unique_rows],1:3];
	   minC=a[><,];
	   maxC=a[<>,];
print unique_rows, unique_vals, minC, maxC;

 

 

The end solution needs to look somewhat like this:

site Min_X Min_Fe Min_Mg Min_Ca Min_Na
1 10,1 4,26 3,43 0,12 0,14
2 11,6 5,39 3,77 0,29 0,04
3 15,8 1,28 0,63 0,01 0,03
4 14,8 0,92 0,53 0,01 0,03

 

site Max_X Max_Fe Max_Mg Max_Ca Max_Na
1 14,6 7,09 7,23 0,31 0,54
2 11,8 5,44 3,94 0,3 0,06
3 20,8 2,39 0,72 0,07 0,1
4 19,1 2,74 0,67 0,1 0,06

 

I want to avoid using the datastep and Proc Sql option.

Please do let me know.

many thanks.

Super User
Super User
Posts: 9,441

Re: getting max and min value by group using PROC IML

Posted in reply to sebster24

The question really, is why do you want to avoid using datastep, proc, and SQL?  Proc means/summary/freq are coded specifically for these types of process, they are optimised for it, and so not using them doesn't make much sense?

Super User
Posts: 10,695

Re: getting max and min value by group using PROC IML

Posted in reply to sebster24

It is indeed more suitable for IML.

 

data pottery;
        input Site X Fe Mg Ca Na;
        datalines;
      1 14.4 7.00 4.30 0.15 0.51
      1 13.8 7.08 3.43 0.12 0.17
      1 14.6 7.09 3.88 0.13 0.20
      1 11.5 6.37 5.64 0.16 0.14
      1 13.8 7.06 5.34 0.20 0.20
      1 10.9 6.26 3.47 0.17 0.22
      1 10.1 4.26 4.26 0.20 0.18
      1 11.6 5.78 5.91 0.18 0.16
      1 11.1 5.49 4.52 0.29 0.30
      1 13.4 6.92 7.23 0.28 0.20
      1 12.4 6.13 5.69 0.22 0.54
      1 13.1 6.64 5.51 0.31 0.24
      1 12.7 6.69 4.45 0.20 0.22
      1 12.5 6.44 3.94 0.22 0.23
      2 11.8 5.44 3.94 0.30 0.04
      2 11.6 5.39 3.77 0.29 0.06
      3 18.3 1.28 0.67 0.03 0.03
      3 15.8 2.39 0.63 0.01 0.04
      3 18.0 1.50 0.67 0.01 0.06
      3 18.0 1.88 0.68 0.01 0.04
      3 20.8 1.51 0.72 0.07 0.10
      4 17.7 1.12 0.56 0.06 0.06
      4 18.3 1.14 0.67 0.06 0.05
      4 16.7 0.92 0.53 0.01 0.05
      4 14.8 2.74 0.67 0.03 0.05
      4 19.1 1.64 0.60 0.10 0.03
      ;
run;
proc iml;
use pottery nobs nobs;
read all var {Site};
read all var {X Fe Mg Ca Na} into x[c=vnames];
close;

start_end=t(loc(t(Site)^={.}||remove(Site,nobs)))||
          t(loc(t(Site)^=remove(Site,1)||{.}));
          
nrow=nrow(start_end);  
ncol=ncol(x);  
sites=site[start_end[,1]];  
minC=j(nrow,ncol,.);
maxC=j(nrow,ncol,.);

do i=1 to nrow;
 minC[i,]=x[start_end[i,1]:start_end[i,2],][><,];
 maxC[i,]=x[start_end[i,1]:start_end[i,2],][<>,];
end;
print sites minC[c=vnames],sites maxC[c=vnames];
quit;
          
Super User
Super User
Posts: 9,441

Re: getting max and min value by group using PROC IML

Hi,

 

Just wondering why IML is the better option than:

proc summary data=work.pottery;
  class site;
  var x fe mg ca na;
  output out=work.results min=min_x min_fe min_mg min_ca min_na max=max_x max_fe max_mg max_ca max_na;
run;

Seems to be simpler coding to do the above, and only requires Base SAS?

Contributor
Posts: 44

Re: getting max and min value by group using PROC IML

Hello RW9,

i was insisting on proc iml, because i wanted to dig deeper into statistical programming. It looks fun.

On another note, i wanted to see the difference myself, and found the following. Proc Iml is much efficient in resource usage and slightly quicker.

Spoiler

PROCEDURE IML used (Total process time):

real time 0.07 seconds

user cpu time 0.04 seconds

system cpu time 0.03 seconds

Memory 822k

OS Memory 11816k

Timestamp 02.06.2016 16:37:09




NOTE: There were 26 observations read from the data set WORK.POTTERY.

NOTE: The data set WORK.RESULTS has 5 observations and 13 variables.

NOTE: PROCEDURE SUMMARY used (Total process time):

real time 0.04 seconds

user cpu time 0.00 seconds

system cpu time 0.04 seconds

Memory 7980k

OS Memory 20044k

Timestamp 02.06.2016 16:37:09

THank you for your help RW9. Much appreciated. You guys are just SAS Super-Gurus Smiley Happy 

Ask a Question
Discussion stats
  • 5 replies
  • 610 views
  • 2 likes
  • 3 in conversation