How to use Arrays to calculate Maximum value

Occasional Contributor
Posts: 14

I understand that arrays are used to group variables so you can process them all in one DATA step, and that this requires ARRAY, DO, and END statements. I was instructed to transpose this data so that the time and concentration values would become variables, with one observation per subject. I accomplished this by transposing each variable individually and then merging the results. Now I am trying to find the maximum value of T and C for each observation using an ARRAY statement, but I keep getting errors. I am then supposed to find the mean, min, and max, which I know I can do with PROC MEANS, and I have done that. When I run this final step no errors show, but I do not get any table to check the result. Below is the code I came up with, followed by the log with the errors. Any guidance as to where I went wrong or what I am misunderstanding would be greatly appreciated. I am VERY new to SAS, as you can probably tell.

 

 

proc import out=Project2_F17 
        datafile='C:\Users\savanahb\Downloads\Project2_f17.xlsx' DBMS=EXCEL2000 
        REPLACE;
    getnames=yes;
run;

data one;
    set Project2_F17 (keep=subject time concentration);
run;

proc sort data=one;
    by subject;
run;

proc transpose data=one out=two prefix=T;
    by subject;
    var time;
run;

proc transpose data=one out=three prefix=C;
    by subject;
    var concentration;
run;

data transposed;
    merge final final2;
    by subject;
run;

proc print data=transposed;
    title 'Transposed Data';
run;

data transposed;
    array max[13] T1-T13;
    maximum=max(of Time[*]);
    array max[13] C1-C13;
    maximum=max(of Concentration[*]);

    do n=i to 13;

        if max[i]=maximum then
            max=i;
    end;
    drop i;
run;

data Maximum;
    set transposed;
run;

proc print data=Maximum;
    title 'Maximum Values';
run;

proc means data=transposed;
    var T1-T13 C1-C13;
run;

data MMM;
    set transposed;
run;

proc print data=MMM;
    title 'Min Max Mean';

 

 

1523 proc import out=Project2_F17
1524 datafile= 'C:\Users\savanahb\Downloads\Project2_f17.xlsx'
1525 DBMS= EXCEL2000 REPLACE;
1526 getnames=yes;
1527 run;

NOTE: WORK.PROJECT2_F17 data set was successfully created.
NOTE: The data set WORK.PROJECT2_F17 has 91 observations and 3 variables.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 0.31 seconds
cpu time 0.18 seconds


1528
1529 data one;
1530 set Project2_F17 (keep = subject time concentration);
1531 run;

NOTE: There were 91 observations read from the data set WORK.PROJECT2_F17.
NOTE: The data set WORK.ONE has 91 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds


1532 proc sort data=one;
1533 by subject;
1534 run;

NOTE: There were 91 observations read from the data set WORK.ONE.
NOTE: The data set WORK.ONE has 91 observations and 3 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.01 seconds
cpu time 0.03 seconds


1535
1536 proc transpose data=one out=two prefix = T;
1537 by subject;
1538 var time;
1539 run;

NOTE: There were 91 observations read from the data set WORK.ONE.
NOTE: The data set WORK.TWO has 7 observations and 16 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds


1540
1541 proc transpose data=one out=three prefix=C;
1542 by subject;
1543 var concentration;
1544 run;

NOTE: There were 91 observations read from the data set WORK.ONE.
NOTE: The data set WORK.THREE has 7 observations and 16 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds


1545
1546 data transposed;
1547 merge final final2;
1548 by subject;
1549 run;

NOTE: There were 7 observations read from the data set WORK.FINAL.
NOTE: There were 7 observations read from the data set WORK.FINAL2.
NOTE: The data set WORK.TRANSPOSED has 7 observations and 27 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds


1550
1551 proc print data=transposed;
1552 title 'Transposed Data';
1553 run;

NOTE: There were 7 observations read from the data set WORK.TRANSPOSED.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds


1554
1555 data transposed;
1556 array max[13] T1-T13;
1557 maximum=max(of Time[*]);
NOTE: The array max has the same name as a SAS-supplied or user-defined function. Parentheses
following this name are treated as array references and not function references.
ERROR: Undeclared array referenced: Time.
ERROR: The ARRAYNAME[*] specification requires an array.
1558 array max[13] C1-C13;
---
124
ERROR 124-185: The variable max has already been defined.

1559 maximum=max(of Concentration[*]);
ERROR: Undeclared array referenced: Concentration.
ERROR: The ARRAYNAME[*] specification requires an array.
1560 do n = i to 13;
1561 if max[i]=maximum then max=i;
ERROR: Illegal reference to the array max.
1562 end;
1563 drop i;
1564 run ;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.TRANSPOSED may be incomplete. When this step was stopped there were 0
observations and 28 variables.
WARNING: Data set WORK.TRANSPOSED was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds


1565
1566 data Maximum;
1567 set transposed;
1568 run;

NOTE: There were 7 observations read from the data set WORK.TRANSPOSED.
NOTE: The data set WORK.MAXIMUM has 7 observations and 27 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds


1569
1570 proc print data=Maximum;
1571 title 'Maximum Values';
1572 run;

NOTE: There were 7 observations read from the data set WORK.MAXIMUM.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds


1573
1574 proc means data=transposed;
1575 var T1-T13 C1-C13;
1576 run;

NOTE: There were 7 observations read from the data set WORK.TRANSPOSED.
NOTE: PROCEDURE MEANS used (Total process time):
real time 0.16 seconds
cpu time 0.07 seconds


1577
1578 data MMM;
1579 set transposed;
1580 run;

NOTE: There were 7 observations read from the data set WORK.TRANSPOSED.
NOTE: The data set WORK.MMM has 7 observations and 27 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds


1581
1582 proc print data=MMM;
1583 title 'Min Max Mean';
1584 run;

NOTE: There were 7 observations read from the data set WORK.MMM.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds


Accepted Solutions
Solution
‎11-08-2017 10:59 PM
Super User
Posts: 23,235

Re: How to use Arrays to calculate Maximum value

I edited your post to make your code legible. I strongly recommend that you format your code and add comments. The second "data transposed" step is where your errors come from, and there are several problems:

1. No SET statement, so you have no input data.

2. DO NOT give your arrays the same name as a SAS function; that's a recipe for confusion.

3. Once you change your array names and add the SET statement, you'll want to change your MAX function calls to point to the new array names.

4. You have no RUN at the end of your last PROC PRINT, so that step will never finish.

5. I think you're mixing up data set references. This is where comments can help you keep straight what you're trying to do. In fact, I find that writing my comments first (what I plan or need to do) is the best way to write code. Writing code is easy; figuring out what you need to do is the hard part.

Here is the problem step again, with my comments added on each line.


Good Luck. 

 

data transposed;
    array max[13] T1-T13;              /* don't use the name max here */
    maximum=max(of Time[*]);           /* you have no array named Time, so you can't refer to it that way */
    array max[13] C1-C13;              /* same name as above? Each array needs its own name */
    maximum=max(of Concentration[*]);  /* no array named Concentration */

    do n=i to 13;                      /* no loop is needed; that's why we use arrays and the functions */

        if max[i]=maximum then
            max=i;
    end;
    drop i;
run;
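
Putting points 1 to 3 together, a corrected version of the step might look like the sketch below. The array names (times, conc), the new variable names (max_time, max_conc), and the output data set name with_max are illustrative assumptions, not names from the original post. Writing to a new data set also avoids overwriting transposed if the step fails.

```sas
/* Sketch only: array, variable, and output data set names are assumptions. */
data with_max;
    set transposed;                 /* point 1: read the merged, wide data set */

    array times[13] T1-T13;         /* point 2: array names that are not SAS function names */
    array conc[13]  C1-C13;

    max_time = max(of times[*]);    /* point 3: MAX now refers to the declared arrays */
    max_conc = max(of conc[*]);
run;

proc print data=with_max;
    title 'Maximum Values';
run;                                /* point 4: every step ends with RUN */
```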

All Replies

Occasional Contributor
Posts: 14

Re: How to use Arrays to calculate Maximum value

Thank you @Reeza! I like that idea of writing out what I have to do then doing the code. I think that will help a lot in the future. 

New Contributor
Posts: 4

Re: How to use Arrays to calculate Maximum value

Did you ever figure this out?

 

I am stuck on the same step. 

Super User
Posts: 9,397

Re: How to use Arrays to calculate Maximum value

Your post is really hard to follow. From what I can tell, you want to see the min, max, and mean of time and concentration at the end, correct? If so, the transpose, the arrays, and the rest are not needed:

proc sql;
  create table WANT as
  select  SUBJECT,
          min(TIME) as MIN_TIME,
          max(TIME) as MAX_TIME,
          mean(TIME) as MEAN_TIME,
          min(CONCENTRATION) as MIN_CONCENTRATION,
          max(CONCENTRATION) as MAX_CONCENTRATION,
          mean(CONCENTRATION) as MEAN_CONCENTRATION
   from   HAVE
   group by SUBJECT;
quit;

Obviously I am guessing a bit here, as you have not provided any test data (in the form of a DATA step) or shown what the output should look like.
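
For comparison, the same per-subject summary could probably be produced with PROC MEANS on the long (untransposed) data. This is only a sketch: HAVE is the same placeholder data set name used in the query above, and it is assumed to contain the variables SUBJECT, TIME, and CONCENTRATION.

```sas
/* Sketch only: HAVE is a placeholder for the long data set. */
proc means data=have min max mean;
    class subject;                  /* one set of statistics per subject */
    var time concentration;
run;
```

CLASS does not require the data to be sorted, whereas a BY statement would.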

Occasional Contributor
Posts: 9

Re: How to use Arrays to calculate Maximum value

[ Edited ]

I'm actually in the same class as savanah and am having similar problems, although I still haven't been able to figure it out.

I have completed all of the tasks correctly except computing Tmax. My professor did not go beyond saying it was wrong.

 

For this part of the project I have:

data temp3;
      set project2_transposed;
      array conc[13] C1-C13;
      Cmax= max(of conc{*});
      array times[13] T1-T13;
      Tmax= max(of times(of Cmax);
      by subject;
run;

proc print data= temp3;
title "Part3";
run;

 

Again, Cmax is correct, and I do not get any errors in my log. However, the output numbers from Tmax are incorrect. If anyone could offer any advice I would be grateful.

 

Thank you,

Mackenzie

Super User
Posts: 23,235

Re: How to use Arrays to calculate Maximum value

[ Edited ]
Posted in reply to mackenzies3


@mackenzies3 Post your log. I would expect errors from your code. 

Occasional Contributor
Posts: 9

Re: How to use Arrays to calculate Maximum value

1619 %macro import (file, dsn);
1620 proc import out = &dsn
1621 datafile = &file
1622 dbms = xlsx replace;
1623 getnames = yes;
1624 run;
1625 %mend import;
1626 %let import = "C:\project2_f17\project2.xlsx";
1627 %import (file=&import, dsn=project2);

NOTE: The import data set has 91 observations and 3 variables.
NOTE: WORK.PROJECT2 data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds


1628
1629 proc print data= project2;
NOTE: Writing HTML Body file: sashtml25.htm
1630 title "project 2 data";
1631 run;

NOTE: There were 91 observations read from the data set WORK.PROJECT2.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.70 seconds
cpu time 0.42 seconds


1632
1633 proc transpose data= project2 out= temp1 prefix= T;
1634 var time;
1635 by subject;
1636 run;

NOTE: There were 91 observations read from the data set WORK.PROJECT2.
NOTE: The data set WORK.TEMP1 has 7 observations and 16 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
real time 0.04 seconds
cpu time 0.01 seconds


1637
1638 proc transpose data= project2 out= temp2 prefix= C;
1639 var concentration;
1640 by subject;
1641 run;

NOTE: There were 91 observations read from the data set WORK.PROJECT2.
NOTE: The data set WORK.TEMP2 has 7 observations and 16 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
real time 0.04 seconds
cpu time 0.03 seconds


1642
1643 data project2_transposed (drop=_name_ _label_);
1644 merge temp1 temp2;
1645 by subject;
1646 run;

WARNING: Multiple lengths were specified for the variable _NAME_ by input data set(s). This can
cause truncation of data.
NOTE: There were 7 observations read from the data set WORK.TEMP1.
NOTE: There were 7 observations read from the data set WORK.TEMP2.
NOTE: The data set WORK.PROJECT2_TRANSPOSED has 7 observations and 27 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds


1647
1648 proc print data= project2_transposed;
1649 title "Project2 Part 2";
1650 run;

NOTE: There were 7 observations read from the data set WORK.PROJECT2_TRANSPOSED.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds


1651
1652 data temp3;
1653 set project2_transposed;
1654 array conc[13] C1-C13;
1655 Cmax= max(of conc{*});
1656 array times[13] T1-T13;
1657 Tmax= max(of times(of Cmax));
1658 by subject;
1659 run;

NOTE: There were 7 observations read from the data set WORK.PROJECT2_TRANSPOSED.
NOTE: The data set WORK.TEMP3 has 7 observations and 29 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds


1660
1661 proc print data= temp3;
1662 title "Part3";
1663 run;

NOTE: There were 7 observations read from the data set WORK.TEMP3.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds


1664
1665 data temp4;
1666 set project2;
1667 run;

NOTE: There were 91 observations read from the data set WORK.PROJECT2.
NOTE: The data set WORK.TEMP4 has 91 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds


1668
1669 proc means data = temp4 MEAN MIN;
1670 var concentration;
1671 by subject;
1672 title "MEAN and MIN of concentration";
1673 run;

NOTE: There were 91 observations read from the data set WORK.TEMP4.
NOTE: PROCEDURE MEANS used (Total process time):
real time 0.06 seconds
cpu time 0.03 seconds

Super User
Posts: 23,235

Re: How to use Arrays to calculate Maximum value

Posted in reply to mackenzies3

@mackenzies3 Your code here doesn’t match what you posted originally.

Look at your two max formulas. They’re different. Make them the same except for the name of the array. Delete the step and start from scratch if that makes it easier.

Occasional Contributor
Posts: 9

Re: How to use Arrays to calculate Maximum value

[ Edited ]

 

Okay, I tried making the formulas the same except for the arrays and put:

data temp3;
    set project2_transposed;
    array conc[13] C1-C13;
    Cmax= max(of conc{*});
    array times[13] T1-T13;
    Tmax= max(of times{*});
    by subject;
run;

proc print data= temp3;
title "Part3";
run;

 

The log I then get is:

1674  data temp3;
1675      set project2_transposed;
1676      array conc[13] C1-C13;
1677      Cmax= max(of conc{*});
1678      array times[13] T1-T13;
1679      Tmax= max(of times{*});
1680      by subject;
1681  run;

NOTE: There were 7 observations read from the data set WORK.PROJECT2_TRANSPOSED.
NOTE: The data set WORK.TEMP3 has 7 observations and 29 variables.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.03 seconds

1682
1683  proc print data= temp3;
1684  title "Part3";
1685  run;

NOTE: There were 7 observations read from the data set WORK.TEMP3.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.04 seconds
      cpu time            0.03 seconds

 

So here I'm getting the maximum time, but not the time of the maximum concentration for each subject, and that is where I am getting extremely confused.

Occasional Contributor
Posts: 9

Re: How to use Arrays to calculate Maximum value

Posted in reply to mackenzies3

@Reeza

I guess I am just having trouble understanding how to go from the max of the concentration (Cmax) and the max of time to the time of the highest concentration (Tmax).

We have 13 times and 13 concentrations for each of 7 subjects. The highest time is 32 for all of the subjects, but the highest concentration differs from subject to subject.

For instance, subject 1 has its highest concentration of 4.7 at time 2, but subject 2 has its highest concentration of 4.7 at time 3.

So I'm very confused about how to make Tmax = 2 for subject 1 but Tmax = 3 for subject 2, and so on.

Someone mentioned WHICHN(), but we have never talked about this in our class, so I'm not even sure where to start on that.
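
For readers wondering how WHICHN() would fit in here: it returns the position of the first value in a list that equals its first argument, so it can supply the index of Cmax within the concentration array. The sketch below is one possible use, not the approach taught in the class. It assumes the wide data set project2_transposed with variables C1-C13 and T1-T13 from the log above; the variable name idx is made up for illustration.

```sas
/* Sketch only: idx is a hypothetical helper variable. */
data temp3;
    set project2_transposed;
    array conc[13]  C1-C13;
    array times[13] T1-T13;

    Cmax = max(of conc[*]);
    idx  = whichn(Cmax, of conc[*]);   /* position of the first concentration equal to Cmax */
    if idx > 0 then Tmax = times[idx]; /* guard: idx is missing when Cmax is missing */

    drop idx;
run;
```

If two time points tie for the highest concentration, this picks the earlier one.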

Super User
Posts: 23,235

Re: How to use Arrays to calculate Maximum value

Posted in reply to mackenzies3

mackenzies3 wrote:

@Reeza

I guess I just am having trouble understanding how exactly to go from having the max of the concentration (Cmax), and max of time, to finding the time of the highest concentration (Tmax). 

 


Ah. See, that’s a different question. Before, you and the OP only asked for the maximum of the arrays.

This is easier if you loop, then.

Whichever element is the max gets stored, and you have its index, so you can use the index to determine the time point.

 

Oh - and burn this part into your brain please - this is much easier if you don’t store your data in a wide format. Then you could sort and take the max easily. 

 

See page 17 here

https://support.sas.com/resources/papers/97529_Using_Arrays_in_SAS_Programming.pdf

 

 

If you still need help, post back your code and log with what you’ve tried.
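
The point about wide versus long data can be illustrated with a sketch like the following. It assumes the long data set project2 (variables subject, time, concentration) shown in the log earlier in the thread; the data set names sorted and peak are made up for illustration.

```sas
/* Sketch only: SORTED and PEAK are hypothetical data set names. */
proc sort data=project2 out=sorted;
    by subject descending concentration;   /* highest concentration first within each subject */
run;

data peak;
    set sorted;
    by subject;
    if first.subject;                       /* keep only the top row for each subject */
    rename concentration=Cmax time=Tmax;
run;
```

No arrays or loops are needed in this form, which is the point being made about long data.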

New Contributor
Posts: 4

Re: How to use Arrays to calculate Maximum value

Posted in reply to mackenzies3

Have you figured this out yet? 

 

I am stuck on the same step. 

Occasional Contributor
Posts: 9

Re: How to use Arrays to calculate Maximum value

Posted in reply to briannagowin1

Yes, I did. Here is the email I received from the professor. It helped me a lot. Hope it helps you.

"TMAX is the TIME value at which there is a Maximum concentration.
 
We are not looking for the maximum time value.
 
Please note that you need to define 2 different ARRAYs, one for concentration values and another for TIME values. But you have the same array name for both. That is not correct.
 
Once you find the CMAX, you are going to use the two arrays: As an example, you would need a statement such as the following with DO and END statements
 
if c{i} = cmax then tmax = t{i}; 
 
Hope this helps. I cannot give you anymore hint on this."
New Contributor
Posts: 4

Re: How to use Arrays to calculate Maximum value

Posted in reply to mackenzies3
I got this same response but I am still stuck!
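
For anyone else stuck at this point, the professor's hint can be expanded into a full step along the following lines. This is a sketch under assumptions, not the official solution: it uses the data set project2_transposed and the variables C1-C13 and T1-T13 from the logs above, and the array names c and t from the hint.

```sas
/* Sketch only: expands the quoted hint into a complete DATA step. */
data temp3;
    set project2_transposed;
    array c[13] C1-C13;             /* concentrations */
    array t[13] T1-T13;             /* matching time points */

    cmax = max(of c[*]);            /* highest concentration for this subject */

    do i = 1 to 13;
        if c[i] = cmax then tmax = t[i];   /* time at which that concentration occurs */
    end;

    drop i;
run;
```

If the highest concentration occurs at more than one time point, the loop as written keeps the last matching time; stopping at the first match (for example with a LEAVE statement inside a DO block) would keep the earliest instead.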
☑ This topic is solved.
