I realize arrays are used to group variables when you want to process them all in one data step, and that this needs ARRAY, DO, and END statements. I was instructed to transpose this data so that time and concentration would be variables and they would all show in relation to the subjects (observations). I was able to accomplish this by transposing each of the variables individually and then merging them. Now I am trying to find the maximum value of T and C for each observation using an ARRAY statement, but I just keep getting a bunch of errors. I am then supposed to find the mean, min, and max, which I know I can do with PROC MEANS, and I have done so. When I run this final step, no errors show, but I am not getting any table to check the result. Below I've put the code I have come up with, followed by the log with the errors. Any guidance as to where I went wrong or what I am misunderstanding would be greatly appreciated. I am VERY new to SAS, as you can probably tell.
proc import out=Project2_F17
datafile='C:\Users\savanahb\Downloads\Project2_f17.xlsx' DBMS=EXCEL2000
REPLACE;
getnames=yes;
run;
data one;
set Project2_F17 (keep=subject time concentration);
run;
proc sort data=one;
by subject;
run;
proc transpose data=one out=two prefix=T;
by subject;
var time;
run;
proc transpose data=one out=three prefix=C;
by subject;
var concentration;
run;
data transposed;
merge final final2;
by subject;
run;
proc print data=transposed;
title 'Transposed Data';
run;
data transposed;
array max[13] T1-T13;
maximum=max(of Time[*]);
array max[13] C1-C13;
maximum=max(of Concentration[*]);
do n=i to 13;
if max[i]=maximum then
max=i;
end;
drop i;
run;
data Maximum;
set transposed;
run;
proc print data=Maximum;
title 'Maximum Values';
run;
proc means data=transposed;
var T1-T13 C1-C13;
run;
data MMM;
set transposed;
run;
proc print data=MMM;
title 'Min Max Mean';
1523 proc import out=Project2_F17
1524 datafile= 'C:\Users\savanahb\Downloads\Project2_f17.xlsx'
1525 DBMS= EXCEL2000 REPLACE;
1526 getnames=yes;
1527 run;
NOTE: WORK.PROJECT2_F17 data set was successfully created.
NOTE: The data set WORK.PROJECT2_F17 has 91 observations and 3 variables.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 0.31 seconds
cpu time 0.18 seconds
1528
1529 data one;
1530 set Project2_F17 (keep = subject time concentration);
1531 run;
NOTE: There were 91 observations read from the data set WORK.PROJECT2_F17.
NOTE: The data set WORK.ONE has 91 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
1532 proc sort data=one;
1533 by subject;
1534 run;
NOTE: There were 91 observations read from the data set WORK.ONE.
NOTE: The data set WORK.ONE has 91 observations and 3 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.01 seconds
cpu time 0.03 seconds
1535
1536 proc transpose data=one out=two prefix = T;
1537 by subject;
1538 var time;
1539 run;
NOTE: There were 91 observations read from the data set WORK.ONE.
NOTE: The data set WORK.TWO has 7 observations and 16 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds
1540
1541 proc transpose data=one out=three prefix=C;
1542 by subject;
1543 var concentration;
1544 run;
NOTE: There were 91 observations read from the data set WORK.ONE.
NOTE: The data set WORK.THREE has 7 observations and 16 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
1545
1546 data transposed;
1547 merge final final2;
1548 by subject;
1549 run;
NOTE: There were 7 observations read from the data set WORK.FINAL.
NOTE: There were 7 observations read from the data set WORK.FINAL2.
NOTE: The data set WORK.TRANSPOSED has 7 observations and 27 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
1550
1551 proc print data=transposed;
1552 title 'Transposed Data';
1553 run;
NOTE: There were 7 observations read from the data set WORK.TRANSPOSED.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
1554
1555 data transposed;
1556 array max[13] T1-T13;
1557 maximum=max(of Time[*]);
NOTE: The array max has the same name as a SAS-supplied or user-defined function. Parentheses
following this name are treated as array references and not function references.
ERROR: Undeclared array referenced: Time.
ERROR: The ARRAYNAME[*] specification requires an array.
1558 array max[13] C1-C13;
---
124
ERROR 124-185: The variable max has already been defined.
1559 maximum=max(of Concentration[*]);
ERROR: Undeclared array referenced: Concentration.
ERROR: The ARRAYNAME[*] specification requires an array.
1560 do n = i to 13;
1561 if max[i]=maximum then max=i;
ERROR: Illegal reference to the array max.
1562 end;
1563 drop i;
1564 run ;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.TRANSPOSED may be incomplete. When this step was stopped there were 0
observations and 28 variables.
WARNING: Data set WORK.TRANSPOSED was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
1565
1566 data Maximum;
1567 set transposed;
1568 run;
NOTE: There were 7 observations read from the data set WORK.TRANSPOSED.
NOTE: The data set WORK.MAXIMUM has 7 observations and 27 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
1569
1570 proc print data=Maximum;
1571 title 'Maximum Values';
1572 run;
NOTE: There were 7 observations read from the data set WORK.MAXIMUM.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds
1573
1574 proc means data=transposed;
1575 var T1-T13 C1-C13;
1576 run;
NOTE: There were 7 observations read from the data set WORK.TRANSPOSED.
NOTE: PROCEDURE MEANS used (Total process time):
real time 0.16 seconds
cpu time 0.07 seconds
1577
1578 data MMM;
1579 set transposed;
1580 run;
NOTE: There were 7 observations read from the data set WORK.TRANSPOSED.
NOTE: The data set WORK.MMM has 7 observations and 27 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
1581
1582 proc print data=MMM;
1583 title 'Min Max Mean';
1584 run;
NOTE: There were 7 observations read from the data set WORK.MMM.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds
I edited your post to make your code legible. I highly, highly recommend you format your code and add comments. The DATA step shown below is your issue, and it has several problems:
1. There is no SET statement, so you have no input data.
2. DO NOT give your arrays the same names as SAS functions; that's a recipe for confusion.
3. Once you change your array names and add the SET statement, you'll want to change your MAX functions to point to the new array names.
4. You have no RUN at the end of your last PROC PRINT, so that step will never finish.
5. I think you're mixing up data set references. This is where comments can help you keep straight what you're trying to do. In fact, I find that writing my comments first (what I plan or need to do) is the best way to write code. Writing code is easy; figuring out what you need to do is the hard part.
See my comments in the code below as well.
Good luck.
data transposed;
array max[13] T1-T13; /* <- don't use the name max here */
maximum=max(of Time[*]); /* <- you have no array named Time, so you can't refer to it that way */
array max[13] C1-C13; /* <- same name as above? Each array needs its own name */
maximum=max(of Concentration[*]); /* <- no array named Concentration */
do n=i to 13; /* <- no loop is needed; that's why we use arrays and the functions */
if max[i]=maximum then
max=i;
end;
drop i;
run;
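For reference, here is a minimal sketch of what that step might look like once the points above are addressed. It assumes the merged data set is still called transposed and holds T1-T13 and C1-C13. The names tvals, cvals, max_time, and max_conc are placeholders chosen for illustration, not anything from the original code.

```sas
/* Sketch only: assumes WORK.TRANSPOSED has one row per subject
   with variables T1-T13 and C1-C13. */
data maximum;
    set transposed;               /* supplies the input data */
    array tvals[13] T1-T13;       /* hypothetical names that do not clash */
    array cvals[13] C1-C13;       /* with SAS functions such as MAX       */
    max_time = max(of tvals[*]);  /* largest time in this row */
    max_conc = max(of cvals[*]);  /* largest concentration in this row */
run;

proc print data=maximum;
    title 'Maximum Values';
run;
```

Writing the result to a new data set instead of overwriting transposed keeps the merged data intact if the step fails.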
Thank you @Reeza! I like that idea of writing out what I have to do then doing the code. I think that will help a lot in the future.
Did you ever figure this out?
I am stuck on the same step.
Really hard to follow your post. From what I can tell, you want to see the min, max, and mean of time and concentration at the end, correct? If so, the whole transpose, arrays, and suchlike are not needed:
proc sql;
  create table WANT as
  select SUBJECT,
         min(TIME) as MIN_TIME,
         max(TIME) as MAX_TIME,
         mean(TIME) as MEAN_TIME,
         min(CONCENTRATION) as MIN_CONCENTRATION,
         max(CONCENTRATION) as MAX_CONCENTRATION,
         mean(CONCENTRATION) as MEAN_CONCENTRATION
  from HAVE
  group by SUBJECT;
quit;
Obviously I am guessing a bit here as you have not provided any test data (in the form of a datastep) nor what the output should look like.
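The same per-subject summary can also be sketched with PROC MEANS on the long data, which is closer to what the original post was attempting. This assumes the long data set one from the first post (one row per subject and time point, with variables subject, time, and concentration); substitute your own data set name if it differs.

```sas
/* Sketch only: WORK.ONE is assumed to be the untransposed data
   with variables subject, time, and concentration. */
proc means data=one min max mean;
    class subject;               /* one block of statistics per subject */
    var time concentration;
    title 'Min Max Mean by Subject';
run;
```

With a CLASS statement the input does not need to be sorted, and the results go straight to the output window, so no extra DATA step or PROC PRINT is needed to see them.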
I'm actually in the same class as savanah and am having similar problems, and I still haven't been able to figure it out.
I have completed all of the tasks correctly except computing Tmax. My professor did not go beyond saying it was wrong.
For this part of the project I have:
data temp3;
set project2_transposed;
array conc[13] C1-C13;
Cmax= max(of conc{*});
array times[13] T1-T13;
Tmax= max(of times(of Cmax);
by subject;
run;
proc print data= temp3;
title "Part3";
run;
Again, Cmax is correct, and I do not get any errors in my log. However, the output numbers from Tmax are incorrect. If anyone could offer any advice I would be grateful.
Thank you,
Mackenzie
@mackenzies3 Post your log. I would expect errors from your code.
@mackenzies3 Your code here doesn’t match what you posted originally.
Look at your two max formulas. They’re different. Make them the same except for the name of the array. Delete the step and start from scratch if that makes it easier.
Okay, I tried making the formulas the same except for the arrays and put:
data temp3;
set project2_transposed;
array conc[13] C1-C13;
Cmax= max(of conc{*});
array times[13] T1-T13;
Tmax= max(of times{*});
by subject;
run;
proc print data= temp3;
title "Part3";
run;
The log I then get is:
1674 data temp3;
1675 set project2_transposed;
1676 array conc[13] C1-C13;
1677 Cmax= max(of conc{*});
1678 array times[13] T1-T13;
1679 Tmax= max(of times{*});
1680 by subject;
1681 run;
NOTE: There were 7 observations read from the data set WORK.PROJECT2_TRANSPOSED.
NOTE: The data set WORK.TEMP3 has 7 observations and 29 variables.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.03 seconds
1682
1683 proc print data= temp3;
1684 title "Part3";
1685 run;
NOTE: There were 7 observations read from the data set WORK.TEMP3.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.04 seconds
cpu time 0.03 seconds
So here I'm getting the maximum time, but not the time of the maximum concentration for each subject, which is where I am finding myself getting extremely confused.
I guess I just am having trouble understanding how exactly to go from having the max of the concentration (Cmax), and max of time, to finding the time of the highest concentration (Tmax).
We have 13 times and 13 concentrations for each of 7 different subjects. The highest time is 32 for all of the subjects, but the time at which the highest concentration occurs differs between subjects.
For instance, subject 1 has its highest concentration of 4.7 at time 2, but subject 2 has its highest concentration of 4.7 at time 3.
So I'm very confused about how to make Tmax = 2 for subject 1 but Tmax = 3 for subject 2, and so on.
Someone mentioned WHICHN(), but we have never talked about this in our class, so I'm not even sure where to start on that.
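Since WHICHN() came up: it returns the position of the first value in a list that equals its first argument, so it can locate where Cmax sits among C1-C13. Below is a rough sketch of that idea, assuming the same project2_transposed layout as above. The variable idx is a hypothetical helper name.

```sas
/* Sketch only: assumes T1-T13 and C1-C13 are aligned, so that
   T3 is the time at which C3 was measured, and so on. */
data temp3;
    set project2_transposed;
    array conc[13]  C1-C13;
    array times[13] T1-T13;
    Cmax = max(of conc[*]);            /* highest concentration in the row */
    idx  = whichn(Cmax, of conc[*]);   /* position of the first match */
    if idx > 0 then Tmax = times[idx]; /* time at that same position */
    drop idx;
run;
```

If the peak concentration occurs more than once for a subject, this picks the earliest occurrence.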
@mackenzies3 wrote:
I guess I just am having trouble understanding how exactly to go from having the max of the concentration (Cmax), and max of time, to finding the time of the highest concentration (Tmax).
Ah. See, that’s a different question. Before, you and the OP only asked for the maximum of the arrays.
This is easier if you loop, then.
Whichever element is the max gets stored, and you have its index, so you can use the index to determine the time point.
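A rough sketch of that loop, assuming the same project2_transposed data set and array layout as in the code posted above (this is one possible way to write it, not necessarily what the linked paper shows):

```sas
/* Sketch only: walks the concentration array, remembering the
   largest value seen so far and the time at the same index. */
data temp3;
    set project2_transposed;
    array conc[13]  C1-C13;
    array times[13] T1-T13;
    Cmax = conc[1];                    /* start with the first time point */
    Tmax = times[1];
    do i = 2 to dim(conc);
        if conc[i] > Cmax then do;     /* strictly greater: ties keep the earlier time */
            Cmax = conc[i];
            Tmax = times[i];
        end;
    end;
    drop i;
run;
```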
Oh, and burn this part into your brain, please: this is much easier if you don’t store your data in a wide format. Then you could sort and take the max easily.
See page 17 here
https://support.sas.com/resources/papers/97529_Using_Arrays_in_SAS_Programming.pdf
If you still need help, post back your code and log with what you’ve tried.
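To illustrate the point about long format, here is a sketch of the sort-and-take-the-first approach. It assumes the untransposed data set one from the first post (variables subject, time, concentration). The data set names sorted and peak are made up for the example.

```sas
/* Sketch only: put the highest concentration first within each subject,
   then keep that one row per subject. */
proc sort data=one out=sorted;
    by subject descending concentration time;
run;

data peak;
    set sorted;
    by subject;
    if first.subject;                        /* row holding the peak concentration */
    rename concentration=Cmax time=Tmax;     /* applied when the row is written out */
run;
```

Including time as the last sort key means that if the peak value repeats, the earliest time is the one kept.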
Have you figured this out yet?
I am stuck on the same step.
Yes, I did. Here is the email I received from the professor. It helped me a lot. Hope it helps you.