## SAS Homework Troubleshooting: Arrays and Do Loops

Solved
Occasional Contributor
Posts: 13

I was working on a homework problem regarding using arrays and looping to create a new variable to identify the date of when the maximum blood lead value was obtained but got stuck. For context, here is the homework problem:

In 1990 a study was done on the blood lead levels of children in Boston. The following variables for twenty-five children from the study have been entered on multiple lines per subject in the file lead_sum2018.txt in a list format:

Line 1 ID Number (numeric, values 1-25) Date of Birth (mmddyy8. format) Day of Blood Sample 1 (numeric, initial possible range: -9 to 31) Month of Blood Sample 1 (numeric, initial possible range: -9 to 12)

Line 2 ID Number (numeric, values 1-25) Day of Blood Sample 2 (numeric, initial possible range: -9 to 31) Month of Blood Sample 2 (numeric, initial possible range: -9 to 12)

Line 3 ID Number (numeric, values 1-25) Day of Blood Sample 3 (numeric, initial possible range: -9 to 31) Month of Blood Sample 3 (numeric, initial possible range: -9 to 12)

Line 4 ID Number (numeric, values 1-25) Blood Lead Level Sample 1 (numeric, possible range: 0.01 – 20.00) Blood Lead Level Sample 2 (numeric, possible range: 0.01 – 20.00) Blood Lead Level Sample 3 (numeric, possible range: 0.01 – 20.00) Sex (character, ‘M’ or ‘F’)

All blood samples were drawn in 1990. However, during data entry the order of blood samples was scrambled so that the first blood sample in the data file (blood sample 1) may not correspond to the first blood sample taken on a subject; it could be the first, second or third. In addition, some of the months and days of blood sampling were not written on the forms. At data entry, missing month and missing day values were each coded as -9.

The team of investigators for this project has made the following decisions regarding the missing values. Any missing days are to be set equal to 15, and any missing months are to be set equal to 6. Any analyses that are done on this data set need to follow those decisions. Be sure to implement the SAS syntax as indicated for each question. For example, use SAS arrays and loops if the item states that these must be used.

Here is the data that the HW references (it is in list format and was contained in a separate file called lead_sum2018.txt):

```
1 04/30/78 6 10
1 -9 7
1 14 1
1 1.62 1.35 1.47 F
2 05/19/79 27 11
2 20 -9
2 5 6
2 1.71 1.31 1.76 F
3 01/03/80 11 7
3 6 6
3 27 2
3 3.24 3.4 3.83 M
4 08/01/80 5 12
4 28 -9
4 3 4
4 3.1 3.69 3.27 M
5 12/26/80 21 5
5 3 7
5 -9 12
5 4.35 4.79 5.14 M
6 06/20/81 7 10
6 11 3
6 22 1
6 1.24 1.16 0.71 F
7 06/22/81 19 6
7 3 12
7 29 8
7 3.1 3.21 3.58 F
8 05/24/82 26 7
8 31 1
8 9 10
8 2.99 2.37 2.4 M
9 10/11/82 2 7
9 25 5
9 28 3
9 2.4 1.96 2.71 F
10 . 10 8
10 30 12
10 28 2
10 2.72 2.87 1.97 F
11 11/16/83 19 4
11 15 11
11 7 -9
11 4.8 4.5 4.96 M
12 03/02/84 17 6
12 11 2
12 17 11
12 2.38 2.6 2.88 F
13 04/19/84 2 12
13 -9 6
13 1 7
13 1.99 1.20 1.21 M
14 02/07/85 4 5
14 17 5
14 21 11
14 1.61 1.93 2.32 F
15 07/06/85 5 2
15 16 1
15 14 6
15 3.93 4 4.08 M
16 09/10/85 12 10
16 11 -9
16 23 6
16 3.29 2.88 2.97 M
17 11/05/85 12 7
17 18 1
17 11 11
17 1.31 0.98 1.04 F
18 12/07/85 16 2
18 18 4
18 -9 6
18 2.56 2.78 2.88 M
19 03/02/86 19 4
19 11 3
19 19 2
19 0.79 0.68 0.72 M
20 08/19/86 21 5
20 15 12
20 -9 4
20 0.66 1.15 1.42 F
21 02/22/87 16 12
21 17 9
21 13 4
21 2.92 3.27 3.23 M
22 10/11/87 7 6
22 1 12
22 -9 3
22 1.43 1.42 1.78 F
23 05/12/88 12 2
23 21 4
23 17 12
23 0.55 0.89 1.38 M
24 08/07/88 17 6
24 27 11
24 6 2
24 0.31 0.42 0.15 F
25 01/12/89 4 7
25 15 -9
25 23 1
25 1.69 1.58 1.53 M
```

A) Input the data and in the data step:

1) make sure that Date of Birth variable is recorded as a SAS date;

2) use SAS arrays and looping to create a SAS date variable for each of the three blood samples and to address the missing data in accordance to the decisions of the investigators. Hint: use a single array and do loop to recode the missing values for day and month, separately, and an array/do loop for creating the SAS date variable;

3) use a SAS function to create a variable for the highest, i.e., maximum, blood lead value for each child;

4) use SAS arrays and looping to identify the date on which this largest value was obtained and create a new variable for the date of the largest blood lead value;

5) determine the age of the child in years when the largest blood lead value was obtained (rounded to two decimal places);

6) create a new variable based on the age of the child in years when the largest lead value was obtained (call it, “agecat”) that takes on three levels: for children less than 4 years old, agecat should equal 1; for children at least 4 years old, but less than 8, agecat should equal 2; and for children at least 8 years of age, agecat should be 3.;

7) print out the variables for the date of birth, date of the largest lead level, age at blood sample for the largest blood lead level, agecat, sex, and the largest blood lead level (Only print out these requested variables). All dates should be formatted to use the mmddyy10. format on the output.

The code I used in response to this was:

```
libname HW3 'C:\Users\johns\Desktop\SAS';
data one;
infile HW3new;
informat dob mmddyy8.;
input #1 id dob dbs1 mbs1
#2 dbs2 mbs2
#3 dbs3 mbs3
#4 bls1 bls2 bls3 sex;
array dbs{3} dbs1 dbs2 dbs3;
array mbs{3} mbs1 mbs2 mbs3;
do i=1 to 3;
if dbs{i}=-9 then dbs{i}=15;
end;
do i=4 to 6;
if mbs{i}=-9 then mbs{i}=6;
end;
array date{3} mdy1 mdy2 mdy3;
do i=1 to 3;
date{i}=mdy(mbs{i}, dbs{i}, 1990);
end;
maxbls=max(of bls1-bls3);
array bls{3} bls1 bls2 bls3;
array maxdte{3} maxdte1 maxdte2 maxdte3;
do i=1 to i=3;
if bls{i}=maxbls then maxdte=i;
end;
agemax=maxdte-dob;
ageest=round(agemax/365.25,2);
if agemax=. then agecat=.;
else if agemax < 4 then agecat=1;
else if 4 <= agemax < 8 then agecat=2;
else if agemax ge 8 then agecat=3;
run;
```

```
22             maxbls=max(of bls1-bls3);
23             array bls{3} bls1 bls2 bls3;
24             array maxdte{3} maxdte1 maxdte2 maxdte3;
25             do i=1 to i=3;
26               if bls{i}=maxbls then maxdte=i;
ERROR: Illegal reference to the array maxdte.
27               end;
```

Does anyone have any tips in regard to this issue? What did I do wrong? Was I supposed to create an additional array for the date when the maximum blood lead sample value was collected? Thanks!

Accepted Solutions
Solution
‎06-06-2018 01:49 PM
Super User
Posts: 23,700

## Re: SAS Homework Troubleshooting: Arrays and Do Loops

You correctly referenced your arrays in the other cases, but not in this one. This is the offending piece of code:

```
array maxdte{3} maxdte1 maxdte2 maxdte3;
do i=1 to i=3;
if bls{i}=maxbls then maxdte (i)  =i;
end;
```

I'm assuming you're trying to:

4) use SAS arrays and looping to identify the date on which this largest value was obtained and create a new variable for the date of the largest blood lead value;

• Your loop is incorrect: the DO statement has an extra i= in it. Your other loops are correct, so change this DO statement to match them.
• Since you're looking for a single date you don't need an array in this case, you need a single variable, remove the mandate array declaration.
• You want the actual date, not the index i, so use i and the date array you have from earlier.
• You should change the name of your date array. DATE is a function in SAS and it can be confusing to someone else reading your code and isn't best practice.
• It's much easier to debug and find these issues if you comment the code with what you're trying to do. It forces you to keep it straight in your head and if you can't outline the steps, you can't code it anyways.

The answer is below, but you should probably try to fix it yourself first. Your code has some other errors as well.

Spoiler

    *find the date of the maximum blood test value;
    do i=1 to 3;
    if bls{i}=maxbls then maxdte=date(i); *change date to your new array name, if you change it;
    end;
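To see that loop in context, here is a minimal, self-contained sketch. It is not the assignment's solution: the data set name `demo` and the array name `smpdte` are made up for illustration, and the two data rows are the first two children's values with the -9 codes already replaced. It also shows why the extra `i=` matters: in `do i=1 to i=3`, the stop value is the comparison `i=3`, which evaluates to 0 (false), so once the code compiles the loop body never runs.

```sas
/* Sketch only: demo and smpdte are hypothetical names. */
data demo;
  input bls1 bls2 bls3
        smpdte1 :mmddyy10. smpdte2 :mmddyy10. smpdte3 :mmddyy10.;

  array bls{3}    bls1-bls3;
  array smpdte{3} smpdte1-smpdte3;  /* avoids reusing the function name DATE */

  maxbls = max(of bls1-bls3);

  /* Copy the date stored at the same position as the largest lead value. */
  /* maxdte is a single variable, not an array.                            */
  do i = 1 to dim(bls);
    if bls{i} = maxbls then maxdte = smpdte{i};
  end;

  format smpdte1-smpdte3 maxdte mmddyy10.;
  drop i;
  datalines;
1.62 1.35 1.47 10/06/1990 07/15/1990 01/14/1990
1.71 1.31 1.76 11/27/1990 06/20/1990 06/05/1990
;
run;

proc print data=demo;
run;
```

If two samples tie for the maximum, this loop keeps the date of the last one it meets; whether that is acceptable depends on the assignment.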

Another issue:

```
array mbs{3} mbs1 mbs2 mbs3; * <- array with only 3 elements;
do i=1 to 3;
if dbs{i}=-9 then dbs{i}=15;
end;
do i=4 to 6;  *<- you try and reference the array from element 4 to 6 which do not exist;
if mbs{i}=-9 then mbs{i}=6;
end;
```

That should at least get you past one error.
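One way to keep loop bounds from drifting past the end of an array is to let `dim()` supply the upper limit. The sketch below is only an illustration under assumed names (`recode_demo`, `smpdte`); the two data rows are the day and month values of the first two children, laid out on one line each for brevity.

```sas
/* Sketch only: recode_demo and smpdte are hypothetical names. */
data recode_demo;
  input dbs1 mbs1 dbs2 mbs2 dbs3 mbs3;

  array dbs{3}    dbs1-dbs3;
  array mbs{3}    mbs1-mbs3;
  array smpdte{3} smpdte1-smpdte3;

  /* Each array has its own index range 1..3; dim() returns that size. */
  do i = 1 to dim(dbs);
    if dbs{i} = -9 then dbs{i} = 15;   /* missing day becomes 15 */
  end;

  do i = 1 to dim(mbs);
    if mbs{i} = -9 then mbs{i} = 6;    /* missing month becomes 6 */
  end;

  /* Build the SAS date only after both recodes are done. */
  do i = 1 to dim(smpdte);
    smpdte{i} = mdy(mbs{i}, dbs{i}, 1990);
  end;

  format smpdte1-smpdte3 mmddyy10.;
  drop i;
  datalines;
6 10 -9 7 14 1
27 11 20 -9 5 6
;
run;
```

The index always restarts at 1 for every array, even when the arrays are declared one after another.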



Occasional Contributor
Posts: 13

## Re: SAS Homework Troubleshooting: Arrays and Do Loops

Hi Reeza! Thank you for your help! I have a few questions in regards to your comments (I haven't looked at the spoilers yet):

1. Isn't it okay for arrays to have 3 elements? I thought that arrays only needed to have at least two elements.

2. When you told me to remove the mandate array declaration, do you mean to remove the maxdte array?

3. I'm confused by your third bullet point. Shouldn't the maxdte array still equal something to correspond to the date of the maximum blood lead value? My previous do loop statements have always been an if then statement and the variables or arrays equaling something. For example:

do i=1 to 3;
if mbs{i}=-9 then mbs{i}=6;
end;

Also, since the question asks for me to create a new variable for the date of the largest blood lead value, shouldn't I be including that in the do loop?

4. Here is my corrected data step with the appropriate comments:

```
*Part A: Inputting the data;
libname HW3 'C:\Users\jackz\Desktop\SAS';
data one;
infile HW3new;
informat dob mmddyy8.;
input #1 id dob dbs1 mbs1
#2 dbs2 mbs2
#3 dbs3 mbs3
#4 bls1 bls2 bls3 sex;
*Part A1: Using Arrays and Looping to Recode Missing Values for Day and Month;
array dbs{3} dbs1 dbs2 dbs3;
array mbs{3} mbs1 mbs2 mbs3;
do i=1 to 3;
if dbs{i}=-9 then dbs{i}=15;
end;
do i=1 to 3;
if mbs{i}=-9 then mbs{i}=6;
end;
*Part A2: Using Arrays and Do Loops to Create the SAS Date Variable;
array dte{3} mdy1 mdy2 mdy3;
do i=1 to 3;
date{i}=mdy(mbs{i}, dbs{i}, 90);
end;
*Part A3: Using SAS Function to Create a Variable for the Maximum Blood Lead Value for Each Child;
maxbls=max(of bls1-bls3);
*Part A4: Using Arrays and Looping to Identify Date On Which Maximum Blood Lead Value Obtained;
array bls{3} bls1 bls2 bls3;
do i=1 to dim(bls);
if bls{i}=maxbls then dte{i};
end;
*Part A5: Determining Age of Child (yrs) When Largest Blood Lead Value Was Obtained;
agemax=maxdte-dob;
ageest=round(agemax/365.25,2);
*Part A6: Creating New Variable:Agecat;
if agemax=. then agecat=.;
else if agemax < 4 then agecat=1;
else if 4 <= agemax < 8 then agecat=2;
else if agemax ge 8 then agecat=3;
run;
```

Thanks again for all of your help!
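A side note on Parts A5 and A6, which has not come up in the thread so far: the second argument of ROUND is a rounding unit, so `round(x, 2)` rounds to the nearest multiple of 2, while `round(x, 0.01)` rounds to two decimal places. Part A6 also compares `agemax`, which is a number of days, against cutoffs stated in years. The sketch below is an illustration only; the data set name `age_demo` and the variable names `agedays`, `ageyears` and `unit2` are made up, and the two dates are child 1's date of birth and first sample date.

```sas
/* Sketch only: age_demo, agedays, ageyears and unit2 are hypothetical names. */
data age_demo;
  dob    = '30APR1978'd;   /* child 1 date of birth            */
  maxdte = '06OCT1990'd;   /* child 1 sample 1 (day 6, month 10) */

  agedays  = maxdte - dob;                    /* age in days               */
  ageyears = round(agedays / 365.25, 0.01);   /* unit 0.01: two decimals   */
  unit2    = round(agedays / 365.25, 2);      /* unit 2: multiples of 2    */

  /* Compare the age in years, not the day count, with the cutoffs. */
  if ageyears = . then agecat = .;
  else if ageyears < 4 then agecat = 1;
  else if ageyears < 8 then agecat = 2;
  else agecat = 3;

  format dob maxdte mmddyy10.;
  put dob= maxdte= agedays= ageyears= unit2= agecat=;
run;
```

Dividing by 365.25 is the approximation already used in the posted code; other conventions for age exist and the assignment may expect a particular one.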

Super User
Posts: 23,700

## Re: SAS Homework Troubleshooting: Arrays and Do Loops

1. Isn't it okay for arrays to have 3 elements? I thought that arrays only needed to have at least two elements.

Yes, an array can have more than one element. It can even have just one element, but at that point it's useless.

2. When you told me to remove the mandate array declaration, do you mean to remove the maxdte array?

Yes, remove maxdte array. The question asks you to create a new variable, not a new set of three variables with an array.

3. I'm confused by your third bullet point. Shouldn't the maxdte array still equal something to correspond to the date of the maximum blood lead value? My previous do loop statements have always been an if then statement and the variables or arrays equaling something

The question is: 4) use SAS arrays and looping to identify the date on which this largest value was obtained and create a new variable for the date of the largest blood lead value;

I didn't read the previous questions or code relevant to them to verify anything. In this particular question, an array for the variable created is not needed and you already have arrays for the other items, tests and dates so no new array is required.  You do need an if statement and assignment statement, but maxdte is not an array and you're not trying to store the value of i, you're trying to store the actual date.

Occasional Contributor
Posts: 13

## Re: SAS Homework Troubleshooting: Arrays and Do Loops

Thank you for your help! Also, are there other errors in my code that I should be cognizant of? You mentioned that one error I had regarding the number of elements in my arrays. However, after clarification, I'm assuming that that isn't really an error?

Super User
Posts: 23,700

## Re: SAS Homework Troubleshooting: Arrays and Do Loops

No, it generated an error when I tried to run your code. I did not check the rest of your assignment.

Occasional Contributor
Posts: 13

## Re: SAS Homework Troubleshooting: Arrays and Do Loops

Thank you for your help. When I tried to run this code:

```
libname HW3 'C:\Users\jackz\Desktop\SAS';
filename HW3new 'C:\Users\jackz\Desktop\SAS\lead_sum2018.txt';
data one;
infile HW3new;
informat dob mmddyy8.;
input #1 id dob dbs1 mbs1
#2 dbs2 mbs2
#3 dbs3 mbs3
#4 bls1 bls2 bls3 sex $;
*Part A1: Using Arrays and Looping to Recode Missing Values for Day and Month;
array dbs{3} dbs1 dbs2 dbs3;
array mbs{3} mbs1 mbs2 mbs3;
do i=1 to 3;
if dbs{i}=-9 then dbs{i}=15;
end;
do i=1 to 3;
if mbs{i}=-9 then mbs{i}=6;
end;
*Part A2: Using Arrays and Do Loops to Create the SAS Date Variable;
array dte{3} mdy1 mdy2 mdy3;
do i=1 to 3;
dte{i}=mdy(mbs{i}, dbs{i}, 1990);
end;
*Part A3: Using SAS Function to Create a Variable for the Maximum Blood Lead Value for Each Child;
maxbls=max(of bls1-bls3);
*Part A4: Using Arrays and Looping to Identify Date On Which Maximum Blood Lead Value Obtained;
array bls{3} bls1 bls2 bls3;
do i=1 to dim(bls);
if bls{i}=maxbls then maxdte=dte{i};
end;
*Part A5: Determining Age of Child (yrs) When Largest Blood Lead Value Was Obtained;
agemax=maxdte-dob;
ageest=round(agemax/365.25,2);
*Part A6: Creating New Variable:Agecat;
if agemax=. then agecat=.;
else if agemax < 4 then agecat=1;
else if 4 <= agemax < 8 then agecat=2;
else if agemax ge 8 then agecat=3;
run;
```

I had these errors in the log. Do you know what these errors mean:

NOTE: Copyright (c) 2016 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software 9.4 (TS1M5)
Licensed to BOSTON UNIVERSITY - SFA T&R, Site 70009029.
NOTE: This session is executing on the W32_10HOME platform.

NOTE: Updated analytical products:

SAS/STAT 14.3
SAS/ETS 14.3
SAS/OR 14.3
SAS/IML 14.3
SAS/QC 14.3

W32_10HOME WIN 10.0.16299 Workstation

NOTE: SAS initialization used:
real time 1.23 seconds
cpu time 1.12 seconds

1 libname HW3 'C:\Users\jackz\Desktop\SAS';
NOTE: Libref HW3 was successfully assigned as follows:
Engine: V9
Physical Name: C:\Users\jackz\Desktop\SAS
3 data one;
4 infile HW3new;
5 informat dob mmddyy8.;
6 input #1 id dob dbs1 mbs1
7 #2 dbs2 mbs2
8 #3 dbs3 mbs3
9 #4 bls1 bls2 bls3 sex $;
10 *Part A1: Using Arrays and Looping to Recode Missing Values for Day and Month;
11 array dbs{3} dbs1 dbs2 dbs3;
12 array mbs{3} mbs1 mbs2 mbs3;
13 do i=1 to 3;
14 if dbs{i}=-9 then dbs{i}=15;
15 end;
16 do i=1 to 3;
17 if mbs{i}=-9 then mbs{i}=6;
18 end;
19 *Part A2: Using Arrays and Do Loops to Create the SAS Date Variable;
20 array dte{3} mdy1 mdy2 mdy3;
21 do i=1 to 3;
22 dte{i}=mdy(mbs{i}, dbs{i}, 1990);
23 end;
24 *Part A3: Using SAS Function to Create a Variable for the Maximum Blood Lead Value for Each
24 ! Child;
25 maxbls=max(of bls1-bls3);
26 *Part A4: Using Arrays and Looping to Identify Date On Which Maximum Blood Lead Value Obtained
26 ! ;
27 array bls{3} bls1 bls2 bls3;
28 do i=1 to dim(bls);
29 if bls{i}=maxbls then maxdte=dte{i};
30 end;
31 *Part A5: Determining Age of Child (yrs) When Largest Blood Lead Value Was Obtained;
32 agemax=maxdte-dob;
33 ageest=round(agemax/365.25,2);
34 *Part A6: Creating New Variable:Agecat;
35 if agemax=. then agecat=.;
36 else if agemax < 4 then agecat=1;
37 else if 4 <= agemax < 8 then agecat=2;
38 else if agemax ge 8 then agecat=3;
39 run;

NOTE: The infile HW3NEW is:
RECFM=V,LRECL=32767,File Size (bytes)=1374,
Create Time=01Jun2018:19:47:47

NOTE: Invalid argument to function MDY(14,1,1990) at line 22 column 12.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----
1 1 04/30/78 6 10 15
2 1 -9 7 6
3 1 14 1 6
4 1 1.62 1.35 1.47 F 18
dob=6694 id=1 dbs1=6 mbs1=10 dbs2=1 mbs2=6 dbs3=1 mbs3=14 bls1=1 bls2=1.62 bls3=1.35 sex=1.47 i=4
mdy1=11236 mdy2=11109 mdy3=. maxbls=1.62 maxdte=11109 agemax=4415 ageest=12 agecat=3 _ERROR_=1
_N_=1
NOTE: Invalid argument to function MDY(20,2,1990) at line 22 column 12.
5 2 05/19/79 27 11 16
6 2 20 -9 7
7 2 5 6 5
8 2 1.71 1.31 1.76 F 18
dob=7078 id=2 dbs1=27 mbs1=11 dbs2=2 mbs2=20 dbs3=2 mbs3=5 bls1=2 bls2=1.71 bls3=1.31 sex=1.76 i=4
mdy1=11288 mdy2=. mdy3=11079 maxbls=2 maxdte=11288 agemax=4210 ageest=12 agecat=3 _ERROR_=1 _N_=2
NOTE: Invalid argument to function MDY(27,3,1990) at line 22 column 12.
9 3 01/03/80 11 7 15
10 3 6 6 5
11 3 27 2 6
12 3 3.24 3.4 3.83 M 17
dob=7307 id=3 dbs1=11 mbs1=7 dbs2=3 mbs2=6 dbs3=3 mbs3=27 bls1=3 bls2=3.24 bls3=3.4 sex=3.83 i=4
mdy1=11149 mdy2=11111 mdy3=. maxbls=3.4 maxdte=. agemax=. ageest=. agecat=. _ERROR_=1 _N_=3
NOTE: Invalid argument to function MDY(28,4,1990) at line 22 column 12.
13 4 08/01/80 5 12 15
14 4 28 -9 7
15 4 3 4 5
16 4 3.1 3.69 3.27 M 17
dob=7518 id=4 dbs1=5 mbs1=12 dbs2=4 mbs2=28 dbs3=4 mbs3=3 bls1=4 bls2=3.1 bls3=3.69 sex=3.27 i=4
mdy1=11296 mdy2=. mdy3=11020 maxbls=4 maxdte=11296 agemax=3778 ageest=10 agecat=3 _ERROR_=1 _N_=4
NOTE: Invalid argument to function MDY(22,6,1990) at line 22 column 12.
21 6 06/20/81 7 10 15
22 6 11 3 6
23 6 22 1 6
24 6 1.24 1.16 0.71 F 18
dob=7841 id=6 dbs1=7 mbs1=10 dbs2=6 mbs2=11 dbs3=6 mbs3=22 bls1=6 bls2=1.24 bls3=1.16 sex=0.71 i=4
mdy1=11237 mdy2=11267 mdy3=. maxbls=6 maxdte=11237 agemax=3396 ageest=10 agecat=3 _ERROR_=1 _N_=6
NOTE: Invalid argument to function MDY(29,7,1990) at line 22 column 12.
25 7 06/22/81 19 6 15
26 7 3 12 6
27 7 29 8 6
28 7 3.1 3.21 3.58 F 17
dob=7843 id=7 dbs1=19 mbs1=6 dbs2=7 mbs2=3 dbs3=7 mbs3=29 bls1=7 bls2=3.1 bls3=3.21 sex=3.58 i=4
mdy1=11127 mdy2=11023 mdy3=. maxbls=7 maxdte=11127 agemax=3284 ageest=8 agecat=3 _ERROR_=1 _N_=7
NOTE: Invalid argument to function MDY(31,8,1990) at line 22 column 12.
29 8 05/24/82 26 7 15
30 8 31 1 6
31 8 9 10 6
32 8 2.99 2.37 2.4 M 17
dob=8179 id=8 dbs1=26 mbs1=7 dbs2=8 mbs2=31 dbs3=8 mbs3=9 bls1=8 bls2=2.99 bls3=2.37 sex=2.4 i=4
mdy1=11164 mdy2=. mdy3=11208 maxbls=8 maxdte=11164 agemax=2985 ageest=8 agecat=3 _ERROR_=1 _N_=8
NOTE: Invalid argument to function MDY(25,9,1990) at line 22 column 12.
NOTE: Invalid argument to function MDY(28,9,1990) at line 22 column 12.
33 9 10/11/82 2 7 14
34 9 25 5 6
35 9 28 3 6
36 9 2.4 1.96 2.71 F 17
dob=8319 id=9 dbs1=2 mbs1=7 dbs2=9 mbs2=25 dbs3=9 mbs3=28 bls1=9 bls2=2.4 bls3=1.96 sex=2.71 i=4
mdy1=11140 mdy2=. mdy3=. maxbls=9 maxdte=11140 agemax=2821 ageest=8 agecat=3 _ERROR_=1 _N_=9
NOTE: Invalid argument to function MDY(30,10,1990) at line 22 column 12.
NOTE: Invalid argument to function MDY(28,10,1990) at line 22 column 12.
37 10 . 10 8 9
38 10 30 12 8
39 10 28 2 7
40 10 2.72 2.87 1.97 F 19
dob=. id=10 dbs1=10 mbs1=8 dbs2=10 mbs2=30 dbs3=10 mbs3=28 bls1=10 bls2=2.72 bls3=2.87 sex=1.97 i=4
mdy1=11179 mdy2=. mdy3=. maxbls=10 maxdte=11179 agemax=. ageest=. agecat=. _ERROR_=1 _N_=10
NOTE: Invalid argument to function MDY(15,11,1990) at line 22 column 12.
41 11 11/16/83 19 4 16
42 11 15 11 8
43 11 7 -9 7
44 11 4.8 4.5 4.96 M 17
dob=8720 id=11 dbs1=19 mbs1=4 dbs2=11 mbs2=15 dbs3=11 mbs3=7 bls1=11 bls2=4.8 bls3=4.5 sex=4.96 i=4
mdy1=11066 mdy2=. mdy3=11149 maxbls=11 maxdte=11066 agemax=2346 ageest=6 agecat=3 _ERROR_=1 _N_=11
NOTE: Invalid argument to function MDY(17,12,1990) at line 22 column 12.
45 12 03/02/84 17 6 16
46 12 11 2 7
47 12 17 11 8
48 12 2.38 2.6 2.88 F 18
dob=8827 id=12 dbs1=17 mbs1=6 dbs2=12 mbs2=11 dbs3=12 mbs3=17 bls1=12 bls2=2.38 bls3=2.6 sex=2.88
i=4 mdy1=11125 mdy2=11273 mdy3=. maxbls=12 maxdte=11125 agemax=2298 ageest=6 agecat=3 _ERROR_=1
_N_=12
NOTE: Invalid argument to function MDY(17,14,1990) at line 22 column 12.
NOTE: Invalid argument to function MDY(21,14,1990) at line 22 column 12.
53 14 02/07/85 4 5 15
54 14 17 5 7
55 14 21 11 8
56 14 1.61 1.93 2.32 F 19
dob=9169 id=14 dbs1=4 mbs1=5 dbs2=14 mbs2=17 dbs3=14 mbs3=21 bls1=14 bls2=1.61 bls3=1.93 sex=2.32
i=4 mdy1=11081 mdy2=. mdy3=. maxbls=14 maxdte=11081 agemax=1912 ageest=6 agecat=3 _ERROR_=1 _N_=14
NOTE: Invalid argument to function MDY(16,15,1990) at line 22 column 12.
NOTE: Invalid argument to function MDY(14,15,1990) at line 22 column 12.
57 15 07/06/85 5 2 15
58 15 16 1 7
59 15 14 6 7
60 15 3.93 4 4.08 M 16
dob=9318 id=15 dbs1=5 mbs1=2 dbs2=15 mbs2=16 dbs3=15 mbs3=14 bls1=15 bls2=3.93 bls3=4 sex=4.08 i=4
mdy1=10993 mdy2=. mdy3=. maxbls=15 maxdte=10993 agemax=1675 ageest=4 agecat=3 _ERROR_=1 _N_=15
NOTE: Invalid argument to function MDY(23,16,1990) at line 22 column 12.
61 16 09/10/85 12 10 17
62 16 11 -9 8
63 16 23 6 7
64 16 3.29 2.88 2.97 M 19
dob=9384 id=16 dbs1=12 mbs1=10 dbs2=16 mbs2=11 dbs3=16 mbs3=23 bls1=16 bls2=3.29 bls3=2.88 sex=2.97
i=4 mdy1=11242 mdy2=11277 mdy3=. maxbls=16 maxdte=11242 agemax=1858 ageest=6 agecat=3 _ERROR_=1
_N_=16
NOTE: Invalid argument to function MDY(18,17,1990) at line 22 column 12.
65 17 11/05/85 12 7 16
66 17 18 1 7
67 17 11 11 8
68 17 1.31 0.98 1.04 F 19
dob=9440 id=17 dbs1=12 mbs1=7 dbs2=17 mbs2=18 dbs3=17 mbs3=11 bls1=17 bls2=1.31 bls3=0.98 sex=1.04
i=4 mdy1=11150 mdy2=. mdy3=11278 maxbls=17 maxdte=11150 agemax=1710 ageest=4 agecat=3 _ERROR_=1
_N_=17
NOTE: Invalid argument to function MDY(18,18,1990) at line 22 column 12.
69 18 12/07/85 16 2 16
70 18 18 4 7
71 18 -9 6 7
72 18 2.56 2.78 2.88 M 19
dob=9472 id=18 dbs1=16 mbs1=2 dbs2=18 mbs2=18 dbs3=18 mbs3=6 bls1=18 bls2=2.56 bls3=2.78 sex=2.88
i=4 mdy1=11004 mdy2=. mdy3=11126 maxbls=18 maxdte=11004 agemax=1532 ageest=4 agecat=3 _ERROR_=1
_N_=18
NOTE: Invalid argument to function MDY(19,19,1990) at line 22 column 12.
73 19 03/02/86 19 4 16
74 19 11 3 7
75 19 19 2 7
76 19 0.79 0.68 0.72 M 19
dob=9557 id=19 dbs1=19 mbs1=4 dbs2=19 mbs2=11 dbs3=19 mbs3=19 bls1=19 bls2=0.79 bls3=0.68 sex=0.72
i=4 mdy1=11066 mdy2=11280 mdy3=. maxbls=19 maxdte=11066 agemax=1509 ageest=4 agecat=3 _ERROR_=1
_N_=19
NOTE: Invalid argument to function MDY(15,20,1990) at line 22 column 12.
77 20 08/19/86 21 5 16
78 20 15 12 8
79 20 -9 4 7
80 20 0.66 1.15 1.42 F 19
dob=9727 id=20 dbs1=21 mbs1=5 dbs2=20 mbs2=15 dbs3=20 mbs3=6 bls1=20 bls2=0.66 bls3=1.15 sex=1.42
i=4 mdy1=11098 mdy2=. mdy3=11128 maxbls=20 maxdte=11098 agemax=1371 ageest=4 agecat=3 _ERROR_=1
_N_=20
NOTE: Invalid argument to function MDY(17,21,1990) at line 22 column 12.
NOTE: Invalid argument to function MDY(13,21,1990) at line 22 column 12.
81 21 02/22/87 16 12 17
82 21 17 9 7
83 21 13 4 7
84 21 2.92 3.27 3.23 M 19
dob=9914 id=21 dbs1=16 mbs1=12 dbs2=21 mbs2=17 dbs3=21 mbs3=13 bls1=21 bls2=2.92 bls3=3.27 sex=3.23
i=4 mdy1=11307 mdy2=. mdy3=. maxbls=21 maxdte=11307 agemax=1393 ageest=4 agecat=3 _ERROR_=1 _N_=21
NOTE: Invalid argument to function MDY(21,23,1990) at line 22 column 12.
NOTE: Invalid argument to function MDY(17,23,1990) at line 22 column 12.
WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed.
89 23 05/12/88 12 2 16
90 23 21 4 7
91 23 17 12 8
92 23 0.55 0.89 1.38 M 19
dob=10359 id=23 dbs1=12 mbs1=2 dbs2=23 mbs2=21 dbs3=23 mbs3=17 bls1=23 bls2=0.55 bls3=0.89 sex=1.38
i=4 mdy1=11000 mdy2=. mdy3=. maxbls=23 maxdte=11000 agemax=641 ageest=2 agecat=3 _ERROR_=1 _N_=23
NOTE: 100 records were read from the infile HW3NEW.
The minimum record length was 5.
The maximum record length was 19.
NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line):(Column).
2 at 32:14 2 at 33:8 2 at 33:20
NOTE: Mathematical operations could not be performed at the following places. The results of the
operations have been set to missing values.
Each place is given by: (Number of times) at (Line):(Column).
29 at 22:12
NOTE: The data set WORK.ONE has 25 observations and 21 variables.
NOTE: DATA statement used (Total process time):
real time 0.14 seconds
cpu time 0.12 seconds

In short, errors such as these:

NOTE: Invalid argument to function MDY(14,1,1990) at line 22 column 12.

WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed.

Super User
Posts: 23,700

## Re: SAS Homework Troubleshooting: Arrays and Do Loops

NOTE: Invalid argument to function MDY(14,1,1990) at line 22 column 12.

MDY takes its arguments in month, day, year order, and 14 was passed as the month. No such month exists, so SAS reports an invalid argument.

Line 22, column 12: check the log for that specific line and column to see the exact code:

``22 dte{i}=mdy(mbs{i}, dbs{i}, 1990);``

So some value in mbs is invalid.

If you keep looking at the log you can find the i that's generating the error:

dob=6694 id=1 dbs1=6 mbs1=10 dbs2=1 mbs2=6 dbs3=1 mbs3=14 bls1=1 bls2=1.62 bls3=1.35 sex=1.47 i=4
mdy1=11236 mdy2=11109 mdy3=. maxbls=1.62 maxdte=11109 agemax=4415 ageest=12 agecat=3 _ERROR_=1
_N_=1

_N_=1 tells you that it's occurring on the first iteration of the DATA step, that is, the first subject in the data you're trying to read.
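One way to see which array element is failing is to check the values inside the loop and write the offending ones to the log before MDY is called. The following is only a sketch under assumptions: the data set name `check` and the inline `datalines` rows are made up for illustration (the second row copies the mis-read values shown in the log dump for id=1), and the range test is deliberately coarse.

```sas
/* Sketch only: "check" and the datalines rows are illustrative.      */
/* Row 1: values for id=1 as they appear in the raw file.             */
/* Row 2: values for id=1 as they were mis-read in the log dump.      */
data check;
    input id dbs1 mbs1 dbs2 mbs2 dbs3 mbs3;
    array dbs{3} dbs1-dbs3;
    array mbs{3} mbs1-mbs3;
    array dte{3} dte1-dte3;
    do i = 1 to dim(dte);
        /* Investigators' rule: missing day -> 15, missing month -> 6 */
        if dbs{i} = -9 then dbs{i} = 15;
        if mbs{i} = -9 then mbs{i} = 6;
        /* Coarse range check; it does not catch, for example,        */
        /* day 31 in a 30-day month.                                  */
        if 1 <= mbs{i} <= 12 and 1 <= dbs{i} <= 31 then
            dte{i} = mdy(mbs{i}, dbs{i}, 1990);
        else
            put "Out-of-range value: " id= i= dbs{i}= mbs{i}=;
    end;
    format dte1-dte3 date9.;
    drop i;
    datalines;
1 6 10 -9 7 14 1
1 6 10 1 -9 1 14
;
run;
```

Only the second row should produce a message, and it reports the loop index at the moment of the problem rather than its value at the end of the step.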

Super User
Posts: 23,700

## Re: SAS Homework Troubleshooting: Arrays and Do Loops

I think the i part is actually wrong: i=4 is just the value of the index after the loop has ended. Still, the dump gives you an idea of where to start looking.
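The reason i shows as 4 in the dump can be seen with a short stand-alone step. This is a sketch, not part of the original program: an iterative DO loop increments the index once more before the stop test fails, so after `do i=1 to 3` the index holds 4.

```sas
/* Stand-alone illustration; writes only to the log. */
data _null_;
    do i = 1 to 3;
        put "inside loop: " i=;
    end;
    /* The loop exits when i exceeds 3, so this writes i=4. */
    put "after loop: " i=;
run;
```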

Did you change anything from previous runs as well? When I ran your code previously I didn't get any of those errors.

Occasional Contributor
Posts: 13

## Re: SAS Homework Troubleshooting: Arrays and Do Loops

Thanks! I found out that it was because I was using list input and did not consistently include id as a variable for each row of a record. SAS therefore read the ID as the day value and the day as the month. Thanks for all of your help!
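For readers hitting the same problem, here is a minimal sketch of list input that consumes the ID on every line of a four-line record. It is an assumption about what the fix might look like, not the poster's actual INPUT statement: it reads from inline `datalines` rather than the HW3NEW fileref, and the rows are the first two subjects' raw records as echoed in the log.

```sas
/* Sketch only: inline data stands in for the external file. */
data one;
    /* "/" advances to the next line; id is read on each line  */
    /* so the remaining fields line up with their variables.   */
    input id dob : mmddyy8. dbs1 mbs1
        / id dbs2 mbs2
        / id dbs3 mbs3
        / id bls1 bls2 bls3 sex $;
    format dob mmddyy10.;
    datalines;
1 04/30/78 6 10
1 -9 7
1 14 1
1 1.62 1.35 1.47 F
2 05/19/79 27 11
2 20 -9
2 5 6
2 1.71 1.31 1.76 F
;
run;

proc print data=one noobs;
run;
```

With the ID consumed on each line, the third sample for ID 1 is read as day 14, month 1, so the later MDY call receives a valid month instead of 14.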

☑ This topic is solved.