Helping with SAS Programming (Reading in Data with Dates and Arrays and Looping)

Reply
Occasional Contributor
Posts: 13

Helping with SAS Programming (Reading in Data with Dates and Arrays and Looping)

I was working on a homework problem regarding using arrays and looping to create a new variable to identify the date of when the maximum blood lead value was obtained but got stuck.

For context, here is the homework problem:

In 1990 a study was done on the blood lead levels of children in Boston. The following variables for twenty-five children from the study have been entered on multiple lines per subject in the file lead_sum2018.txt in a list format:

 

Again, the data stored in a txt file called lead_sum2018.txt and the data was entered in a list format as shown below: 

 

1 04/30/78 6 10
1 -9 7
1 14 1
1 1.62 1.35 1.47 F
2 05/19/79 27 11
2 20 -9
2 5 6
2 1.71 1.31 1.76 F
3 01/03/80 11 7
3 6 6
3 27 2
3 3.24 3.4 3.83 M
4 08/01/80 5 12
4 28 -9
4 3 4
4 3.1 3.69 3.27 M
5 12/26/80 21 5
5 3 7
5 -9 12
5 4.35 4.79 5.14 M
6 06/20/81 7 10
6 11 3
6 22 1
6 1.24 1.16 0.71 F
7 06/22/81 19 6
7 3 12
7 29 8
7 3.1 3.21 3.58 F
8 05/24/82 26 7
8 31 1
8 9 10
8 2.99 2.37 2.4 M
9 10/11/82 2 7
9 25 5
9 28 3
9 2.4 1.96 2.71 F
10 . 10 8
10 30 12
10 28 2
10 2.72 2.87 1.97 F
11 11/16/83 19 4
11 15 11
11 7 -9
11 4.8 4.5 4.96 M
12 03/02/84 17 6
12 11 2
12 17 11
12 2.38 2.6 2.88 F
13 04/19/84 2 12
13 -9 6
13 1 7
13 1.99 1.20 1.21 M
14 02/07/85 4 5
14 17 5
14 21 11
14 1.61 1.93 2.32 F
15 07/06/85 5 2
15 16 1
15 14 6
15 3.93 4 4.08 M
16 09/10/85 12 10
16 11 -9
16 23 6
16 3.29 2.88 2.97 M
17 11/05/85 12 7
17 18 1
17 11 11
17 1.31 0.98 1.04 F
18 12/07/85 16 2
18 18 4
18 -9 6
18 2.56 2.78 2.88 M
19 03/02/86 19 4
19 11 3
19 19 2
19 0.79 0.68 0.72 M
20 08/19/86 21 5
20 15 12
20 -9 4
20 0.66 1.15 1.42 F
21 02/22/87 16 12
21 17 9
21 13 4
21 2.92 3.27 3.23 M
22 10/11/87 7 6
22 1 12
22 -9 3
22 1.43 1.42 1.78 F
23 05/12/88 12 2
23 21 4
23 17 12
23 0.55 0.89 1.38 M
24 08/07/88 17 6
24 27 11
24 6 2
24 0.31 0.42 0.15 F
25 01/12/89 4 7
25 15 -9
25 23 1
25 1.69 1.58 1.53 M

 

 

Line 1 ID Number (numeric, values 1-25) Date of Birth (mmddyy8. format) Day of Blood Sample 1 (numeric, initial possible range: -9 to 31) Month of Blood Sample 1 (numeric, initial possible range: -9 to 12)

Line 2 ID Number (numeric, values 1-25) Day of Blood Sample 2 (numeric, initial possible range: -9 to 31) Month of Blood Sample 2 (numeric, initial possible range: -9 to 12)

Line 3 ID Number (numeric, values 1-25) Day of Blood Sample 3 (numeric, initial possible range: -9 to 31) Month of Blood Sample 3 (numeric, initial possible range: -9 to 12)

Line 4 ID Number (numeric, values 1-25) Blood Lead Level Sample 1 (numeric, possible range: 0.01 – 20.00) Blood Lead Level Sample 2 (numeric, possible range: 0.01 – 20.00) Blood Lead Level Sample 3 (numeric, possible range: 0.01 – 20.00) Sex (character, ‘M’ or ‘F’)

All blood samples were drawn in 1990. However, during data entry the order of blood samples was scrambled so that the first blood sample in the data file (blood sample 1) may not correspond to the first blood sample taken on a subject, it could be the first, second or third. In addition, some of the months and days and days of blood sampling were not written on the forms. At data entry, missing month and missing day values were each coded as -9.

The team of investigators for this project has made the following decisions regarding the missing values. Any missing days are to set equal to 15, any missing months are to be set equal to 6. Any analyses that are done on this data set need to follow those decisions. Be sure to implement the SAS syntax as indicated for each question. For example, use SAS arrays and loops if the item states that these must be used.

A) Input the data and in the data step:

1) make sure that Date of Birth variable is recorded as a SAS date;

2) use SAS arrays and looping to create a SAS date variable for each of the three blood samples and to address the missing data in accordance to the decisions of the investigators. Hint: use a single array and do loop to recode the missing values for day and month, separately, and an array/do loop for creating the SAS date variable;

3) use a SAS function to create a variable for the highest, i.e., maximum, blood lead value for each child;

4) use SAS arrays and looping to identify the date on which this largest value was obtained and create a new variable for the date of the largest blood lead value;

5) determine the age of the child in years when the largest blood lead value was obtained (rounded to two decimal places);

6) create a new variable based on the age of the child in years when the largest lead value was obtained (call it, “agecat”) that takes on three levels: for children less than 4 years old, agecat should equal 1; for children at least 4 years old, but less than 8, agecat should equal 2; and for children at least 8 years of age, agecat should be 3.;

7) print out the variables for the date of birth, date of the largest lead level, age at blood sample for the largest blood lead level, agecat, sex, and the largest blood lead level (Only print out these requested variables). All dates should be formatted to use the mmddyy10. format on the output.

The code I used in response to this was:

libname HW3 'C:\Users\jackz\Desktop\SAS';
filename HW3new 'C:\Users\jackz\Desktop\SAS\lead_sum2018.txt';
data one; 
    infile HW3new;
    informat dob mmddyy8.; 
    input #1 id dob dbs1 mbs1 
            #2 dbs2 mbs2
            #3 dbs3 mbs3
            #4 bls1 bls2 bls3 sex $;
*Part A1: Using Arrays and Looping to Recode Missing Values for Day and Month; 
array dbs{3} dbs1 dbs2 dbs3;
array mbs{3} mbs1 mbs2 mbs3;
do i=1 to 3; 
    if dbs{i}=-9 then dbs{i}=15;
 end; 
do i=1 to 3;
    if mbs{i}=-9 then mbs{i}=6;
end; 
*Part A2: Using Arrays and Do Loops to Create the SAS Date Variable;
array dte{3} mdy1 mdy2 mdy3;
    do i=1 to 3;
    dte{i}=mdy(mbs{i}, dbs{i}, 1990);
end; 
*Part A3: Using SAS Function to Create a Variable for the Maximum Blood Lead Value for Each 
 Child; 
maxbls=max(of bls1-bls3);
*Part A4: Using Arrays and Looping to Identify Date On Which Maximum Blood Lead Value 
Obtained; 
array bls{3} bls1 bls2 bls3; 
    do i=1 to 3;
    if bls{i}=maxbls then maxdte=dte{i};
end;
*Part A5: Determining Age of Child (yrs) When Largest Blood Lead Value Was Obtained;
agemax=maxdte-dob; 
ageest=round(agemax/365.25,2);
 *Part A6: Creating New Variable:Agecat;
if agemax=. then agecat=.;
    else if agemax < 4 then agecat=1; 
    else if 4 <= agemax < 8 then agecat=2;
    else if agemax ge 8 then agecat=3; 
run;

 

 The errors I received were: 

 

NOTE: Copyright (c) 2016 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software 9.4 (TS1M5)
Licensed to BOSTON UNIVERSITY - SFA T&R, Site 70009029.
NOTE: This session is executing on the W32_10HOME platform.

 

NOTE: Updated analytical products:

SAS/STAT 14.3
SAS/ETS 14.3
SAS/OR 14.3
SAS/IML 14.3
SAS/QC 14.3

NOTE: Additional host information:

W32_10HOME WIN 10.0.16299 Workstation

NOTE: SAS initialization used:
real time 1.26 seconds
cpu time 1.15 seconds

1 libname HW3 'C:\Users\jackz\Desktop\SAS';
NOTE: Libref HW3 was successfully assigned as follows:
Engine: V9
Physical Name: C:\Users\jackz\Desktop\SAS
2 filename HW3new 'C:\Users\jackz\Desktop\SAS\lead_sum2018.txt';
3 data one;
4 infile HW3new;
5 informat dob mmddyy8.;
6 input #1 id dob dbs1 mbs1
7 #2 dbs2 mbs2
8 #3 dbs3 mbs3
9 #4 bls1 bls2 bls3 sex $;
10 *Part A1: Using Arrays and Looping to Recode Missing Values for Day and Month;
11 array dbs{3} dbs1 dbs2 dbs3;
12 array mbs{3} mbs1 mbs2 mbs3;
13 do i=1 to 3;
14 if dbs{i}=-9 then dbs{i}=15;
15 end;
16 do i=1 to 3;
17 if mbs{i}=-9 then mbs{i}=6;
18 end;
19 *Part A2: Using Arrays and Do Loops to Create the SAS Date Variable;
20 array dte{3} mdy1 mdy2 mdy3;
21 do i=1 to 3;
22 dte{i}=mdy(mbs{i}, dbs{i}, 1990);
23 end;
24 *Part A3: Using SAS Function to Create a Variable for the Maximum Blood Lead Value for Each
24 ! Child;
25 maxbls=max(of bls1-bls3);
26 *Part A4: Using Arrays and Looping to Identify Date On Which Maximum Blood Lead Value Obtained
26 ! ;
27 array bls{3} bls1 bls2 bls3;
28 do i=1 to 3;
29 if bls{i}=maxbls then maxdte=dte{i};
30 end;
31 *Part A5: Determining Age of Child (yrs) When Largest Blood Lead Value Was Obtained;
32 agemax=maxdte-dob;
33 ageest=round(agemax/365.25,2);
34 *Part A6: Creating New Variable:Agecat;
35 if agemax=. then agecat=.;
36 else if agemax < 4 then agecat=1;
37 else if 4 <= agemax < 8 then agecat=2;
38 else if agemax ge 8 then agecat=3;
39 run;

NOTE: The infile HW3NEW is:
Filename=C:\Users\jackz\Desktop\SAS\lead_sum2018.txt,
RECFM=V,LRECL=32767,File Size (bytes)=1374,
Last Modified=01Jun2018:19:47:47,
Create Time=01Jun2018:19:47:47

NOTE: Invalid argument to function MDY(14,1,1990) at line 22 column 12.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----
1 1 04/30/78 6 10 15
2 1 -9 7 6
3 1 14 1 6
4 1 1.62 1.35 1.47 F 18
dob=6694 id=1 dbs1=6 mbs1=10 dbs2=1 mbs2=6 dbs3=1 mbs3=14 bls1=1 bls2=1.62 bls3=1.35 sex=1.47 i=4
mdy1=11236 mdy2=11109 mdy3=. maxbls=1.62 maxdte=11109 agemax=4415 ageest=12 agecat=3 _ERROR_=1
_N_=1
NOTE: Invalid argument to function MDY(20,2,1990) at line 22 column 12.
5 2 05/19/79 27 11 16
6 2 20 -9 7
7 2 5 6 5
8 2 1.71 1.31 1.76 F 18
dob=7078 id=2 dbs1=27 mbs1=11 dbs2=2 mbs2=20 dbs3=2 mbs3=5 bls1=2 bls2=1.71 bls3=1.31 sex=1.76 i=4
mdy1=11288 mdy2=. mdy3=11079 maxbls=2 maxdte=11288 agemax=4210 ageest=12 agecat=3 _ERROR_=1 _N_=2
NOTE: Invalid argument to function MDY(27,3,1990) at line 22 column 12.
9 3 01/03/80 11 7 15
10 3 6 6 5
11 3 27 2 6
12 3 3.24 3.4 3.83 M 17
dob=7307 id=3 dbs1=11 mbs1=7 dbs2=3 mbs2=6 dbs3=3 mbs3=27 bls1=3 bls2=3.24 bls3=3.4 sex=3.83 i=4
mdy1=11149 mdy2=11111 mdy3=. maxbls=3.4 maxdte=. agemax=. ageest=. agecat=. _ERROR_=1 _N_=3
NOTE: Invalid argument to function MDY(28,4,1990) at line 22 column 12.
13 4 08/01/80 5 12 15
14 4 28 -9 7
15 4 3 4 5
16 4 3.1 3.69 3.27 M 17
dob=7518 id=4 dbs1=5 mbs1=12 dbs2=4 mbs2=28 dbs3=4 mbs3=3 bls1=4 bls2=3.1 bls3=3.69 sex=3.27 i=4
mdy1=11296 mdy2=. mdy3=11020 maxbls=4 maxdte=11296 agemax=3778 ageest=10 agecat=3 _ERROR_=1 _N_=4
NOTE: Invalid argument to function MDY(22,6,1990) at line 22 column 12.
21 6 06/20/81 7 10 15
22 6 11 3 6
23 6 22 1 6
24 6 1.24 1.16 0.71 F 18
dob=7841 id=6 dbs1=7 mbs1=10 dbs2=6 mbs2=11 dbs3=6 mbs3=22 bls1=6 bls2=1.24 bls3=1.16 sex=0.71 i=4
mdy1=11237 mdy2=11267 mdy3=. maxbls=6 maxdte=11237 agemax=3396 ageest=10 agecat=3 _ERROR_=1 _N_=6
NOTE: Invalid argument to function MDY(29,7,1990) at line 22 column 12.
25 7 06/22/81 19 6 15
26 7 3 12 6
27 7 29 8 6
28 7 3.1 3.21 3.58 F 17
dob=7843 id=7 dbs1=19 mbs1=6 dbs2=7 mbs2=3 dbs3=7 mbs3=29 bls1=7 bls2=3.1 bls3=3.21 sex=3.58 i=4
mdy1=11127 mdy2=11023 mdy3=. maxbls=7 maxdte=11127 agemax=3284 ageest=8 agecat=3 _ERROR_=1 _N_=7
NOTE: Invalid argument to function MDY(31,8,1990) at line 22 column 12.
29 8 05/24/82 26 7 15
30 8 31 1 6
31 8 9 10 6
32 8 2.99 2.37 2.4 M 17
dob=8179 id=8 dbs1=26 mbs1=7 dbs2=8 mbs2=31 dbs3=8 mbs3=9 bls1=8 bls2=2.99 bls3=2.37 sex=2.4 i=4
mdy1=11164 mdy2=. mdy3=11208 maxbls=8 maxdte=11164 agemax=2985 ageest=8 agecat=3 _ERROR_=1 _N_=8
NOTE: Invalid argument to function MDY(25,9,1990) at line 22 column 12.
NOTE: Invalid argument to function MDY(28,9,1990) at line 22 column 12.
33 9 10/11/82 2 7 14
34 9 25 5 6
35 9 28 3 6
36 9 2.4 1.96 2.71 F 17
dob=8319 id=9 dbs1=2 mbs1=7 dbs2=9 mbs2=25 dbs3=9 mbs3=28 bls1=9 bls2=2.4 bls3=1.96 sex=2.71 i=4
mdy1=11140 mdy2=. mdy3=. maxbls=9 maxdte=11140 agemax=2821 ageest=8 agecat=3 _ERROR_=1 _N_=9
NOTE: Invalid argument to function MDY(30,10,1990) at line 22 column 12.
NOTE: Invalid argument to function MDY(28,10,1990) at line 22 column 12.
37 10 . 10 8 9
38 10 30 12 8
39 10 28 2 7
40 10 2.72 2.87 1.97 F 19
dob=. id=10 dbs1=10 mbs1=8 dbs2=10 mbs2=30 dbs3=10 mbs3=28 bls1=10 bls2=2.72 bls3=2.87 sex=1.97 i=4
mdy1=11179 mdy2=. mdy3=. maxbls=10 maxdte=11179 agemax=. ageest=. agecat=. _ERROR_=1 _N_=10
NOTE: Invalid argument to function MDY(15,11,1990) at line 22 column 12.
41 11 11/16/83 19 4 16
42 11 15 11 8
43 11 7 -9 7
44 11 4.8 4.5 4.96 M 17
dob=8720 id=11 dbs1=19 mbs1=4 dbs2=11 mbs2=15 dbs3=11 mbs3=7 bls1=11 bls2=4.8 bls3=4.5 sex=4.96 i=4
mdy1=11066 mdy2=. mdy3=11149 maxbls=11 maxdte=11066 agemax=2346 ageest=6 agecat=3 _ERROR_=1 _N_=11
NOTE: Invalid argument to function MDY(17,12,1990) at line 22 column 12.
45 12 03/02/84 17 6 16
46 12 11 2 7
47 12 17 11 8
48 12 2.38 2.6 2.88 F 18
dob=8827 id=12 dbs1=17 mbs1=6 dbs2=12 mbs2=11 dbs3=12 mbs3=17 bls1=12 bls2=2.38 bls3=2.6 sex=2.88
i=4 mdy1=11125 mdy2=11273 mdy3=. maxbls=12 maxdte=11125 agemax=2298 ageest=6 agecat=3 _ERROR_=1
_N_=12
NOTE: Invalid argument to function MDY(17,14,1990) at line 22 column 12.
NOTE: Invalid argument to function MDY(21,14,1990) at line 22 column 12.
53 14 02/07/85 4 5 15
54 14 17 5 7
55 14 21 11 8
56 14 1.61 1.93 2.32 F 19
dob=9169 id=14 dbs1=4 mbs1=5 dbs2=14 mbs2=17 dbs3=14 mbs3=21 bls1=14 bls2=1.61 bls3=1.93 sex=2.32
i=4 mdy1=11081 mdy2=. mdy3=. maxbls=14 maxdte=11081 agemax=1912 ageest=6 agecat=3 _ERROR_=1 _N_=14
NOTE: Invalid argument to function MDY(16,15,1990) at line 22 column 12.
NOTE: Invalid argument to function MDY(14,15,1990) at line 22 column 12.
57 15 07/06/85 5 2 15
58 15 16 1 7
59 15 14 6 7
60 15 3.93 4 4.08 M 16
dob=9318 id=15 dbs1=5 mbs1=2 dbs2=15 mbs2=16 dbs3=15 mbs3=14 bls1=15 bls2=3.93 bls3=4 sex=4.08 i=4
mdy1=10993 mdy2=. mdy3=. maxbls=15 maxdte=10993 agemax=1675 ageest=4 agecat=3 _ERROR_=1 _N_=15
NOTE: Invalid argument to function MDY(23,16,1990) at line 22 column 12.
61 16 09/10/85 12 10 17
62 16 11 -9 8
63 16 23 6 7
64 16 3.29 2.88 2.97 M 19
dob=9384 id=16 dbs1=12 mbs1=10 dbs2=16 mbs2=11 dbs3=16 mbs3=23 bls1=16 bls2=3.29 bls3=2.88 sex=2.97
i=4 mdy1=11242 mdy2=11277 mdy3=. maxbls=16 maxdte=11242 agemax=1858 ageest=6 agecat=3 _ERROR_=1
_N_=16
NOTE: Invalid argument to function MDY(18,17,1990) at line 22 column 12.
65 17 11/05/85 12 7 16
66 17 18 1 7
67 17 11 11 8
68 17 1.31 0.98 1.04 F 19
dob=9440 id=17 dbs1=12 mbs1=7 dbs2=17 mbs2=18 dbs3=17 mbs3=11 bls1=17 bls2=1.31 bls3=0.98 sex=1.04
i=4 mdy1=11150 mdy2=. mdy3=11278 maxbls=17 maxdte=11150 agemax=1710 ageest=4 agecat=3 _ERROR_=1
_N_=17
NOTE: Invalid argument to function MDY(18,18,1990) at line 22 column 12.
69 18 12/07/85 16 2 16
70 18 18 4 7
71 18 -9 6 7
72 18 2.56 2.78 2.88 M 19
dob=9472 id=18 dbs1=16 mbs1=2 dbs2=18 mbs2=18 dbs3=18 mbs3=6 bls1=18 bls2=2.56 bls3=2.78 sex=2.88
i=4 mdy1=11004 mdy2=. mdy3=11126 maxbls=18 maxdte=11004 agemax=1532 ageest=4 agecat=3 _ERROR_=1
_N_=18
NOTE: Invalid argument to function MDY(19,19,1990) at line 22 column 12.
73 19 03/02/86 19 4 16
74 19 11 3 7
75 19 19 2 7
76 19 0.79 0.68 0.72 M 19
dob=9557 id=19 dbs1=19 mbs1=4 dbs2=19 mbs2=11 dbs3=19 mbs3=19 bls1=19 bls2=0.79 bls3=0.68 sex=0.72
i=4 mdy1=11066 mdy2=11280 mdy3=. maxbls=19 maxdte=11066 agemax=1509 ageest=4 agecat=3 _ERROR_=1
_N_=19
NOTE: Invalid argument to function MDY(15,20,1990) at line 22 column 12.
77 20 08/19/86 21 5 16
78 20 15 12 8
79 20 -9 4 7
80 20 0.66 1.15 1.42 F 19
dob=9727 id=20 dbs1=21 mbs1=5 dbs2=20 mbs2=15 dbs3=20 mbs3=6 bls1=20 bls2=0.66 bls3=1.15 sex=1.42
i=4 mdy1=11098 mdy2=. mdy3=11128 maxbls=20 maxdte=11098 agemax=1371 ageest=4 agecat=3 _ERROR_=1
_N_=20
NOTE: Invalid argument to function MDY(17,21,1990) at line 22 column 12.
NOTE: Invalid argument to function MDY(13,21,1990) at line 22 column 12.
81 21 02/22/87 16 12 17
82 21 17 9 7
83 21 13 4 7
84 21 2.92 3.27 3.23 M 19
dob=9914 id=21 dbs1=16 mbs1=12 dbs2=21 mbs2=17 dbs3=21 mbs3=13 bls1=21 bls2=2.92 bls3=3.27 sex=3.23
i=4 mdy1=11307 mdy2=. mdy3=. maxbls=21 maxdte=11307 agemax=1393 ageest=4 agecat=3 _ERROR_=1 _N_=21
NOTE: Invalid argument to function MDY(21,23,1990) at line 22 column 12.
NOTE: Invalid argument to function MDY(17,23,1990) at line 22 column 12.
WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed.
89 23 05/12/88 12 2 16
90 23 21 4 7
91 23 17 12 8
92 23 0.55 0.89 1.38 M 19
dob=10359 id=23 dbs1=12 mbs1=2 dbs2=23 mbs2=21 dbs3=23 mbs3=17 bls1=23 bls2=0.55 bls3=0.89 sex=1.38
i=4 mdy1=11000 mdy2=. mdy3=. maxbls=23 maxdte=11000 agemax=641 ageest=2 agecat=3 _ERROR_=1 _N_=23
NOTE: 100 records were read from the infile HW3NEW.
The minimum record length was 5.
The maximum record length was 19.
NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line)Smiley SadColumn).
2 at 32:14 2 at 33:8 2 at 33:20
NOTE: Mathematical operations could not be performed at the following places. The results of the
operations have been set to missing values.
Each place is given by: (Number of times) at (Line)Smiley SadColumn).
29 at 22:12
NOTE: The data set WORK.ONE has 25 observations and 21 variables.
NOTE: DATA statement used (Total process time):
real time 0.13 seconds
cpu time 0.12 seconds

 

 

The TL;DR for the errors is 1. NOTE: Invalid argument to function MDY(17,23,1990) at line 22 column 12. 2. WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed. 3. NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values. Each place is given by: (Number of times) at (Line)Smiley SadColumn). 29 at 22:12

Errors such as these run throughout. What did I do wrong? Thanks for all of your help!

SAS Super FREQ
Posts: 9,368

Re: Helping with SAS Programming (Reading in Data with Dates and Arrays and Looping)

[ Edited ]

HI, the MDY function expects the MONTH to be the first argument. When you have MDY(17,23,1990), you are specifying month=17, day=23 and year = 1990 -- this does not seem correct to me, since there is no month 17 in the Gregorian calendar. As another example, you have this: MDY(30,10,1990) , which is telling the function that month=30, day=10 and year = 1990. Clearly, there's nothing wrong with a day value of 10, but a month value of 30, again, is problematic.

My guess is that you have a problem with understanding how the MDY function works and might also have a problem with how you're reading in some of your data. Until you read in the data correctly, it's going to be hard to make the rest of the logic work. Also, it looks like ID is on every row of INPUT data. When you're reading the data, I would expect you to need to have ID on every INPUT statement too, otherwise, you're reading the wrong values from every line.

 

Finally, when you calculate AGEMAX and AGEEST (after you correct the problem with reading the data), then I would expect that this section will give you issues:

*Part A5: Determining Age of Child (yrs) When Largest Blood Lead Value Was Obtained;
agemax=maxdte-dob; 
ageest=round(agemax/365.25,2);
 *Part A6: Creating New Variable:Agecat;
if agemax=. then agecat=.;
    else if agemax < 4 then agecat=1; 
    else if 4 <= agemax < 8 then agecat=2;
    else if agemax ge 8 then agecat=3; 

because AGEMAX will be the number of DAYS old that they are on the max blood value, not the number of years. So I would expect that everybody will be more than 8 and fall into AGECAT=3. I would expect you want to test the YEARS old that they are, which would be AGEEST, not AGEMAX.

 

Also, if DOB is missing, do you want that observation in the final dataset or not?

Cynthia

PROC Star
Posts: 8,164

Re: Helping with SAS Programming (Reading in Data with Dates and Arrays and Looping)

You're biggest problem is that you didn't account for the fact that EACH of the four records (for each subject) begins with the subject's ID number.

 

I haven't looked through all of your code, and it would run the way you had it, but (1) you assigned a libname, but never used it; (2) you used multiple loops where one would have sufficed; and (3) your code doesn't use indentation (thus making it harder to follow).

 

Take a look at, and run, the following code:

 

filename HW3new 'C:\Users\jackz\Desktop\SAS\lead_sum2018.txt';
data one (drop=i dummy); 
  infile HW3new;
  informat dob mmddyy8.; 
  input #1 id dob dbs1 mbs1 
        #2 dummy dbs2 mbs2
        #3 dummy dbs3 mbs3
        #4 dummy bls1 bls2 bls3 sex $;
  array dbs{3} dbs1 dbs2 dbs3;
  array mbs{3} mbs1 mbs2 mbs3;
  array dte{3} mdy1 mdy2 mdy3;
  do i=1 to 3; 
    *Part A1: Using Arrays and Looping to Recode Missing Values for Day and Month; 
    if dbs{i}=-9 then dbs{i}=15;
    if mbs{i}=-9 then mbs{i}=6;
    
    *Part A2: Using Arrays and Do Loops to Create the SAS Date Variable;
    dte{i}=mdy(mbs{i}, dbs{i}, 1990);
  end; 
  *Part A3: Using SAS Function to Create a Variable for the Maximum Blood Lead Value for Each Child; 
  maxbls=max(of bls1-bls3);
  *Part A4: Using Arrays and Looping to Identify Date On Which Maximum Blood Lead Value Obtained; 
  array bls{3} bls1 bls2 bls3; 
  do i=1 to 3;
    if bls{i}=maxbls then maxdte=dte{i};
  end;
  *Part A5: Determining Age of Child (yrs) When Largest Blood Lead Value Was Obtained;
  agemax=maxdte-dob; 
  ageest=round(agemax/365.25,2);
  *Part A6: Creating New Variable:Agecat;
  if agemax=. then agecat=.;
  else if agemax < 4 then agecat=1; 
  else if 4 <= agemax < 8 then agecat=2;
  else if agemax ge 8 then agecat=3; 
run;

Art, CEO, AnalystFinder.com

 

Super User
Posts: 13,528

Re: Helping with SAS Programming (Reading in Data with Dates and Arrays and Looping)

SAS has functions that let you search for elements of an array by value such as WHICHN and WHICHC for numeric and character values. So there is no reason to have a lot of loop code to find a maximum value in one array and then the associated value in a second such as implied with your

 

 *Part A4: Using Arrays and Looping to Identify Date On Which Maximum Blood Lead Value Obtained; 

A brief example:

data example;
   informat  mdy1-mdy3 date9.;
   input mdy1-mdy3  bls1-bls3 ;
   format mdy1-mdy3 datemax date9.;
   array dte mdy1-mdy3;
   array bls bls1-bls3;
   DateMax = dte[ whichn(max(of bls(*)), of bls(*)) ];
datalines;
01Jan2018  03Feb2018 06Mar2018  1.2 3.4 2.1
;
run;

MAX(of bls(*)) returns the largest value in the array bls. So Whichn(max(of bls(*)) , of bls(*) returns the index of the largest value in the array bls. So if you have the arrays aligned and populated correctly you use that index into the date array to get the date associated.

 

Ask a Question
Discussion stats
  • 3 replies
  • 154 views
  • 2 likes
  • 4 in conversation