pransh
Calcite | Level 5

All blood samples were drawn in 1990. However, during data entry the order of blood samples was scrambled, so blood sample A may not correspond to the first blood sample taken on a subject; it could be the first, second, or third. The same ordering concern may apply to blood samples B and C as well. In addition, some of the months and days for the blood sampling were not written on the forms. At data entry, missing months were coded as -1 or 13 and missing days as -1 or 32. Be sure to write your code to account for either possibility.
The team of investigators for this project has made the following decisions regarding the missing values. Any missing days should be set to 15 and any missing months set to 6. Any analyses that follow are to be done on this data set. Be sure to implement the SAS syntax as indicated for each question. For example, use SAS arrays and loops if the item states that these must be used.
A) Using this saved SAS data set, create a new, temporary SAS data set and perform the following:
1) use SAS arrays and looping to create a SAS date variable for each of the three blood samples and to address the missing data in accordance with the decisions of the investigators. Use arrays and a loop to recode the missing values for day and month;
2) use a SAS function to create a new variable for the highest, i.e., maximum, blood lead value for each child;
3) use SAS arrays and looping to identify the date on which this highest value was obtained and create a new variable for the date of the highest blood lead value;

Code I have used

data temp2_lead_f2022;
set temp_lead_f2022;
array x {3} daybld_a daybld_b daybld_c;
array y {3} mthbld_a mthbld_b mthbld_c;
array dates {3} date1_a date2_b date3_c;
array maxleaddt {3} pblev_a pblev_b pblev_c;
do i = 1 to 3;
dates{i} = mdy( y{i},x{i},1990);
end;
do k= 1 to 3;
if maxleaddt{k} = maxlead then dates{k}= max_date;
end;
drop i k;
format date1_a date2_b date3_c dob mmddyy8. ;
maxlead= max(of pblev_a pblev_b pblev_c);
run;

I am trying to solve part 3 but cannot get the syntax right.

Reeza
Super User
  • Order of operations. The maxlead is calculated AFTER you try and identify the date in the loop.
  • When finding the maximum, the array notation is not used, all variables are listed
  • #1 in the assignment is not answered in this piece of shown code, dealing with missing values.
  • If there are ties for the maximum value, the date of the last tied sample is selected.
  • I would recommend using descriptive array names; names like x/y make the code harder to follow.
  • In the loop to identify the maximum date, the array is changed when the variable max_date should be modified.
  • Add comments to the code to state what is happening in each section.
data temp2_lead_f2022;
set temp_lead_f2022;
array x {3} daybld_a daybld_b daybld_c;
array y {3} mthbld_a mthbld_b mthbld_c;
array dates {3} date1_a date2_b date3_c;
array maxleaddt {3} pblev_a pblev_b pblev_c;
do i = 1 to 3;
dates{i} = mdy( y{i},x{i},1990);
end;

maxlead= max(of pblev_a pblev_b pblev_c);


do k= 1 to 3;
if maxleaddt{k} = maxlead then dates{k}= max_date;
end;

drop i k;
format date1_a date2_b date3_c dob mmddyy8. ;

run;

I've highlighted problematic parts of the code and you can fix it from here. 

pransh
Calcite | Level 5

data temp_lead_f2022;
set in.lead_f2022;
run;
data temp_lead_f2022;
set temp_lead_f2022;
array a {3} daybld_a daybld_b daybld_c;
array b {3} mthbld_a mthbld_b mthbld_c;
do i=1 to 3;
if a{i} = -1 then a{i} = 15 ;
else if a{i} = 13 then a{i} = 15;
end;
do i=1 to 3;
if b{i} = -1 then b{i} = 6;
else if b{i} = 32 then b{i} = 6;
end;
run;

data temp2_lead_f2022;
set temp_lead_f2022;
array x {3} daybld_a daybld_b daybld_c;
array y {3} mthbld_a mthbld_b mthbld_c;
array dates {3} date1_a date2_b date3_c;
array maxleaddt {3} pblev_a pblev_b pblev_c;

do i = 1 to 3;
dates{i} = mdy( y{i},x{i},1990);
end;

maxlead= max(of pblev_a pblev_b pblev_c);
do k= 1 to 3;
if maxleaddt{k} = maxlead then dates{k}= max_date;
end;
drop i k;
format date1_a date2_b date3_c dob mmddyy8. ;

run;

ERROR Message

NOTE: Variable max_date is uninitialized.

Reeza
Super User
I didn't fix the code, I only highlighted the issues. I did mention what the issue was with max_date but I assume you can fix it.
pransh
Calcite | Level 5
Can you explain it again, please? I'm not getting it.
Reeza
Super User

I've bolded the statement in my initial response. 

Commenting your code will help you find the issues.

pransh
Calcite | Level 5

I know something is wrong in the way I try to create the max_date variable, but I'm not able to figure it out.

Reeza
Super User
dates{k}= max_date;

What does that statement do? 

pransh
Calcite | Level 5

It should give the date on which the highest blood lead level was collected, but I don't know how to define it properly.

Reeza
Super User

It's backwards. When you find the maximum, you're modifying the array variable, not max_date, so max_date is never created.
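To make the direction of the assignment concrete, here is a minimal sketch of the second DATA step with the two sides swapped. It reuses the data set and variable names from the posts above; the array names day_bld, mth_bld, and lead_lev are illustrative choices, not anything from the assignment, and it assumes the missing days and months have already been recoded in temp_lead_f2022.

```sas
data temp2_lead_f2022;
    set temp_lead_f2022;

    /* Descriptive array names (illustrative, not from the assignment) */
    array day_bld  {3} daybld_a daybld_b daybld_c;
    array mth_bld  {3} mthbld_a mthbld_b mthbld_c;
    array dates    {3} date1_a date2_b date3_c;
    array lead_lev {3} pblev_a pblev_b pblev_c;

    /* Build a SAS date for each sample; all samples were drawn in 1990 */
    do i = 1 to 3;
        dates{i} = mdy(mth_bld{i}, day_bld{i}, 1990);
    end;

    /* Highest blood lead value; must be computed BEFORE the lookup loop */
    maxlead = max(of lead_lev{*});

    /* max_date goes on the LEFT: it receives the date of the matching sample */
    do i = 1 to 3;
        if lead_lev{i} = maxlead then max_date = dates{i};
    end;

    drop i;
    format date1_a date2_b date3_c max_date mmddyy10.;
run;
```

As noted in the first reply, when two samples tie for the maximum this loop keeps the date of the last one. If the earliest tied sample is wanted instead, one option is to add a `leave;` statement after the assignment so the loop stops at the first match.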

pransh
Calcite | Level 5

Oh, I understand now, thanks.
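One further point may be worth checking in the recoding step posted earlier in the thread. The assignment codes missing months as -1 or 13 and missing days as -1 or 32, whereas the posted loops test the day array for 13 and the month array for 32. A hedged sketch of that step with the codes matched to the assignment, using the IN operator and a single loop, might look like this (array names are again illustrative, and the libref `in` is taken from the post above):

```sas
data temp_lead_f2022;
    set in.lead_f2022;

    array day_bld {3} daybld_a daybld_b daybld_c;
    array mth_bld {3} mthbld_a mthbld_b mthbld_c;

    do i = 1 to 3;
        /* Missing day is coded -1 or 32; investigators chose 15 */
        if day_bld{i} in (-1, 32) then day_bld{i} = 15;
        /* Missing month is coded -1 or 13; investigators chose 6 */
        if mth_bld{i} in (-1, 13) then mth_bld{i} = 6;
    end;

    drop i;
run;
```

Without this change, a day of 32 or a month of 13 would reach MDY unchanged, and MDY would return a missing date for that sample.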
