turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- Array Processing (?)

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

04-13-2015 12:40 PM

Frankly, this problem is beyond my skill set, so I'm hoping for your help (and patience) as I struggle through this.

My PROC SQL step that joins these 2 tables:

proc sql;

create table join1 as select

a.*,t.absence_date, t.abs_count, t.abs_school, t.employee_name, t.employee_ssn from allsets as a

left join tchabs2 as t

on a.SSN=t.employee_SSN and a.schoolid=t.abs_school

order by SSN;

quit;

My objective: compare a teacher's absences (Absence_Date in TCHABS2) to a given interval (dateenrolled, dateleft in ALLSETS) and if the absence occurs within the interval, then count the absence. Otherwise, ignore it. However, when I join the tables I quickly see how the number of obs proliferates. If a teacher has 7 absences and her class roster has 50 students, then one teacher generates 350 obs.

Question: before I begin matching the two tables, should I consider using an array so that there is one record per teacher per school? I know enough about arrays to be dangerous.

I'm picturing the TCHABS2 table being transformed into something like so:

school_Id | Employee_SSN | Employee_Name | Job_Description | Absence_Date1 | Absence_Type1 | Absence_Date2 | Absence_Type2 | etc.. | Absence_Date7 | Absence_Type7 |

30 | 777 | Joni | Teacher Math | 7-Oct-14 | Full Day | 8-Oct-14 | Full Day | 9-Apr-15 | Half Day AM | |

26 | 777 | Joni | Teacher Math | 7-Oct-14 | Full Day | 8-Oct-14 | Full Day | 9-Apr-15 | Half Day AM |

TCHABS2

abs_school | Employee_SSN | Employee_Name | Job_Description | Absence_Date | Absence_Type |

30 | 777 | Joni | Teacher Math | 7-Oct-14 | Full Day |

30 | 777 | Joni | Teacher Math | 8-Oct-14 | Full Day |

30 | 777 | Joni | Teacher Math | 30-Oct-14 | Full Day |

30 | 777 | Joni | Teacher Math | 10-Feb-15 | Full Day |

30 | 777 | Joni | Teacher Math | 11-Feb-15 | Full Day |

30 | 777 | Joni | Teacher Math | 4-Mar-15 | Full Day |

30 | 777 | Joni | Teacher Math | 9-Apr-15 | Half Day AM |

26 | 777 | Joni | Teacher Math | 7-Oct-14 | Full Day |

26 | 777 | Joni | Teacher Math | 8-Oct-14 | Full Day |

26 | 777 | Joni | Teacher Math | 30-Oct-14 | Full Day |

26 | 777 | Joni | Teacher Math | 10-Feb-15 | Full Day |

26 | 777 | Joni | Teacher Math | 11-Feb-15 | Full Day |

26 | 777 | Joni | Teacher Math | 4-Mar-15 | Full Day |

26 | 777 | Joni | Teacher Math | 9-Apr-15 | Half Day AM |

ALLSETS

STUDENT | Dateenrolled | DateLeft | Schoolid | Course_Name | ssn |

able | 8/18/2014 | 10/14/2014 | 30 | Mathematics 7 | 777 |

baker | 8/18/2014 | 6/5/2015 | 30 | Mathematics 7 | 777 |

charlie | 8/18/2014 | 6/5/2015 | 30 | Mathematics 7 | 777 |

delta | 8/18/2014 | 6/5/2015 | 30 | Mathematics 7 | 777 |

echo | 8/18/2014 | 6/5/2015 | 30 | Mathematics 7 | 777 |

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

04-13-2015 01:46 PM

You probably do NOT want to transpose the data so that there are multiple date variables as any comparison logic gets very complicated very quickly.

If the problem is comparing the teacher with students and generating more records than desired you may want to summarize the STUDENT data down to a CLASS level before joining. If you need to know how many students are involved then you have what you need.

What is final report supposed to look like?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

04-13-2015 02:05 PM

The final report:

I will sort it by school, then teacher, then class.

STUDENT_NAME SCHOOLID DATEENROLLED DATELEFT DAYS_ENROLLED NUMBER_OF_TEACHER_ABSENCES NET_DAYS_ENROLLED (by subtracting teacher absences from days enrolled, only counting those that fell between dateenrolled and dateleft) PERCENT_OF_YEAR_ENROLLED (by dividing the net days enrolled by days enrolled)

Essentially, I'm wanting to show each teacher what percent of the school year a student was in her class. One of the business rules says that you can subtract the number of days that a teacher was absent from the number of days a student was enrolled to get the percent of the year enrolled. A school year = 180 days. Say that a student was enrolled for 100 days. Further, during those 100 days the teacher was absent 10 of those days. The net_days_enrolled is 100 - 10. The percent_of_year_enrolled would be (100 - 10)/180 = 50%. The teacher has to log into a portal and populate this percentage for each student she teaches in each of her classes. One data set I have contains all student info, including class information. The other data set contains teacher absence dates. The SSN and schoolid appear in both sets. This is what I'm matching on.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

04-13-2015 04:50 PM

Since you are reporting on student/class combinations then you don't want to try to reduce the number of results as you need each one.

If you haven't already, look into Proc Report as it will allow you to make these calculations and generate report tables at the same time. That assumes your dates are SAS date values, then number of days enrolled is date_left - dateenrolled. These should be doeable in Compute blocks.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

04-13-2015 07:17 PM

If I understand what you're saying, I'm really looking at student/teacher combinations. I will look into PROC REPORT. I have seen code involving a compute block but have never actually tried to use it.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

04-14-2015 03:18 AM

Could be a case for a hash iterator, where you walk through the teacher absences and output if absence_date is between date_enrolled and date_left ..

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

04-14-2015 09:10 AM

I'll have to research how to use a hash iterator. I've never used it before. Thanks.