I work with my state's public school data a little bit. As far as the School and grade issues I would really hope that there is an ENROLLMENT data set somewhere, even last year's would help as a start. That should let you pull School and grade information with students actually enrolled.
Gender is no probem.
Age is likely grade dependent but I would hesitate to recommend a limits an ages. I would expect that there might be a guideline about that. I also wonder about age as component of this report as children's ages change during the year.
The date time value they send you can be easily reduced to just the date with the DATEPART function. The reason I consider most likely for the datetime is lazy database admin or programmer. If they aren't doing something like absent between 10am and 1Pm then the time component is useless. You don't need it, get rid of it.
If you can get separate datasets of some of these elements, such as your date you can combine them relatively easily. Here's an example:
data school;
informat sname $10.;
input sname grade;
datalines;
Washington 1
Washington 2
Washington 3
Jefferson 3
Jefferson 4
Jefferson 5
Jefferson 6
Lincoln 5
Lincoln 6
Lincoln 7
Lincoln 8
Adams 9
Adams 10
Adams 11
Adams 12
;
data sex;
input sex $;
datalines;
Male
Female
;
run;
proc sql;
create table base as
select sname,sex,grade
from school, sex
order by sname,grade,sex;
quit;
You get a message from the Proc Sql along these lines:
NOTE: The execution of this query involves performing one or more Cartesian product joins that
can not be optimized.
That is basically what you are requesting combine everything and there really isn't a way to optimize that for this example.
... View more