11-20-2014 07:52 PM
I think this will be an easy question to solve. I'm using SAS EG 6.1.
I have IDs for students and I have all the classes these students attend, what I want is how many students attend each class. I need the data to join with another table that has the same information just a few days older, so I can't just use count and get the answer through summary tables (plus I have other numerical data that I also need the same trick for). I feel like I am over-looking a simple solution, so please feel free to say something you think should be obvious. My boss uses regular SAS while I've trained on EG, sometimes we have a hard time communicating.
What I want is a table something like this:
|Class||New Number of Students in class||Old Number of Students in class||New data||Old data||New data||Old data|
Thanks for your help!
11-20-2014 08:27 PM
You need to use the Query tool, but you haven't provided enough information about your data to provide help beyond that.
You'll join the two tables based on some criteria, again can't say without the data.
11-21-2014 12:19 PM
Okay, so there are a few hundred classes and thousands of students. We compare yesterdays data, to data from the day before (eg 20Nov2014 to 19Nov2014) to see how the classes are doing, what has changed, etc. So I need to count how many students are in each class along with a few other college stats like FTE. I'm rewriting an old program, taking it on as my own, and recreating it through enterprise guide. Looking at the SAS program doesn't help me very much because this old program uses functions I don't know how to use. I'm new to SAS programing and am better at EG. Here is a sample of what I am trying to recreate through EG:
PROC MEANS NOPRINT SUM N MEAN MAXDEC=2;
VAR CREDIT FTES; BY course;
OUTPUT OUT= YEAR2SUM
SUM = OLDCRHR OLDFTE
N = OLDCRED OLDSEAT
MEAN = OLD_CREDIT OLDMSEAT; RUN;
I know I'll need to use the query builder and use a full join to combine the needed data fully. I just need to know how to sum up categories, all the students in a particular class - on down the list of classes, along with all the FTE (full-time equivalent - basically a number that measures student credit hours) for that Class.
The data I'm using has student IDs, Classes, departments, divisions, FTE, credits, academic period. I cannot show the actual data because it is confidential.
I hope this is more clear. Thank you
11-21-2014 02:09 PM
That's why you generate sample data that's fake but illustrates your issues.
Anything in Proc Means should be reproducible via a Summary Task, you'll need to determine where to place each variable though. For example credits/ftes goes under analysis, course under group by, under stats requested you can select the stats you need.
11-24-2014 04:25 PM
Thank you Reeza for your help in this. I think I've figured out what I was trying to ask. My problem was a mixture of trying to do too much at once and forgetting how powerful the 'group by' section is in the query builder. Without properly using the group by function I was getting crazy numbers. Next time I have a question I'll be sure to create some sample data.
12-05-2014 11:15 AM
Also, don't forget the ability of PROC MEANS (Summary Statistics task) to group output by CLASS variables. This can be more efficient than BY groups (doesn't require sorted data) and it's a little more intuitive than accomplishing the same thing in the Query Builder.
Here's what it looks like with sample CLASS data in the task, using "Sex" as a classification:
And here's the output:
There are several conference papers about this approach -- here's a link to one of them.