03-29-2012 04:58 AM
I'm a newbie to the SAS DI world and am working with Summary Statistics to return sums, means and medians for a given group of data. The summary stats themselves calculate fine, grouped by two separate variables, FYear and DIAG. I also need to include 2 additional columns against the DIAG code, for description and gender, so I have put these into the ID statement to return them as additional columns in the output table. The problem is that the summary stats table that is returned puts a description against the total row (_WAY = 0 and _TYPE = 0) that has nothing to do with that row.
For example, the _WAY = 0 and _TYPE = 0 total row should have no entry in the DIAG column or the Description column, since it contains the totals for the rest of the variables. But when I run the query in SAS DI, it returns a description in that top total row that has nothing to do with the total.
Interestingly, the value it returns is the alphabetically last description from the main list of DIAG descriptions.
Any advice would be helpful.
03-29-2012 09:35 AM
It sounds like the ID statement is working like it should. Here's what it does.
For every record in the summarized output, the ID variables are copied from a single observation that went into computing that summary record. Which observation's ID values get used? As you noted, there is an element of "maximum value" in making the selection. With a single ID variable, it is the maximum value. With 2 ID variables, it is the maximum value of the first variable, but not necessarily the maximum value of the second variable. Since the rule is to take both ID variables from a single observation, "maximum" means: take the maximum value of the first ID variable, then, among all the observations having that maximum value, take the maximum value of the second ID variable. The total row is computed from every observation, so it picks up the alphabetically last description in the whole data set, which is exactly what you are seeing.
The only choice you have is that you are allowed to switch to "minimum" instead of "maximum". But the basic rule stays in place: all ID variables come from a single observation that went into computing statistics for the summary row.
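To make the selection rule concrete, here is a rough Python imitation of it (not SAS code, and not how SAS implements it internally). It only follows the rule as described in this reply. The sample rows, the column names Description, Gender and Cost, and the helper pick_id_values are all made up for illustration, and Python's string ordering stands in for SAS's collating sequence.

```python
# Hypothetical observations; the values are invented for illustration only.
observations = [
    {"FYear": 2011, "DIAG": "A01", "Description": "Cholera", "Gender": "M", "Cost": 10.0},
    {"FYear": 2011, "DIAG": "A01", "Description": "Cholera", "Gender": "F", "Cost": 5.0},
    {"FYear": 2012, "DIAG": "B02", "Description": "Zoster", "Gender": "F", "Cost": 7.0},
]


def pick_id_values(rows, id_vars, use_min=False):
    """Return the ID values of ONE row, chosen as described in the reply above.

    The rows are compared on the ID variables in order: the first ID variable
    decides, and later ID variables only break ties. All returned values come
    from the same row.
    """
    chooser = min if use_min else max
    chosen = chooser(rows, key=lambda row: tuple(row[v] for v in id_vars))
    return {v: chosen[v] for v in id_vars}


id_vars = ["Description", "Gender"]

# The total row is built from every observation, so it inherits the ID values
# of the row with the largest Description, even though it is a grand total.
total_ids_max = pick_id_values(observations, id_vars)
total_ids_min = pick_id_values(observations, id_vars, use_min=True)

print(total_ids_max)  # {'Description': 'Zoster', 'Gender': 'F'}
print(total_ids_min)  # {'Description': 'Cholera', 'Gender': 'F'}
```

Note that with "maximum" the total row gets Gender F, even though M is the larger Gender value overall, because both ID values must come from the single observation that has the largest Description.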