Dear All, I have a question about SAS behaviour to understand how a query is executed. Is there anyone who could help me understand what SAS thinks/judges like, in this case? I added the input and output test data if it helps. I would like to understand what are the steps in SAS judgement in this case e.g. I think he first does the groupping on DISTINCT values on column1: a group for anna a group for jim a group for ben etc. For each group, it calculates the maximum value on Column2. e.g. 50 For each row in the group, e.g. anna 50 4g, if the value on Column2 = maximum, it keeps that row. e.g. anna 50 2g, if the value on Column2 = maximum, it keeps that row. e.g. anna 50 4g, if the value on Column2 = maximum, it keeps that row. e.g. anna 20 1g, if the value on Column2 not = maximum, it turns that row into: anna 50 BLANK. Through this logic, if I apply the DISTINCT at this stage (at the end), I do get the output that SAS gets. Is this correct judgment? Especially the logic I am guessing for the usage of DISTINCT. Is DISTINCT meant to be applied generally as a last step in queries that have group by, or can I not make a general statement about that? On the other hand, I see the DISTINCT is also applied in the beginning to have groups for each distinct value in Column1. Is distinct really being applied twice? Any generally statements that we can issue based on this case? Can I alter the code to have DISTINCT applied only once, in the end, for the sake of the exersize and for a better comprehension of SAS? Thank you very much in advance! Anna PROC SQL; CREATE TABLE WORK.TABLE1 AS SELECT DISTINCT t1.'Column1'n, /* MAX Value */ (MAX(t1.Column2'n)) AS 'MAX Value'n, /* PL */ (CASE WHEN t1.'Column2'n = MAX(t1.'Column2'n) THEN t1.'Column3'n ELSE "" END) AS PL FROM WORK.TEST AS t1 GROUP BY t1.'Column1'n; QUIT; Input: jim 1788 5F jim 2000 9F jim 500 9F ben 190 4H matt 400 46 matt 20 3G jim 2000 52 jim 2000 9F ben2 190 4H ben2 190 4H ben2 34 2g jess 190 4h jess 190 4h jess 120 4h jess 190 4h jess 13 35 jerr 11 3g jerr 11 3g jerr 11 3g ben 190 4H a 400 9f a 400 9f a 300 6g b 400 9f b 400 8f b 300 6g b 20 3n Output: a 400 BLANK a 400 9f b 400 BLANK b 400 8f b 400 9f ben 190 4H ben2 190 BLANK ben2 190 4H jerr 11 3g jess 190 BLANK jess 190 4h jim 2000 BLANK jim 2000 52 jim 2000 9F matt 400 BLANK matt 400 46 For me this topic is of particular interest because I am doing conversions to and from another database system environment and I am sometimes "appalled" at the differences ins yntax /logic. In another environment (can I name it here?) the same query triggers an errors prompting to put all the 3 columns in the group by clause. Thank you!
... View more