Hi There,
I'm new to SAS, using SAS Enterprise Guide 7.12 and running this code. I can't seem to get a distinct list based on the dttm_create. I tried using group by and having, sub-query and all other different variations but nothing seem to work. The SQL runs fine (no sql error) but just not returning the correct results.
Can someone please help? I've been at this for 2 days now Thank you so much in advance!
PROC SQL;
CREATE TABLE WORK.TABLE2 AS
SELECT DISTINCT ID1, ID2, CYCLE_DATE, AMT, STATUS,
MAX(DTTM_CREATE) FORMAT=DATETIME25.6 AS MAX_of_DTTM_CREATE
FROM WORK.TABLE1
WHERE ID1 = '123456789'
GROUP BY ID1,
ID2, CYCLE_DATE,
AMT, STATUS, DTTM_CREATE;
QUIT;
This is the result I'm getting but the SQL should only give me the first row.
ID1 | ID2 | CYCLE_DATE | AMOUNT | STATUS | MAX_of_DTTM_CREATE |
123456789 | 987654321 | 31-Dec-19 | 1.67 | Active | 03JAN2020:03:04:35.353335 |
123456789 | 987654321 | 31-Dec-19 | 2.01 | Active | 06DEC2019:02:48:35.810525 |
@Mary001 wrote:
@PaigeMiller I need the latest and greatest dttm_create per record per cycle date.
Try this:
PROC SQL;
CREATE TABLE WORK.TABLE2 AS
SELECT ID1, ID2, CYCLE_DATE, AMT, STATUS,
DTTM_CREATE FORMAT=DATETIME25.6
FROM WORK.TABLE1
WHERE ID1 = '123456789'
GROUP BY ID1,ID2, CYCLE_DATE
having dttm_create=max(dttm_create);
QUIT;
Why are you including all of those extra variables in the GROUP BY? Which variables do you actually want to use to define the group that you want to use to find max of DTTM_CREATE in the group ?
Here is simpler query you might try to see the difference that changing the GROUP BY makes.
This is like your query:
select sex,max(age) as max_age
from sashelp.class
group by sex, age
;
This is more appropriate:
select sex,max(age) as max_age
from sashelp.class
group by sex
;
Distinct works only on the entire collection of variables in the CREATE TABLE. It doesn't work if you want one or two variables to be distinct and not the other.
What is the criteria that lets you determine that you want the first record here and not the other records?
HI @Mary001 First off
1. Are you sure DTTM_CREATE should be part of GROUP BY Clause
2. Are you sure AMT should be part of GROUP BY clause
If the above questions make sense, it appears the SQL you are prolly after is
PROC SQL;
CREATE TABLE WORK.TABLE2 AS
SELECT DISTINCT ID1, ID2, CYCLE_DATE, AMT, STATUS,
MAX(DTTM_CREATE) FORMAT=DATETIME25.6 AS MAX_of_DTTM_CREATE
FROM WORK.TABLE1
WHERE ID1 = '123456789'
GROUP BY ID1,ID2, CYCLE_DATE, STATUS
having MAX_of_DTTM_CREATE=DTTM_CREATE;
QUIT;
/*Or with the corrected GROUP BY clauses knowing there isn't any column that is specified in the SELECT Clause not part of the Group by that may cause potential remerging*/
PROC SQL;
CREATE TABLE WORK.TABLE2 AS
SELECT DISTINCT ID1, ID2, CYCLE_DATE, AMT, STATUS,
MAX(DTTM_CREATE) FORMAT=DATETIME25.6 AS MAX_of_DTTM_CREATE
FROM WORK.TABLE1
WHERE ID1 = '123456789'
GROUP BY ID1,ID2, CYCLE_DATE, STATUS;
QUIT;
The above assumes there are no ties in MAX of DTTM_CREATE and secondly status is either ACTIVE or NOT forming a group. Please let us know. Thank you
Hello Again and thank you so much for the quick responses!
@Tom I updated the query twice and re-run (1st with Group by ID1 and 2nd without Group by) It ran successfully but still not the correct results ... also the log says, "NOTE: The query requires remerging summary statistics back with the original data." What does this mean, please?
Here is the result but again it should only return with the 1st row:
ID1 | ID2 | CYCLE_DATE | AMOUNT | STATUS | MAX_of_DTTM_CREATE |
123456789 | 987654321 | 31-Dec-19 | 1.67 | Active | 03JAN2020:03:04:35.353335 |
123456789 | 987654321 | 31-Dec-19 | 2.01 | Active | 03JAN2020:03:04:35.353335 |
@PaigeMiller I need the latest and greatest dttm_create per record per cycle date.
@novinosrin My apologies on the group by clause, I'm not really sure what I'm doing 😞
@Mary001 wrote:
@PaigeMiller I need the latest and greatest dttm_create per record per cycle date.
Try this:
PROC SQL;
CREATE TABLE WORK.TABLE2 AS
SELECT ID1, ID2, CYCLE_DATE, AMT, STATUS,
DTTM_CREATE FORMAT=DATETIME25.6
FROM WORK.TABLE1
WHERE ID1 = '123456789'
GROUP BY ID1,ID2, CYCLE_DATE
having dttm_create=max(dttm_create);
QUIT;
THANK YOU so much everyone!!! You guys are amazing!
I re-ran with the updated sql from Paige and received the correct results 🙂
Thanks again!
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.