Re: selecting maximum value for multiple variables on a table

michaela · Posted 12-18-2007 09:34 AM

I have a table I want to select the maximum id for mulitple records and am having problems figuring out the code. For example what I would like from the list below is

123 b
456 d
678 f

id value

123 a
123 b
456 c
456 d
678 e
678 f

I tried to do a proc sql statement that just brought back 678 f for example because it had the maximum values of all values on the table.

Any assistance would be appreciated.

deleted_user · Posted 12-18-2007 10:05 AM

something like (assuming key is number and letter is ID )

data data;
input key id $ ;
cards;
123 a
123 b
456 c
456 d
678 e
678 f
;
proc sql ;
create table top_keys as
select * from data
group by key
having ID= max(ID) ;
quit;

Won't work on many (non-SAS) platforms and SAS refers to remerging summary with detail.
You may have to be concerned about cases where the largest ID is not unique within a key.

Reeza · Posted 07-12-2011 05:49 PM

Add the group by statement to your sql code.

If you're doing in PROC SQL without pass through

proc sql;

create table top_keys as

select key, max(id) as max_value

from have

group by id;

quit;

I don't know why Peter is using the having clause, but I don't think you need it unless you want the max for each group in each row as well. If you do then can remerge in a second sql step to avoid the issue Peter has mentioned.

Generally, any variables in the group by statement can be included in the select statement.

saspert · Posted 07-18-2013 12:06 PM

Hi,

I think I have the same question and similar code but it does not seem to be working.

PROC SQL;

CREATE TABLE WORK.TD_TST AS

SELECT t1.id,

t1.read_time,

(MAX(t1.read_time)) FORMAT=DATETIME22.3 AS MAX_of_read_time,

t1.pmax_derated,

/* start_date */

(intnx('HOUR',TODAY(),-12)

) LABEL="start_date" AS start_date

FROM PBGMES.td_header t1

WHERE

t1.id ='1...4'

GROUP BY t1.sub_id,

t1.pmax_derated,

(CALCULATED start_date)

/* having t1.read_time = (MAX(t1.read_time))*/

;

QUIT;

id read_time max_read_time pmax_derated start_date

130716041724 17JUL2013:13:07:49.000 17JUL2013:13:07:49.000 0 -25200

130716041724 17JUL2013:14:54:01.000 17JUL2013:14:54:01.000 89.420417786 -25200

Thanks,

saspert

LarryWorley · Posted 07-18-2013 12:44 PM

I am not sure why you believe the query is not working, I see two records with the same sub_id and different pmax_derated values, so group by could be working. But I wonder if the calculation of start_date is what you intended.

I see that you use 'hour' as the interval in intnx and then specify today's date as the start-from.. The today() function returns a date value; so there is a mis-match here.

If you are trying to specify 12:00 noon yesterday as the start time, then you will need something like (intnx('HOUR',dhms(TODAY(),0,0,),-12) The DHMS converts today's date value to a datetime value which the 'HOUR' interval will work on.

Hope this helps.

saspert · Posted 07-18-2013 01:56 PM

I just followed my colleagues without thinking completely. I think if I remove the field pmax_derated in the group by field, I get what I want - single record for sub_id.

selecting maximum value for multiple variables on a table