selecting maximum value for multiple variables on a table

Reply
New Contributor
Posts: 4

selecting maximum value for multiple variables on a table

I have a table I want to select the maximum id for mulitple records and am having problems figuring out the code. For example what I would like from the list below is

123 b
456 d
678 f

id value

123 a
123 b
456 c
456 d
678 e
678 f

I tried to do a proc sql statement that just brought back 678 f for example because it had the maximum values of all values on the table.

Any assistance would be appreciated.
N/A
Posts: 0

Re: selecting maximum value for multiple variables on a table

something like (assuming key is number and letter is ID )

data data;
input key id $ ;
cards;
123 a
123 b
456 c
456 d
678 e
678 f
;
proc sql ;
create table top_keys as
select * from data
group by key
having ID= max(ID) ;
quit;

Won't work on many (non-SAS) platforms and SAS refers to remerging summary with detail.
You may have to be concerned about cases where the largest ID is not unique within a key.
Grand Advisor
Posts: 17,312

selecting maximum value for multiple variables on a table

Add the group by statement to your sql code.

If you're doing in PROC SQL without pass through

proc sql;

create table top_keys as

select key, max(id) as max_value

from have

group by id;

quit;

I don't know why Peter is using the having clause, but I don't think you need it unless you want the max for each group in each row as well. If you do then can remerge in a second sql step to avoid the issue Peter has mentioned.

Generally, any variables in the group by statement can be included in the select statement.

Super Contributor
Posts: 268

Re: selecting maximum value for multiple variables on a table

Hi,

I think I have the same question and similar code but it does not seem to be working.

PROC SQL;

   CREATE TABLE WORK.TD_TST AS

   SELECT t1.id,

          t1.read_time,

            (MAX(t1.read_time)) FORMAT=DATETIME22.3 AS MAX_of_read_time,

          t1.pmax_derated,

          /* start_date */

            (intnx('HOUR',TODAY(),-12)

            ) LABEL="start_date" AS start_date

      FROM PBGMES.td_header t1

    WHERE

    t1.id ='1...4'

      GROUP BY t1.sub_id,

               t1.pmax_derated,

               (CALCULATED start_date)

/*      having t1.read_time = (MAX(t1.read_time))*/

;

QUIT;

id                   read_time                              max_read_time                         pmax_derated      start_date

130716041724      17JUL2013:13:07:49.000      17JUL2013:13:07:49.000      0                          -25200

130716041724      17JUL2013:14:54:01.000      17JUL2013:14:54:01.000      89.420417786 -25200

Thanks,

saspert

Frequent Contributor
Posts: 129

Re: selecting maximum value for multiple variables on a table

I am not sure why you believe the query is not working, I see two records with the same sub_id and different pmax_derated values, so group by could be working.  But I wonder if the calculation of start_date is what you intended.

I see that you use 'hour' as the interval in intnx and then specify today's date as the start-from..  The today() function returns a date value; so there is a mis-match here.

If you are trying to specify 12:00 noon yesterday as the start time, then you will need something like  (intnx('HOUR',dhms(TODAY(),0,0,),-12) The DHMS converts today's date value to a datetime value which the 'HOUR' interval will work on.


Hope this helps.

Super Contributor
Posts: 268

Re: selecting maximum value for multiple variables on a table

Smiley Happy I just followed my colleagues without thinking completely. I think if I remove the field pmax_derated in the group by field, I get what I want - single record for sub_id.

Ask a Question
Discussion stats
  • 5 replies
  • 17587 views
  • 0 likes
  • 5 in conversation