BookmarkSubscribeRSS Feed
hopeless
Calcite | Level 5
Let's say I have four samples: id=1, 2, 3, and 4, with one or more measurements on each of those samples

id value
1 1
1 2
2 3
2 -4
3 -5
4 6

I want to remove duplicates, keeping only one entry per ID - the one having the largest absolute value of the "value" column. I.e., this is what I want:

id value
1 2
2 -4
3 -5
4 6

How might I do this on 8.3 eg?

3 REPLIES 3
Tom
Super User Tom
Super User

I doubt Enterprise Guide has any built in module to generate the SAS code to do this.  You will probably have to actually write the SAS code yourself instead of pointing and clicking your way through Enterprise Guide.

 

You can order by the absolute value and pick the largest one.  What do you do do if there are ties, especially ties of opposite signs?  This code will pick the positive one.

proc sql;
create table step1 as 
select *,abs(value) as absolute
order by id,absolute,value
;
quit;

data want;
  set step1;
  by id absolulte value;
  if last.id;
run;
s_lassen
Meteorite | Level 14

Given the input and output data shown (input data sorted by ID), you can get the result you want like this:

data want;
  do until(last.id);
    set have;
    by id;
    if abs(max)<abs(value) then
      max=value;
    end;
  value=max;
  drop max;
run;

In this loop we just remember the value that was largest in absolute terms, and use that in the final output.

 

However, the title of your request says "keeping entry with largest absolute value" which is not exactly the same. If your desired entry has other variables that you also want to keep, e.g.

 

data have;
input id value name $;
cards;
1 1 Claus
1 2 Claudio
2 3 Peter
2 -4 Sam
3 -5 uncle
4 6 Joe
;run;

and the data you want look like this

 

 

1 2 Claudio
2 -4 Sam
3 -5 uncle
4 6 Joe

it can be done in a single data step this way:

 

 

data want;
  set have;
  by id;
  if first.id or max<abs(value) then do;
	max=abs(value);
    obsno=_N_;
	end;
  retain max;
  if last.id;
  set have point=obsno; /* get relevant record */
  drop max obsno;
run;

Here, instead of just remembering the value we want, we remember the maximal absolute value, and a pointer to the first observation with that absolute value. And at the end we use the pointer to retrieve the relevant record.

 

 

Patrick
Opal | Level 21

For below sample data which value would you want to keep for id=3? 5 or minus 5 or both?

data have;
  infile datalines truncover;
  input id value;
  datalines;
1 1
1 2
2 3
2 -4
3 -5
3 1
3 5
4 6
;

proc sql;
/*  create table want as*/
  select distinct *
  from have
  group by id
  having max(abs(value))=abs(value)
  ;
quit;

 

Patrick_0-1669705266291.png

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 509 views
  • 2 likes
  • 4 in conversation