BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ilikesas
Barite | Level 11

Hi,

suppose I have the following file:

NameDatesomeinfo
A2010a
A2010a
A2009b
A2009b

What I would like to get is:

NameDatesomeinfo
A2010a
A2009b

That is, I would like to get all the columns from my table but with the name/date combination being distinct

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Hope that helps:

data have;

  infile datalines truncover;

  input name $ date someinfo $;

  datalines;

A 2010 a

A 2010 a

A 2009 b

A 2009 b

;

run;

proc sql;

  create table want1 as

  select distinct *

  from have;

quit;

proc sort data=have out=want2 nodupkey;

  by _all_;

run;

proc sort data=have out=inter3;

  by name date someinfo;

run;

data want3;

  set inter3;

  by name date someinfo;

  if first.someinfo;

run;

And a fourth option:

data _null_;

  if 0 then set have;

  dcl hash h1(dataset:'have');

  _rc=h1.defineKey(all:'y');

  _rc=h1.defineData(all:'y');

  _rc=h1.defineDone();

  _rc=h1.output(dataset:'Want4');

  stop;

run;

View solution in original post

8 REPLIES 8
Patrick
Opal | Level 21

Proc sql; select distinct ....

ilikesas
Barite | Level 11

Hi Patrick and thanks for the reply.

I did the select distinct but it only gave me the 2 selected columns, but what I would like to get if possible is all the columns (like in the small example that I put in the question), in other words to delete all the name/date combination duplicates

I also tried to do

data board_summary6;

set board_summary6;  

by comp_name date;

if first.comp_name and first.date;

run;

but got an error message ...

Thank you

ilikesas
Barite | Level 11

I actually could distinct select all the columns of interest but the thing is that since there are many of them I thought if its possible to make a shortcut by selecting the entire row corresponding to the name/date combination...

Patrick
Opal | Level 21

Hope that helps:

data have;

  infile datalines truncover;

  input name $ date someinfo $;

  datalines;

A 2010 a

A 2010 a

A 2009 b

A 2009 b

;

run;

proc sql;

  create table want1 as

  select distinct *

  from have;

quit;

proc sort data=have out=want2 nodupkey;

  by _all_;

run;

proc sort data=have out=inter3;

  by name date someinfo;

run;

data want3;

  set inter3;

  by name date someinfo;

  if first.someinfo;

run;

And a fourth option:

data _null_;

  if 0 then set have;

  dcl hash h1(dataset:'have');

  _rc=h1.defineKey(all:'y');

  _rc=h1.defineData(all:'y');

  _rc=h1.defineDone();

  _rc=h1.output(dataset:'Want4');

  stop;

run;

Ksharp
Super User

And the fifth option:

proc summary data=have nway;
 class _all_;
 output out=want(drop=_:);
run;

And the sixth optioin:

proc freq data=have noprint;
 table name*date*someinfo/list out=want1(drop=count percent) nofreq norow nocol nopercent nocum;
run;

Xia Keshan

ilikesas
Barite | Level 11

Thanks a lot to all, now I have a lot to choose from!!!

fengyuwuzu
Pyrite | Level 9

I think you need to sort with the by variables first.

Tom
Super User Tom
Super User

Not sure what you mean.

Sounds like you want to keep only one record for each distinct by group, but that you have extra non-key variables.

You can just use PROC SORT with the NODUPKEY option.

proc sort data=have out=want NODUPKEY ;

  by name date;

run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 10743 views
  • 9 likes
  • 5 in conversation