Quartz | Level 8

## How to select all columns but filter out on certain columns

Hi,

I have table_1 with columns (A, B, C, D, E).

For the rows within table_1, I would like to select rows of data where A, B, C are unique, but I also plan to keep information for D & E.

How can I do this?

Will this one work?

Proc Sql;

Create Table table_2 as select * from table_1

left join

(select distinct A, B, C from table_1) m

on table_1. A = m.A and table_1.B = m.B and table_1.C = m.C;

Quit;

Thank you!

8 REPLIES 8
Super User

## Re: How to select all columns but filter out on certain columns

You really should provide some example input data and the desired result.

"Unique combination" tends to imply a single result but when you say "plan to keep information for D & E." which values do you want?

if the example data looked like

A          B                  D                      E

Avalue Bvalue Cvalue somethingforD somethingforE

Avalue Bvalue Cvalue otherDvalue      otherEvalue

Which values from D and E would you want?

Quartz | Level 8

## Re: How to select all columns but filter out on certain columns

Thanks for reminding.

For example,

Table_1:

A   B    C   D   E

1   2     2   4    5

1   2     3   6    7

1   2     3   8    2

In table_2:    we only have first two rows. Since the third row has same (A, B, C) value with the second one.

Super User

## Re: How to select all columns but filter out on certain columns

If table_1 is sorted by A B C then this may do what you want:

```data table_2;
set table_1;
by a b c;
if first.c;
run;```

If there is some actual criteria involved for choice of D and E  you need to show it.

Quartz | Level 8

## Re: How to select all columns but filter out on certain columns

Sorry I may not provide a good example.

Table_1, example as below:

A   B    C   D   E

1   2     2   4    5

1   2     3   6    7

2   1     4   5    8

1   2     3   8    2

Table_2, we will have the following:

A   B    C   D   E

1   2     2   4    5

1   2     3   6    7

2   1     4   5    8

Because the last row in table_1 is same as the third one because they have same information on (A, B and C)

I was wondering in your code,

will this part 'first.c' filter on 'A, B and C' information?

Thanks!

Opal | Level 21

## Re: How to select all columns but filter out on certain columns

@Crubal

Why don't you use your sample data and run it with the code @ballardw provided?

In doing so: Is the result what you want? And if not how should the desired result using your sample data look like?

Super User

## Re: How to select all columns but filter out on certain columns

@Crubal wrote:

Sorry I may not provide a good example.

Table_1, example as below:

A   B    C   D   E

1   2     2   4    5

1   2     3   6    7

2   1     4   5    8

1   2     3   8    2

Table_2, we will have the following:

A   B    C   D   E

1   2     2   4    5

1   2     3   6    7

2   1     4   5    8

Because the last row in table_1 is same as the third one because they have same information on (A, B and C)

I was wondering in your code,

will this part 'first.c' filter on 'A, B and C' information?

Thanks!

If the data is sorted by the variables A, B and C and you use a BY A B C statement in a data step when each record is processed the SAS adds automatic values that are true for the First and Last for each variable in combination on the By statement. Using "if first.C" checks if the current record has the first of a group of values for the variable C within the current combination of values of A and B and only keeps records where first.C is true.

Try sorting your data and running the example code and check to see if the result is as needed. If you don't like the resulting values of D and E you'll have to explain why not with a rule.

PROC Star

## Re: How to select all columns but filter out on certain columns

If you sort the data by ABC, but tell proc sort to segregate the unique keys, you will solve this problem:

``````proc sort data=have out=_null_ nouniquekey  uniout=want;
by a b c;
run;``````
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Quartz | Level 8

## Re: How to select all columns but filter out on certain columns

Thanks!

Discussion stats
• 8 replies
• 842 views
• 1 like
• 4 in conversation