BookmarkSubscribeRSS Feed
Crubal
Quartz | Level 8

Hi,

 

I have table_1 with columns (A, B, C, D, E). 

For the rows within table_1, I would like to select rows of data where A, B, C are unique, but I also plan to keep information for D & E. 

How can I do this?

 

Will this one work? 

Proc Sql;

   Create Table table_2 as select * from table_1 

      left join 

      (select distinct A, B, C from table_1) m

   on table_1. A = m.A and table_1.B = m.B and table_1.C = m.C;

Quit;

 

Thank you! 

8 REPLIES 8
ballardw
Super User

You really should provide some example input data and the desired result.

 

"Unique combination" tends to imply a single result but when you say "plan to keep information for D & E." which values do you want?

if the example data looked like

A          B                  D                      E

Avalue Bvalue Cvalue somethingforD somethingforE

Avalue Bvalue Cvalue otherDvalue      otherEvalue

 

Which values from D and E would you want?

Crubal
Quartz | Level 8

Thanks for reminding. 

 

For example, 

 

Table_1: 

 

A   B    C   D   E

1   2     2   4    5

1   2     3   6    7 

1   2     3   8    2 

 

In table_2:    we only have first two rows. Since the third row has same (A, B, C) value with the second one. 

ballardw
Super User

If table_1 is sorted by A B C then this may do what you want:

data table_2;
   set table_1;
   by a b c;
   if first.c;
run;

If there is some actual criteria involved for choice of D and E  you need to show it.

 

Crubal
Quartz | Level 8

Hi @ballardw

 

Sorry I may not provide a good example. 

 

Table_1, example as below:

A   B    C   D   E

1   2     2   4    5

1   2     3   6    7 

2   1     4   5    8  

1   2     3   8    2 

 

 

Table_2, we will have the following:

A   B    C   D   E

1   2     2   4    5

1   2     3   6    7 

2   1     4   5    8  

 

Because the last row in table_1 is same as the third one because they have same information on (A, B and C)

 

I was wondering in your code, 

will this part 'first.c' filter on 'A, B and C' information? 

 

Thanks!

 

Patrick
Opal | Level 21

@Crubal

Why don't you use your sample data and run it with the code @ballardw provided?

In doing so: Is the result what you want? And if not how should the desired result using your sample data look like?

ballardw
Super User

@Crubal wrote:

Hi @ballardw

 

Sorry I may not provide a good example. 

 

Table_1, example as below:

A   B    C   D   E

1   2     2   4    5

1   2     3   6    7 

2   1     4   5    8  

1   2     3   8    2 

 

 

Table_2, we will have the following:

A   B    C   D   E

1   2     2   4    5

1   2     3   6    7 

2   1     4   5    8  

 

Because the last row in table_1 is same as the third one because they have same information on (A, B and C)

 

I was wondering in your code, 

will this part 'first.c' filter on 'A, B and C' information? 

 

Thanks!

 


If the data is sorted by the variables A, B and C and you use a BY A B C statement in a data step when each record is processed the SAS adds automatic values that are true for the First and Last for each variable in combination on the By statement. Using "if first.C" checks if the current record has the first of a group of values for the variable C within the current combination of values of A and B and only keeps records where first.C is true.

Try sorting your data and running the example code and check to see if the result is as needed. If you don't like the resulting values of D and E you'll have to explain why not with a rule.

mkeintz
PROC Star

If you sort the data by ABC, but tell proc sort to segregate the unique keys, you will solve this problem:

 

proc sort data=have out=_null_ nouniquekey  uniout=want;
  by a b c;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 784 views
  • 1 like
  • 4 in conversation