Hi,
I have table_1 with columns (A, B, C, D, E).
For the rows within table_1, I would like to select rows of data where A, B, C are unique, but I also plan to keep information for D & E.
How can I do this?
Will this one work?
Proc Sql;
Create Table table_2 as select * from table_1
left join
(select distinct A, B, C from table_1) m
on table_1. A = m.A and table_1.B = m.B and table_1.C = m.C;
Quit;
Thank you!
You really should provide some example input data and the desired result.
"Unique combination" tends to imply a single result but when you say "plan to keep information for D & E." which values do you want?
if the example data looked like
A B C D E
Avalue Bvalue Cvalue somethingforD somethingforE
Avalue Bvalue Cvalue otherDvalue otherEvalue
Which values from D and E would you want?
Thanks for reminding.
For example,
Table_1:
A B C D E
1 2 2 4 5
1 2 3 6 7
1 2 3 8 2
In table_2: we only have first two rows. Since the third row has same (A, B, C) value with the second one.
If table_1 is sorted by A B C then this may do what you want:
data table_2; set table_1; by a b c; if first.c; run;
If there is some actual criteria involved for choice of D and E you need to show it.
Hi @ballardw
Sorry I may not provide a good example.
Table_1, example as below:
A B C D E
1 2 2 4 5
1 2 3 6 7
2 1 4 5 8
1 2 3 8 2
Table_2, we will have the following:
A B C D E
1 2 2 4 5
1 2 3 6 7
2 1 4 5 8
Because the last row in table_1 is same as the third one because they have same information on (A, B and C)
I was wondering in your code,
will this part 'first.c' filter on 'A, B and C' information?
Thanks!
@Crubal wrote:
Hi @ballardw
Sorry I may not provide a good example.
Table_1, example as below:
A B C D E
1 2 2 4 5
1 2 3 6 7
2 1 4 5 8
1 2 3 8 2
Table_2, we will have the following:
A B C D E
1 2 2 4 5
1 2 3 6 7
2 1 4 5 8
Because the last row in table_1 is same as the third one because they have same information on (A, B and C)
I was wondering in your code,
will this part 'first.c' filter on 'A, B and C' information?
Thanks!
If the data is sorted by the variables A, B and C and you use a BY A B C statement in a data step when each record is processed the SAS adds automatic values that are true for the First and Last for each variable in combination on the By statement. Using "if first.C" checks if the current record has the first of a group of values for the variable C within the current combination of values of A and B and only keeps records where first.C is true.
Try sorting your data and running the example code and check to see if the result is as needed. If you don't like the resulting values of D and E you'll have to explain why not with a rule.
If you sort the data by ABC, but tell proc sort to segregate the unique keys, you will solve this problem:
proc sort data=have out=_null_ nouniquekey uniout=want;
by a b c;
run;
Thanks!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.