SELECT DISTINCT ON MULTIPLE COLUMNS, SPECIFY COLUMNS TO BE RETAINED

Accepted Solution Solved
Reply
Contributor
Posts: 40
Accepted Solution

SELECT DISTINCT ON MULTIPLE COLUMNS, SPECIFY COLUMNS TO BE RETAINED

This is a simple question :

I want to create a table selecting distinct observations  from an existing table based on combination of three columns but I want to retain some more variables from the original table.

something like :

proc sql;

    CREATE table want as

    SELECT DISTINCT on (var1, var2, var3) (keep = var1, var2, var3, var6, var7, var8)

    FROM have

    ORDER by var1;

quit;

var6, var7 and var8 contain the values I want to keep in my output table. what is the correct format for doing this ?

thanks,

Nikhil


Accepted Solutions
Solution
‎02-10-2015 12:53 PM
Super User
Posts: 10,500

Re: SELECT DISTINCT ON MULTIPLE COLUMNS, SPECIFY COLUMNS TO BE RETAINED

Depending on how you created the summaries I might be tempted to go back a step or two as it would seem that you went to extra work to get the summaries attached to the VAR1, 2 and 3 list.

Since apparently the value for VAR6 etc are duplicated then the simplest SQL solution would look something like:

proc sql;

     create table want as

     select distinct var1,var2,var3, var6,var7,var8

     from have;

quit;

View solution in original post


All Replies
Super User
Posts: 10,500

Re: SELECT DISTINCT ON MULTIPLE COLUMNS, SPECIFY COLUMNS TO BE RETAINED

Your looking at a join but if VAR1, 2 and 3 are repeated WHICH values of VAR6, 7 and 8 do you want?

Or possibly

proc sort data=have out= want nodupkey, by var1 var2 var3;run;

which will have some value for var6, 7 and 8 but which ones?

Contributor
Posts: 40

Re: SELECT DISTINCT ON MULTIPLE COLUMNS, SPECIFY COLUMNS TO BE RETAINED

They way I have summarized them, the vals in var6, var7 and var8 are all same for distinct combinations of var1, var2 and var3. So I don't care which value it retains . .

Solution
‎02-10-2015 12:53 PM
Super User
Posts: 10,500

Re: SELECT DISTINCT ON MULTIPLE COLUMNS, SPECIFY COLUMNS TO BE RETAINED

Depending on how you created the summaries I might be tempted to go back a step or two as it would seem that you went to extra work to get the summaries attached to the VAR1, 2 and 3 list.

Since apparently the value for VAR6 etc are duplicated then the simplest SQL solution would look something like:

proc sql;

     create table want as

     select distinct var1,var2,var3, var6,var7,var8

     from have;

quit;

Contributor
Posts: 40

Re: SELECT DISTINCT ON MULTIPLE COLUMNS, SPECIFY COLUMNS TO BE RETAINED

Yes you are right. I am wasting computing resources doing the same calculations again & again. I'll change the order of sub-setting & calculations.

Contributor
Posts: 40

Re: SELECT DISTINCT ON MULTIPLE COLUMNS, SPECIFY COLUMNS TO BE RETAINED

Okay it was a silly question I think. I've just chosen all the columns i want to keep in the select distinct statement.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 23829 views
  • 3 likes
  • 2 in conversation