- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This is a simple question :
I want to create a table selecting distinct observations from an existing table based on combination of three columns but I want to retain some more variables from the original table.
something like :
proc sql;
CREATE table want as
SELECT DISTINCT on (var1, var2, var3) (keep = var1, var2, var3, var6, var7, var8)
FROM have
ORDER by var1;
quit;
var6, var7 and var8 contain the values I want to keep in my output table. what is the correct format for doing this ?
thanks,
Nikhil
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Depending on how you created the summaries I might be tempted to go back a step or two as it would seem that you went to extra work to get the summaries attached to the VAR1, 2 and 3 list.
Since apparently the value for VAR6 etc are duplicated then the simplest SQL solution would look something like:
proc sql;
create table want as
select distinct var1,var2,var3, var6,var7,var8
from have;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Your looking at a join but if VAR1, 2 and 3 are repeated WHICH values of VAR6, 7 and 8 do you want?
Or possibly
proc sort data=have out= want nodupkey, by var1 var2 var3;run;
which will have some value for var6, 7 and 8 but which ones?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
They way I have summarized them, the vals in var6, var7 and var8 are all same for distinct combinations of var1, var2 and var3. So I don't care which value it retains . .
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Depending on how you created the summaries I might be tempted to go back a step or two as it would seem that you went to extra work to get the summaries attached to the VAR1, 2 and 3 list.
Since apparently the value for VAR6 etc are duplicated then the simplest SQL solution would look something like:
proc sql;
create table want as
select distinct var1,var2,var3, var6,var7,var8
from have;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Okay it was a silly question I think. I've just chosen all the columns i want to keep in the select distinct statement.