SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Create Dataset using DISTINCT values only

Accepted Solution Solved
Reply
Contributor
Posts: 71
Accepted Solution

Create Dataset using DISTINCT values only

[ Edited ]

Hi,

I am using this command to get a count of distinct ids, but how can I leverage this command to create a dataset with  (all the variables - there are more than 100) based on the distinct ID.

 

proc sql;

create table WANT as

select id, count( distinct id) as count

from HAVE

where qtr=1

AND TEST=0

group by id

;

quit;

 

Thanks

 


Accepted Solutions
Solution
‎06-10-2016 02:07 PM
Super User
Posts: 11,343

Re: Create Dataset using DISTINCT values only

 

The * in SQL is short for all variables. So you get distinct combinations of the remaining variables as well. Some of those variables apparently have more than one value for a given ID.

 

If you only want distinct values of one variable which value(s) of other variables do you want?

 

 

View solution in original post


All Replies
Super User
Posts: 11,343

Re: Create Dataset using DISTINCT values only

Proc sql;

   create table want as

   select distinct *

   from have;

run;

 

or name the desired variables after the distinct predicate.

 

Contributor
Posts: 71

Re: Create Dataset using DISTINCT values only

Thank you it did not work.
Super Contributor
Posts: 441

Re: Create Dataset using DISTINCT values only

[ Edited ]

AZIQ1 wrote:
Thank you it did not work.


What happened? Code and log may help.

 

- Jan.

Super User
Posts: 11,343

Re: Create Dataset using DISTINCT values only

If you have a specific requirement the post the input data and the desired output otherwise there is only guessing.

 

"Doesn't work" is not very helpful. Did it produce no output?Unexpected output? Generate an error? Post log including the code and any notes or error messages.

Contributor
Posts: 71

Re: Create Dataset using DISTINCT values only

[ Edited ]

data have;

input id var1 var2 var3 var4 var5 var6 var7 var8 var9 ;

datalines;

1 1 1 1 1 1 1 1 1 1

1 1 1 1 1 1 1 1 1 1

1 1 1 1 1 1 1 1 1 1

2 1 1 1 1 1 1 1 1 1

2 1 0 1 0 1 1 0 0 0

2 1 0 1 0 1 1 0 0 0

2 1 0 1 0 1 1 0 0 0

2 1 0 1 0 1 1 0 0 0

3 1 1 1 1 1 1 1 1 1

3 1 1 1 1 1 1 1 1 1

3 1 1 1 1 1 1 1 1 1

3 1 1 1 1 1 1 1 1 1

3 1 1 1 1 1 1 1 1 1

run;

proc sql;

create table want as

select id, count( distinct id) as count

from have

group by id

;

quit;

This gives me 3 distinct count.

 

proc sql;

create table _1 as

select distinct id, *

from have

group by id

;

quit;

 

This gives me 4 ( I need 3 as in the first code)

Solution
‎06-10-2016 02:07 PM
Super User
Posts: 11,343

Re: Create Dataset using DISTINCT values only

 

The * in SQL is short for all variables. So you get distinct combinations of the remaining variables as well. Some of those variables apparently have more than one value for a given ID.

 

If you only want distinct values of one variable which value(s) of other variables do you want?

 

 

Contributor
Posts: 71

Re: Create Dataset using DISTINCT values only

I selected a second var just like you said and it worked.
Thank you
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 1962 views
  • 0 likes
  • 3 in conversation