01-12-2018 12:17 AM
Simple question probably, but I keep arranging and rearranging distinct and group by in the code and I'm not getting anything different.
I have two tables that share an ID between the two of them. Let's say table a has revenue data associated with each ID and table two has an instance where two titles are associated with the same ID. I use a left join of table an and ask for a distinct ID, and even group by, but I keep getting two row back for that ID, one for each name from table
proc sql; create table tableau.metadata_match as select distinct a.product_id, a.region, b.product_name, a.transactions, a.revenue from table a left join table b on a.product_id = b.product_id;
group by 1, 2 quit;
It returns these rows repeating the revenue. What I really want is only row for this.
01-12-2018 12:30 AM
I see few problem with the sql query.
You should remove group by statement becuase I do not find any aggregate function being used in the select clause.
Then try again.
01-12-2018 11:59 AM
I posted fake data to show an example of the scenario and what output I'm receiving. However, in the real scenario, If I need to specify just one of these, how do I do it by choosing either randomly or based off of occurrence using a date field (first occurrence, most recent occurrence)?
01-12-2018 03:48 AM
The problem is that you still have product in your query and there are 2 different values which gives you 2 rows of data.
If you remove it you will get 1 row.
But if you still need product in your result you need to descide which one to keep, as Andreas says
01-12-2018 11:59 AM
If I need to specify just one of these, how do I do it by choosing either randomly or based off of occurrence using a date field (first occurrence, most recent occurrence)?
01-12-2018 11:39 AM
I think that you may be misunderstanding what the DISTINCT actually attempts to do. It gets distinct combinations of ALL variables on the select statement not just the first variable.