Have embedded my COMMENTS in the codes, to get an understanding of my requirement.
/*8 observation from Shoes dataset*/
data shoes;
set sashelp.shoes;
if region='Canada' and Subsidiary='Montreal';
run ;
/*8 observations from org dataset since same data replicated*/
data org;
set sashelp.shoes;
if region='Canada' and Subsidiary='Montreal';
run ;
/*TASK: JOINING both the datasets to get the count of PRODUCT variable and storing the result in CNT variable using COUNT FUNCTION. Since there
are only 8 observation the CNT should also be 8,but because of CARTESIAN product CNT variable is showing 64 . Different results
with different usage of DISTINCT function*/
/*Scenario 1 Not using DISTINCT getting 64 observations in the dataset and CNT varaible
has 64 observations*/
proc sql noprint;
create table want1 as select org.region ,shoes.product ,shoes.Subsidiary,shoes.stores,
count(shoes.product) as cnt
from org as org left join
shoes as shoes on org.region = shoes.region
group by org.region
order by org.region;
quit;
/*Scenario 2 using DISTINCT getting 8 observations and CNT variable has 64 observations */
proc sql noprint;
create table want2 as select distinct org.region ,shoes.product ,shoes.Subsidiary,shoes.stores,
count(shoes.product) as cnt
from org as org left join
shoes as shoes on org.region = shoes.region
group by org.region
order by org.region;
quit;
/*Scenario 3: using DISTINCT at COUNT level getting 64 observations and CNT variable has 8 observations */
proc sql noprint;
create table want3 as select org.region ,shoes.product ,shoes.Subsidiary,shoes.stores,
count(distinct(shoes.product)) as cnt
from org as org left join
shoes as shoes on org.region = shoes.region
group by org.region
order by org.region;
quit;
/*Scenario 4: using DISTINCT at COUNT level getting 8 observations and CNT variable has 64 observations */
proc sql noprint;
create table want4 as select distinct org.region ,shoes.product ,shoes.Subsidiary,shoes.stores,
count(distinct(shoes.product)) as cnt
from org as org left join
shoes as shoes on org.region = shoes.region
group by org.region
order by org.region;
quit;
My requirement is NOT TO use DISTINCT at COUNT function but to get the COUNT(CNT) value as 8 since the actual count is
only 8 in the dataset. If we use distinct at COUNT FUNCTION we will get 8 . But i have customer requirements to use both
DISTINCT and without DISTINCT but my current issues is not to USE DISTINCT at COUNT function
to get 8 as CNT value though 64 observations are present. Any other alternative/solution for this.
Your join will always end up as a cartesian join, because that's how SQL works; when you have M tuples for a key in one dataset and N tuples in the other, the join will give you M*N tuples as result. Unless you prevent that with DISTINCT.
Depending on the actual form of the join, you should contemplate using a data step merge, or prepare your datasets in a way that you'l have a M to 1 join instead of a M to N.
Do you have to use SQL?
I don't think the fix is the count/distinct, it's in the joins most likely - you need to make it a distinct join which usually means adding a join condition. Can you post an example of your org table?
I think what you want to do is something like this:
proc sql noprint;
create table want as select distinct org.region ,shoes.product ,shoes.Subsidiary,shoes.stores,shoes.cnt
from org as org left join
(select *,count(*) as cnt from shoes) shoes
on org.region = shoes.region
group by org.region
order by org.region;
quit;
In other words: do the counting before the join by creating an inner query for shoes.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.