BookmarkSubscribeRSS Feed
keen_sas
Quartz | Level 8
Have embedded my COMMENTS in the codes, to get an understanding of my requirement.

/*8 observation from Shoes dataset*/ data shoes; set sashelp.shoes; if region='Canada' and Subsidiary='Montreal'; run ; /*8 observations from org dataset since same data replicated*/ data org; set sashelp.shoes; if region='Canada' and Subsidiary='Montreal'; run ; /*TASK: JOINING both the datasets to get the count of PRODUCT variable and storing the result in CNT variable using COUNT FUNCTION. Since there are only 8 observation the CNT should also be 8,but because of CARTESIAN product CNT variable is showing 64 . Different results with different usage of DISTINCT function*/ /*Scenario 1 Not using DISTINCT getting 64 observations in the dataset and CNT varaible has 64 observations*/ proc sql noprint; create table want1 as select org.region ,shoes.product ,shoes.Subsidiary,shoes.stores, count(shoes.product) as cnt from org as org left join shoes as shoes on org.region = shoes.region group by org.region order by org.region; quit; /*Scenario 2 using DISTINCT getting 8 observations and CNT variable has 64 observations */ proc sql noprint; create table want2 as select distinct org.region ,shoes.product ,shoes.Subsidiary,shoes.stores, count(shoes.product) as cnt from org as org left join shoes as shoes on org.region = shoes.region group by org.region order by org.region; quit; /*Scenario 3: using DISTINCT at COUNT level getting 64 observations and CNT variable has 8 observations */ proc sql noprint; create table want3 as select org.region ,shoes.product ,shoes.Subsidiary,shoes.stores, count(distinct(shoes.product)) as cnt from org as org left join shoes as shoes on org.region = shoes.region group by org.region order by org.region; quit; /*Scenario 4: using DISTINCT at COUNT level getting 8 observations and CNT variable has 64 observations */ proc sql noprint; create table want4 as select distinct org.region ,shoes.product ,shoes.Subsidiary,shoes.stores, count(distinct(shoes.product)) as cnt from org as org left join shoes as shoes on org.region = shoes.region group by org.region order by org.region; quit; My requirement is NOT TO use DISTINCT at COUNT function but to get the COUNT(CNT) value as 8 since the actual count is
only 8 in the dataset. If we use distinct at COUNT FUNCTION we will get 8 . But i have customer requirements to use both
DISTINCT and without DISTINCT but my current issues is not to USE DISTINCT at COUNT function
to get 8 as CNT value though 64 observations are present. Any other alternative/solution for this.
3 REPLIES 3
Kurt_Bremser
Super User

Your join will always end up as a cartesian join, because that's how SQL works; when you have M tuples for a key in one dataset and N tuples in the other, the join will give you M*N tuples as result. Unless you prevent that with DISTINCT.

Depending on the actual form of the join, you should contemplate using a data step merge, or prepare your datasets in a way that you'l have a M to 1 join instead of a M to N.

Reeza
Super User

Do you have to use SQL?

I don't think the fix is the count/distinct, it's in the joins most likely - you need to make it a distinct join which usually means adding a join condition. Can you post an example of your org table?

 

 

s_lassen
Meteorite | Level 14

I think what you want to do is something like this:

proc sql noprint;
   create table want as select distinct org.region ,shoes.product ,shoes.Subsidiary,shoes.stores,shoes.cnt
  	 	from org as org left join 
      (select *,count(*) as cnt from shoes) shoes
   on org.region = shoes.region	    	     
     group by org.region
     order by org.region;
quit;

In other words: do the counting before the join by creating an inner query for shoes.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 539 views
  • 0 likes
  • 4 in conversation