BookmarkSubscribeRSS Feed
geds133
Calcite | Level 5

I am trying to group the towns and Area_segment so that I have a number of customers for each area segment and town. However there are duplicate results in my table which means the group by has not grouped the desired variables. The code is a s follows:

 

proc sql;

---------------------------------------------------------------------------------------------
CREATE TABLE WORK.Qopenbranch AS

---------------------------------------------------------------------------------------------
SELECT
case when areasegment = ' ' then 'Unknown' else areasegment end as Area_Segment,
COUNT( calculated Area_Segment) AS nocustomers,
town

---------------------------------------------------------------------------------------------
FROM WORK.QProductList AS a LEFT JOIN Cn.Branch AS b

---------------------------------------------------------------------------------------------
ON
a.branchcode = b.branchcode

---------------------------------------------------------------------------------------------
GROUP BY
town, Area_Segment;

---------------------------------------------------------------------------------------------
quit;

 

Could someone explain?

 

Thanks

4 REPLIES 4
geds133
Calcite | Level 5

I cannot provide example data as it is confidential. Apologies, I am new to SAS so am unsure how this should be laid out.

Kurt_Bremser
Super User

Take small subsets of your data, just enough to illustrate the issue.

Anonymize the necessary columns, then post here.

 

SAS code is data driven, without knowing the data it's next to impossible to diagnose anything.

FreelanceReinh
Jade | Level 19

@geds133 wrote:

However there are duplicate results in my table which means the group by has not grouped the desired variables.


Maybe the "duplicates" only look like duplicates, but in fact there are minor differences between the (character) GROUP BY variable values, e.g. due to leading blanks or invisible characters. (With numeric grouping variables tiny rounding errors could cause the differences.)

 

Example:

proc print data=test;
run;

Result:

Obs        x    state      town

 1      -0.1     NY      New York
 2      22.2     NY      New York
 3      22.8     NY      New York
proc sql;
select state, town, sum(x) as total
from test
group by state, town;
quit;

Result:

state  town                     total
-------------------------------------
NY     New York                 -0.1
NY     New York                  22.8
NY     New York                  22.2

No aggregation has occurred, but aren't there duplicate state-town combinations?

 

Solution:

Spoiler
data test;
length x 8 state $3 town $20;
input x 4. town $char20.;
if _n_=2 then substr(town,4,1)='A0'x;
state='NY'||byte(x+10);
cards;
-0.1 New York
22.2New York
22.8New York
;

proc print data=test;
format state $hex6. town $hex18.;
run;  

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1187 views
  • 0 likes
  • 3 in conversation