BookmarkSubscribeRSS Feed
geds133
Calcite | Level 5

I am trying to group the towns and Area_segment so that I have a number of customers for each area segment and town. However there are duplicate results in my table which means the group by has not grouped the desired variables. The code is a s follows:

 

proc sql;

---------------------------------------------------------------------------------------------
CREATE TABLE WORK.Qopenbranch AS

---------------------------------------------------------------------------------------------
SELECT
case when areasegment = ' ' then 'Unknown' else areasegment end as Area_Segment,
COUNT( calculated Area_Segment) AS nocustomers,
town

---------------------------------------------------------------------------------------------
FROM WORK.QProductList AS a LEFT JOIN Cn.Branch AS b

---------------------------------------------------------------------------------------------
ON
a.branchcode = b.branchcode

---------------------------------------------------------------------------------------------
GROUP BY
town, Area_Segment;

---------------------------------------------------------------------------------------------
quit;

 

Could someone explain?

 

Thanks

4 REPLIES 4
geds133
Calcite | Level 5

I cannot provide example data as it is confidential. Apologies, I am new to SAS so am unsure how this should be laid out.

Kurt_Bremser
Super User

Take small subsets of your data, just enough to illustrate the issue.

Anonymize the necessary columns, then post here.

 

SAS code is data driven, without knowing the data it's next to impossible to diagnose anything.

FreelanceReinh
Jade | Level 19

@geds133 wrote:

However there are duplicate results in my table which means the group by has not grouped the desired variables.


Maybe the "duplicates" only look like duplicates, but in fact there are minor differences between the (character) GROUP BY variable values, e.g. due to leading blanks or invisible characters. (With numeric grouping variables tiny rounding errors could cause the differences.)

 

Example:

proc print data=test;
run;

Result:

Obs        x    state      town

 1      -0.1     NY      New York
 2      22.2     NY      New York
 3      22.8     NY      New York
proc sql;
select state, town, sum(x) as total
from test
group by state, town;
quit;

Result:

state  town                     total
-------------------------------------
NY     New York                 -0.1
NY     New York                  22.8
NY     New York                  22.2

No aggregation has occurred, but aren't there duplicate state-town combinations?

 

Solution:

Spoiler
data test;
length x 8 state $3 town $20;
input x 4. town $char20.;
if _n_=2 then substr(town,4,1)='A0'x;
state='NY'||byte(x+10);
cards;
-0.1 New York
22.2New York
22.8New York
;

proc print data=test;
format state $hex6. town $hex18.;
run;  

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1145 views
  • 0 likes
  • 3 in conversation