I am trying to group the towns and Area_segment so that I have a number of customers for each area segment and town. However there are duplicate results in my table which means the group by has not grouped the desired variables. The code is a s follows:
proc sql;
---------------------------------------------------------------------------------------------
CREATE TABLE WORK.Qopenbranch AS
---------------------------------------------------------------------------------------------
SELECT
case when areasegment = ' ' then 'Unknown' else areasegment end as Area_Segment,
COUNT( calculated Area_Segment) AS nocustomers,
town
---------------------------------------------------------------------------------------------
FROM WORK.QProductList AS a LEFT JOIN Cn.Branch AS b
---------------------------------------------------------------------------------------------
ON
a.branchcode = b.branchcode
---------------------------------------------------------------------------------------------
GROUP BY
town, Area_Segment;
---------------------------------------------------------------------------------------------
quit;
Could someone explain?
Thanks
Please supply example data for your datasets. Use data steps with datalines, so we can easily recreate your datasets for testing.
I cannot provide example data as it is confidential. Apologies, I am new to SAS so am unsure how this should be laid out.
Take small subsets of your data, just enough to illustrate the issue.
Anonymize the necessary columns, then post here.
SAS code is data driven, without knowing the data it's next to impossible to diagnose anything.
@geds133 wrote:
However there are duplicate results in my table which means the group by has not grouped the desired variables.
Maybe the "duplicates" only look like duplicates, but in fact there are minor differences between the (character) GROUP BY variable values, e.g. due to leading blanks or invisible characters. (With numeric grouping variables tiny rounding errors could cause the differences.)
Example:
proc print data=test;
run;
Result:
Obs x state town 1 -0.1 NY New York 2 22.2 NY New York 3 22.8 NY New York
proc sql;
select state, town, sum(x) as total
from test
group by state, town;
quit;
Result:
state town total ------------------------------------- NY New York -0.1 NY New York 22.8 NY New York 22.2
No aggregation has occurred, but aren't there duplicate state-town combinations?
Solution:
data test;
length x 8 state $3 town $20;
input x 4. town $char20.;
if _n_=2 then substr(town,4,1)='A0'x;
state='NY'||byte(x+10);
cards;
-0.1 New York
22.2New York
22.8New York
;
proc print data=test;
format state $hex6. town $hex18.;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.