BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
anandmgjsa
Fluorite | Level 6

Hi ,

 

I am getting duplicate rows in data. The result is correct but it's generating same rows multiple times.

The code is :

proc sql;

create table final as

select month_year , count(distinct ccsid) as unique_visitor

from login_old

where ccsid ne '  ' 

group by month_year;

quit;

 

I am getting result as below :

2020-04-01    12989

2020-04-01    12989

2020-04-01    12989

2020-04-01    12989

2020-05-01    17894

2020-05-01    17894

2020-05-01    17894

2020-05-01    17894

I want only one row for each result. Duplication is not needed.

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
Add 'distinct' keyword.

select DISTINCT month_year , count(distinct ccsid) as unique_visitor

View solution in original post

8 REPLIES 8
Reeza
Super User
Check if month_year has a format on it and that's likely causing the issue. You're seeing the dates as the same but they're not really.
You usually don't see that with the day component though, but it's the only reason I can think of for SQL to behave this way. SQL aggregation does not honour SAS formats. You need to apply the format instead.

proc sql;

create table final as

select put(t1.month_year, yymmn6.) as month_year , count(distinct ccsid) as unique_visitor

from login_old t1

where ccsid ne ' '

group by put(t1.month_year, yymmn6.);

quit;
anandmgjsa
Fluorite | Level 6
It's throwing error
error : Numeric format YYMMN in put function requires a numeric argument.
Reeza
Super User
What's the type of your month_year variable?
anandmgjsa
Fluorite | Level 6
This is character.
Reeza
Super User
Then that shouldn't be happening. To confirm, this is your full query and your data is exactly as shown? I would expect this for example if you had other columns included or an * somewhere but I don't see that. So it seems like SQL isn't considering your values to be the same group for some reason. It could be you have invisible characters on the date field somehow messing it up.
Ksharp
Super User
Add 'distinct' keyword.

select DISTINCT month_year , count(distinct ccsid) as unique_visitor
Tom
Super User Tom
Super User

Check the field month_year for leading spaces.

Or invisible characters like TAB ('09'x) CR ('0D'x) LF ('0A'x) non-breaking space ('A0'x) null ('00'x)

Or some of the hyphens are not hyphens but other character like en-dash or em-dash.

Tom
Super User Tom
Super User

You sure you didn't include some third variable in the SELECT list of variables?

If you include non-grouping and non-aggregate values then SAS will remerge the aggregate values back on the results.

SAS will include a note in the log:

1512  proc sql;
1513  select sex, mean(age) as mean_age, height
1514  from sashelp.class
1515  group by sex
1516  ;
NOTE: The query requires remerging summary statistics back with the original data.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 566 views
  • 6 likes
  • 4 in conversation