Feeling really frustrated this morning. Trying to do a very basic proc sql query and my output is not what I want it to be. Instead of having one row for each group, I am getting multiple rows with the same group. In my experience, this issue is a result of remerging - but I am not getting any errors in the log.
Code:
proc sql;
create table disease2 as
select
date_new,
count(*) as diseased
from disease1
group by date_new
order by date_new;
QUIT;
My output:
date_new diseased
202012 1
202101 1
202101 1
202101 1
202101 1
202101 1
202101 1
202101 1
202101 1
202101 2
202101 2
202102 2
202102 2
202102 1
202102 2
202102 2
202102 5
202102 4
202102 3
202102 6
202102 2
Desired output:
date_new diseased
202012 1
202101 12
202102 29
date_new is likely formatted.
SAS SQL doesn't honour formats on your data.
proc sql;
create table disease2 as
select
put(date_new, yymmn6.) as date_new,
count(*) as diseased
from disease1
group by put(date_new, yymmn6.)
order by calculated date_new;
QUIT;
or use PROC FREQ which does use formats.
proc freq data=disease1 noprint;
table date_new / out=disease_counts_by_date nocum nopercent;
run;
@Krysia24 wrote:
Feeling really frustrated this morning. Trying to do a very basic proc sql query and my output is not what I want it to be. Instead of having one row for each group, I am getting multiple rows with the same group. In my experience, this issue is a result of remerging - but I am not getting any errors in the log.
Code:
proc sql;
create table disease2 as
select
date_new,
count(*) as diseased
from disease1
group by date_new
order by date_new;QUIT;
My output:
date_new diseased
202012 1
202101 1
202101 1
202101 1
202101 1
202101 1
202101 1
202101 1
202101 1
202101 2
202101 2
202102 2
202102 2
202102 1
202102 2
202102 2
202102 5
202102 4
202102 3
202102 6
202102 2
Desired output:
date_new diseased
202012 1
202101 12
202102 29
date_new is likely formatted.
SAS SQL doesn't honour formats on your data.
proc sql;
create table disease2 as
select
put(date_new, yymmn6.) as date_new,
count(*) as diseased
from disease1
group by put(date_new, yymmn6.)
order by calculated date_new;
QUIT;
or use PROC FREQ which does use formats.
proc freq data=disease1 noprint;
table date_new / out=disease_counts_by_date nocum nopercent;
run;
@Krysia24 wrote:
Feeling really frustrated this morning. Trying to do a very basic proc sql query and my output is not what I want it to be. Instead of having one row for each group, I am getting multiple rows with the same group. In my experience, this issue is a result of remerging - but I am not getting any errors in the log.
Code:
proc sql;
create table disease2 as
select
date_new,
count(*) as diseased
from disease1
group by date_new
order by date_new;QUIT;
My output:
date_new diseased
202012 1
202101 1
202101 1
202101 1
202101 1
202101 1
202101 1
202101 1
202101 1
202101 2
202101 2
202102 2
202102 2
202102 1
202102 2
202102 2
202102 5
202102 4
202102 3
202102 6
202102 2
Desired output:
date_new diseased
202012 1
202101 12
202102 29
Thank you! I did not know formats were not honored in proc sql. Without an error or warning in the log, I was unsure what I did wrong. Thank you for your help.
I suspect @Reeza is right, date_new values are dates and you want to summarize by month. You must transform the daily dates into month dates (e.g. the date of the first of the month) to calculate the counts.
proc sql;
create table disease2 as
select
intnx("Month", date_new, 0) format=yymmn6. as month_new,
count(*) as diseased
from disease1
group by calculated month_new;
quit;
Note INTNX("Month", date_new, 0) returns the date of the first day of the month.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.