Help using Base SAS procedures

PROC SQL - Using a calculated column in WHERE

Reply
Occasional Contributor
Posts: 15

PROC SQL - Using a calculated column in WHERE

I am trying to use calculated columns in my WHERE statement. From what I researched online, the below should work but I keep getting this error:


ERROR: Summary functions are restricted to the SELECT and HAVING clauses only.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.


Below is my code. Did I code wrong? I want to aggregate the transaction amount for each bank between the two dates. Thank you for your suggestions!
========================================================
proc sql;
create table TEMP2 as
select
bank_id /* bank id */
, MAX(tran_dt) as max_tran_dt /* latest transaction date */
, MIN(tran_dt) as min_tran_dt /* earliest transaction date */
, SUM(tran_am) as sum_tran_am /* total transaction amount */

from TEMP2
where tran_dt between CALCULATED min_tran_dt and CALCULATED max_tran_dt
group by bank_id
order by bank_id;
quit;
PROC Star
Posts: 7,356

Re: PROC SQL - Using a calculated column in WHERE

Could you just use the suggestion SAS provided in your log, namely use having rather than where?

Of course, I, personally, wouldn't try to create temp2 from temp2 thus in my example, below, create temp3:

data temp2;
set sashelp.class (rename=(
age=bank_id
height=tran_dt
weight=tran_am
));
run;
proc sql;
create table TEMP3 as
select
bank_id /* bank id */
, MAX(tran_dt) as max_tran_dt /* latest transaction date */
, MIN(tran_dt) as min_tran_dt /* earliest transaction date */
, SUM(tran_am) as sum_tran_am /* total transaction amount */

from TEMP2
group by bank_id
having tran_dt between CALCULATED min_tran_dt
and CALCULATED max_tran_dt
order by bank_id
;
quit;

HTH,
Art
Regular Contributor
Posts: 241

Re: PROC SQL - Using a calculated column in WHERE

@SAS_user_n: You are trying to do something quite odd -- It seems that you are calculating the range of transaction dates just to select transactions whose date is between the two. Which is an unusually round about way of calculating the amount sub-totals by id.

The only possible reason why you may be doing this is to exclude amounts whose date is missing. If this is the case, then I would do this directly as below. Better yet, you can use proc means/summary. Hope this helps.
[pre]
data one;
input id tx am;
cards;
1 1 10
1 2 20
1 . 30
2 3 40
2 4 50
2 5 60
;
run;

/* proc sql */
proc sql;
select id, sum(am) as totalAm
from one
where not missing(tx)
group by id
order by id;
quit;
/*
id totalAm
------------------
1 30
2 150
*/


/* get sum of amounts by id */
proc sort data=one;
by id;
run;
proc summary data=one;
var am;
by id;
output out=stats(drop=_Smiley Happy sum(am)=totalAm;
where not missing(tx);
run;
/* check */
proc print data=stats noobs;
run;
/* on lst
total
id Am
1 30
2 150
*/
[/pre]
Ask a Question
Discussion stats
  • 2 replies
  • 3862 views
  • 0 likes
  • 3 in conversation