Hello,
I have a very strange issue. I am using %DO loops to generate nearly 800 variables in a SQL statement, but SAS is complaining of a syntax error. The strange part is that when I copy and paste the code from the log that is generated as a result of the %DO loops, the code runs perfectly fine! Here is the code:
%macro one_year_ru(pref=pb,end_dt='13OCT2013',end=71,values=%str(30,90,180,270,365,545,730));
%let count = 7;
proc sql;
create table &pref._sales_ru as
select party_id,
%do i = 1 %to &count;
%let value=%qscan(&values,&i,%str(,));
sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_interactions else 0 end) as num_interactions_&value ,
sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_trans else 0 end) as num_trans_&value ,
sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_rtrns else 0 end) as num_rtns_&value ,
sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_store_trans else 0 end) as num_store_trans_&value ,
sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_store_rtrns else 0 end) as num_store_rtrns_&value ,
sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_dir_trans else 0 end) as num_dir_trans_&value ,
sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_dir_rtrns else 0 end) as num_dir_rtrns_&value ,
sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then items_purchased else 0 end) as items_purchased_&value ,
sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_store_items else 0 end) as num_store_items_&value ,
sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_dir_items else 0 end) as num_dir_items_&value ,
sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then net_sales else 0 end) as net_sales_&value ,
sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then net_store_sales else 0 end) as net_store_sales_&value ,
sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then net_dir_sales else 0 end) as net_dir_sales_&value ,
%do j=1 %to &end.;
max(case when tran_date between &end_dt.d-&value. and &end_dt.d then if&j. else 0 end) as if&j._&value ,
sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then iq&j. else 0 end) as iq&j._&value ,
%end;
%end;
max(tran_date) as last_purch_date
from &pref..sales_oct14
where tran_date between &end_dt.d-730 and &end_dt.d
group by 1
order by 1;
quit;
%mend;
After invoking the macro, with default or changed parameters, SAS gives this error in the log:
num_interactions_30
--
22
--
76
ERROR 22-322: Syntax error, expecting one of the following: ',', LEN, LENGTH.
ERROR 76-322: Syntax error, statement will be ignored.
When I look at the SQL statement generated in the log, there is clearly a COMMA:
MPRINT(ONE_YEAR_RU): create table pb_sales_ru as select party_id, sum(case when tran_date between
'13OCT2013'd-30 and '13OCT2013'd then num_interactions else 0 end) as num_interactions_30 ,
Any help is greatly appreciated!! Thank you!
I suspect it's the quoting character introduced by %QSCAN. SQL sometimes has trouble with parsing when the quoting characters are present. Switch to %SCAN instead and see if that takes care of the problem.
Good luck.
(partial) log would be helpful.
And why are you creating 800 variables? How do yo intend to use them next?
These are windowed variables to be passed into a feature selection algorithm for creating a logistic model.
Do you want more of a partial log than what is at the bottom of the post, or of a different part of the log? If so, here is some of it:
MPRINT(ONE_YEAR_RU): proc sql;
NOTE: Line generated by the macro variable "VALUE".
1 num_interactions_30
--
22
--
76
MPRINT(ONE_YEAR_RU): create table pb_sales_ru as select party_id, sum(case when tran_date between
'13OCT2013'd-30 and '13OCT2013'd then num_interactions else 0 end) as num_interactions_30 , sum(case
when tran_date between '13OCT2013'd-30 and '13OCT2013'd then num_trans else 0 end) as num_trans_30 ,
sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then num_rtrns else 0 end) as
num_rtns_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then num_store_trans
else 0 end) as num_store_trans_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd
then num_store_rtrns else 0 end) as num_store_rtrns_30 , sum(case when tran_date between
'13OCT2013'd-30 and '13OCT2013'd then num_dir_trans else 0 end) as num_dir_trans_30 , sum(case when
tran_date between '13OCT2013'd-30 and '13OCT2013'd then num_dir_rtrns else 0 end) as num_dir_rtrns_30
, sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then items_purchased else 0 end)
as items_purchased_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then
num_store_items else 0 end) as num_store_items_30 , sum(case when tran_date between '13OCT2013'd-30
and '13OCT2013'd then num_dir_items else 0 end) as num_dir_items_30 , sum(case when tran_date between
'13OCT2013'd-30 and '13OCT2013'd then net_sales else 0 end) as net_sales_30 , sum(case when tran_date
between '13OCT2013'd-30 and '13OCT2013'd then net_store_sales else 0 end) as net_store_sales_30 ,
sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then net_dir_sales else 0 end) as
net_dir_sales_30 , max(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then if1 else 0
end) as if1_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then iq1 else 0
end) as iq1_30 , max(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then if2 else 0
end) as if2_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then iq2 else 0
end) as iq2_30 , max(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then if3 else 0
end) as if3_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then iq3 else 0
end) as iq3_30 , max(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then if4 else 0
end) as if4_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then iq4 else 0
end) as iq4_30 , max(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then if5 else 0
end) as if5_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then iq5 else 0
end) as iq5_30 , max(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then if6 else 0
end) as if6_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then iq6 else 0
....
{TONS MORE LOG...}
....
ERROR 22-322: Syntax error, expecting one of the following: ',', LEN, LENGTH.
ERROR 76-322: Syntax error, statement will be ignored.
Yes, is 800 variables the best way to proceed? How are you going to control such a vast amount of variables. A brief look at your code, assign a code value to each row, depending on whether the date is between the required e.g.:
case when tran_date between &end_dt.d-&value. and &end_dt.d then num_interactions else 0 end)
You then have a categorised dataset, which you can simply have one sum() and use a group by on the groups you have just created. Personally I would look again at what you want to acheive as there is likely to be better methods.
I appreciate the input--this is a multi-hundred gigabyte permanent dataset and each operation on it takes roughly an hour or more--a single PROC SQL with windowed variables has proven, over our work with our nearly 20 TB of data, to be the most efficient approach.
We have models that we build which will have starting variable cardinality in the thousands, so the number of variables is not the issue, though the feedback is appreciated.
If anyone has an answer as to why the QUERY in the log that is generated as a result of the macro works perfectly as is, but the macro fails, that would be most appreciated. Thank you.
To further clarify, I have found a workaround by copying/pasting the code generated in the LOG, so I'm not trying to find alternative solutions.
I'm just dying to know why PROC SQL is having an issue processing the %DO loop generated statements. We use this technique frequently in our code and have not had this issue before, so we're just curious if anyone knows why. Thanks!
I suspect it's the quoting character introduced by %QSCAN. SQL sometimes has trouble with parsing when the quoting characters are present. Switch to %SCAN instead and see if that takes care of the problem.
Good luck.
You, good SAS user, are a genius! I typically use a CALL EXECUTE in a DATA _NULL_ to control iterative variables but in this case I moved it inside. Thank you for your helpfulness!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.