BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
D8tAm1n3r_BEN
Calcite | Level 5

Hello,

I have a very strange issue.  I am using %DO loops to generate nearly 800 variables in a SQL statement, but SAS is complaining of a syntax error.  The strange part is that when I copy and paste the code from the log that is generated as a result of the %DO loops, the code runs perfectly fine!  Here is the code:

%macro one_year_ru(pref=pb,end_dt='13OCT2013',end=71,values=%str(30,90,180,270,365,545,730));

%let count = 7;

proc sql;

create table &pref._sales_ru as

select party_id,

  %do i = 1 %to &count;

  %let value=%qscan(&values,&i,%str(,));

  sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_interactions else 0 end) as num_interactions_&value ,

  sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_trans else 0 end) as num_trans_&value ,

  sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_rtrns else 0 end) as num_rtns_&value ,

  sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_store_trans else 0 end) as num_store_trans_&value ,

  sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_store_rtrns else 0 end) as num_store_rtrns_&value ,

  sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_dir_trans else 0 end) as num_dir_trans_&value ,

  sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_dir_rtrns else 0 end) as num_dir_rtrns_&value ,

  sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then items_purchased else 0 end) as items_purchased_&value ,

  sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_store_items else 0 end) as num_store_items_&value ,

  sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then num_dir_items else 0 end) as num_dir_items_&value ,

  sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then net_sales else 0 end) as net_sales_&value ,

  sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then net_store_sales else 0 end) as net_store_sales_&value ,

  sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then net_dir_sales else 0 end) as net_dir_sales_&value ,

  %do j=1 %to &end.;

  max(case when tran_date between &end_dt.d-&value. and &end_dt.d then if&j. else 0 end) as if&j._&value ,

  sum(case when tran_date between &end_dt.d-&value. and &end_dt.d then iq&j. else 0 end) as iq&j._&value ,

  %end;

  %end;

  max(tran_date) as last_purch_date

from &pref..sales_oct14

where tran_date between &end_dt.d-730 and &end_dt.d

group by 1

order by 1;

quit;

%mend;

After invoking the macro, with default or changed parameters, SAS gives this error in the log:

num_interactions_30

                          --

                          22

                           --

                           76

ERROR 22-322: Syntax error, expecting one of the following: ',', LEN, LENGTH.

ERROR 76-322: Syntax error, statement will be ignored.

When I look at the SQL statement generated in the log, there is clearly a COMMA:

MPRINT(ONE_YEAR_RU):   create table pb_sales_ru as select party_id, sum(case when tran_date between

'13OCT2013'd-30 and '13OCT2013'd then num_interactions else 0 end) as num_interactions_30 ,

Any help is greatly appreciated!!  Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

I suspect it's the quoting character introduced by %QSCAN.  SQL sometimes has trouble with parsing when the quoting characters are present.  Switch to %SCAN instead and see if that takes care of the problem.

Good luck.

View solution in original post

7 REPLIES 7
LinusH
Tourmaline | Level 20

(partial) log would be helpful.

And why are you creating 800 variables? How do yo intend to use them next?

Data never sleeps
D8tAm1n3r_BEN
Calcite | Level 5

These are windowed variables to be passed into a feature selection algorithm for creating a logistic model.

Do you want more of a partial log than what is at the bottom of the post, or of a different part of the log?  If so, here is some of it:

MPRINT(ONE_YEAR_RU):   proc sql;

NOTE: Line generated by the macro variable "VALUE".

1        num_interactions_30

                          --

                          22

                           --

                           76

MPRINT(ONE_YEAR_RU):   create table pb_sales_ru as select party_id, sum(case when tran_date between

'13OCT2013'd-30 and '13OCT2013'd then num_interactions else 0 end) as num_interactions_30 , sum(case

when tran_date between '13OCT2013'd-30 and '13OCT2013'd then num_trans else 0 end) as num_trans_30 ,

sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then num_rtrns else 0 end) as

num_rtns_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then num_store_trans

else 0 end) as num_store_trans_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd

then num_store_rtrns else 0 end) as num_store_rtrns_30 , sum(case when tran_date between

'13OCT2013'd-30 and '13OCT2013'd then num_dir_trans else 0 end) as num_dir_trans_30 , sum(case when

tran_date between '13OCT2013'd-30 and '13OCT2013'd then num_dir_rtrns else 0 end) as num_dir_rtrns_30

, sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then items_purchased else 0 end)

as items_purchased_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then

num_store_items else 0 end) as num_store_items_30 , sum(case when tran_date between '13OCT2013'd-30

and '13OCT2013'd then num_dir_items else 0 end) as num_dir_items_30 , sum(case when tran_date between

'13OCT2013'd-30 and '13OCT2013'd then net_sales else 0 end) as net_sales_30 , sum(case when tran_date

between '13OCT2013'd-30 and '13OCT2013'd then net_store_sales else 0 end) as net_store_sales_30 ,

sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then net_dir_sales else 0 end) as

net_dir_sales_30 , max(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then if1 else 0

end) as if1_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then iq1 else 0

end) as iq1_30 , max(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then if2 else 0

end) as if2_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then iq2 else 0

end) as iq2_30 , max(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then if3 else 0

end) as if3_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then iq3 else 0

end) as iq3_30 , max(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then if4 else 0

end) as if4_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then iq4 else 0

end) as iq4_30 , max(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then if5 else 0

end) as if5_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then iq5 else 0

end) as iq5_30 , max(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then if6 else 0

end) as if6_30 , sum(case when tran_date between '13OCT2013'd-30 and '13OCT2013'd then iq6 else 0

....

{TONS MORE LOG...}

....

ERROR 22-322: Syntax error, expecting one of the following: ',', LEN, LENGTH.

ERROR 76-322: Syntax error, statement will be ignored.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Yes, is 800 variables the best way to proceed?  How are you going to control such a vast amount of variables.  A brief look at your code, assign a code value to each row, depending on whether the date is between the required e.g.:

case when tran_date between &end_dt.d-&value. and &end_dt.d then num_interactions else 0 end)


You then have a categorised dataset, which you can simply have one sum() and use a group by on the groups you have just created.  Personally I would look again at what you want to acheive as there is likely to be better methods.

D8tAm1n3r_BEN
Calcite | Level 5

I appreciate the input--this is a multi-hundred gigabyte permanent dataset and each operation on it takes roughly an hour or more--a single PROC SQL with windowed variables has proven, over our work with our nearly 20 TB of data, to be the most efficient approach.

We have models that we build which will have starting variable cardinality in the thousands, so the number of variables is not the issue, though the feedback is appreciated.

If anyone has an answer as to why the QUERY in the log that is generated as a result of the macro works perfectly as is, but the macro fails, that would be most appreciated.  Thank you.

D8tAm1n3r_BEN
Calcite | Level 5

To further clarify, I have found a workaround by copying/pasting the code generated in the LOG, so I'm not trying to find alternative solutions.

I'm just dying to know why PROC SQL is having an issue processing the %DO loop generated statements.  We use this technique frequently in our code and have not had this issue before, so we're just curious if anyone knows why.  Thanks!

Astounding
PROC Star

I suspect it's the quoting character introduced by %QSCAN.  SQL sometimes has trouble with parsing when the quoting characters are present.  Switch to %SCAN instead and see if that takes care of the problem.

Good luck.

D8tAm1n3r_BEN
Calcite | Level 5

You, good SAS user, are a genius!  I typically use a CALL EXECUTE in a DATA _NULL_ to control iterative variables but in this case I moved it inside.  Thank you for your helpfulness!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1043 views
  • 1 like
  • 4 in conversation