BookmarkSubscribeRSS Feed
Ronein
Meteorite | Level 14

Hello friends,

I run the following query in order to calculate max for each group.

Please note that the raw data set is very big (500 million rows).

First question:

I ask to calculate max only for group that is not null or zero.

In resulted data set I see one row with value null (for field Agreement_Account_Id)-WHY???

Second question:

Here are warnings I recieve:

NOTE: SQL generation will be used to perform the initial summarization.
WARNING: The format F was not located on the database. In-database processing will proceed without it.
WARNING: Syntax error or access violation Syntax error: expected something between '(' and the string 'F'.
WARNING: The format F was not located on the database. In-database processing will proceed without it.

What should I do to prevent these warnings?Why these warnings exist?

proc summary data=TeraData.VBM410_ACTIVE_LOAN_BAL(Where=(Agreement_Account_Id not in (0,.)));
var Fund_Actual_Payment_Count;
class Agreement_Account_Id;
output out=VBM410_ACTIVE_LOAN_BAL max=Mis_Tash_Paid;
run;
6 REPLIES 6
JosvanderVelden
SAS Super FREQ
Have you tried to run proc contents to verify columns and formats for the table?
Are you using special options for the NULL teradata table values like for instance dbnull=, nullchar= or nullcharval=?
PaigeMiller
Diamond | Level 26

First question: look at a part of your data WITH YOUR OWN EYES and see if you can figure out why you only get one row in the output. Please look at the log and see what it is telling you.

 

Second question: please learn that we can't help you at all if you show us a partial log, where we cannot see the code as it appears in the log.

--
Paige Miller
ballardw
Super User

It really looks like you are accessing an external DBMS table. Which means that perhaps it doesn't understand what your intent was when passing the . for missing to it.

 

Also, does Teradata (have guess this is the DBMS as you haven't stated) support formats natively? The warning about the F format not found sounds likely to be a pretty fundamental problem especially with this

Syntax error: expected something between '(' and the string 'F'

as there is no place that you are passing a string F  after ( so that may be the symptom from the

Agreement_Account_Id not in (0,.)

 

Just how many unique values of Agreement_Account_Id are there? 500 million "rows" may be a bit much for Class processing and may want to be sorted and use By processing instead.

Reeza
Super User

In a RDMS, null is different than in SAS. 

 

Are you using a Viya installation or some other type that will pass a proc summary to SQL summarization? I cannot recall ever seeing that note for a proc summary/means before.

 

 

Spoiler

@Ronein wrote:

Hello friends,

I run the following query in order to calculate max for each group.

Please note that the raw data set is very big (500 million rows).

First question:

I ask to calculate max only for group that is not null or zero.

In resulted data set I see one row with value null (for field Agreement_Account_Id)-WHY???

Second question:

Here are warnings I recieve:

NOTE: SQL generation will be used to perform the initial summarization.
WARNING: The format F was not located on the database. In-database processing will proceed without it.
WARNING: Syntax error or access violation Syntax error: expected something between '(' and the string 'F'.
WARNING: The format F was not located on the database. In-database processing will proceed without it.

What should I do to prevent these warnings?Why these warnings exist?

proc summary data=TeraData.VBM410_ACTIVE_LOAN_BAL(Where=(Agreement_Account_Id not in (0,.)));
var Fund_Actual_Payment_Count;
class Agreement_Account_Id;
output out=VBM410_ACTIVE_LOAN_BAL max=Mis_Tash_Paid;
run;

Tom
Super User Tom
Super User

You should be able to replace the CLASS statement with a BY statement.

The only difference in the output would be that the CLASS statement without the NWAY option on the PROC SUMMARY statement will include and extra observation with the overall maximum.  If you need it you can generate it by taking the MAX of the resulting summary dataset.

 

So see if that fixes your issue.

 

You might also want to just move the WHERE filtering to apply to the output instead of the input.  There might be a performance hit on the generated Teradata query to summarize those two extra groups, but it should be simple to remove them from the resulting output dataset instead.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1113 views
  • 0 likes
  • 7 in conversation