Currently, just a few options (they don't all appear when running proc options group=errorhandling) let us decide how messages are displayed in the LOG (I am not too sure what the last 2 do...), like:
Issues a warning message for an unresolved macro reference.
Issues a warning message when a macro variable reference does not match a macro variable.
Specifies the maximum size that user-created formats and informat names can be before an error or warning is issued.
SAS issues an error message and stops processing if the SORT procedure attempts to sort a _NULL_ data set.
Specifies the error level to report when a variable is missing from an input data set during the processing of a DROP=, KEEP=, or RENAME= data set option.
Specifies the error level to report when a variable is missing from an output data set during the processing of a DROP=, KEEP=, or RENAME= data set option.
Issues an error message and stops processing when a SAS data set cannot be found.
Specifies the type of message that is issued when MERGE processing occurs without an associated BY statement.
Specifies the type of message to write to the SAS log when a variable is not initialized.
Specifies the type of message to write to the SAS log when the length of the variable that is being read is longer than the length that is defined for the variable.
Specifies the type of message that PROC DS2 generates.
Displays a transcoding error when illegal values are read from a remote application.
SAS issues an error message when a BY variable exists in one data set but not another when the other data set is _NULL_.
We need more ways to customise how messages are displayed in the LOG when some events are flagged.
Some of the messages we consider worthy of a warning (some signal a fact that can't even be fixed!) include:
WARNING: The intervals on the axis labeled XXX are not evenly spaced
WARNING: The enhanced date axis procedure has failed
WARNING: Compression was disabled for data set
They give us a hard time when monitoring automated production jobs.
On the contrary, notes like these are flagged as show-stoppers:
NOTE: MERGE statement has more than one data set
NOTE: Library XXX does not exist.
NOTE: SAS went to a new line
NOTE: DATA STEP stopped due...
It would be nice to be able to raise valid alarms and avoid false alarms by setting the type of message for the various messages that SAS produces.
When I make a format, about 99.9% of the time I expect it to be exhaustive of the values that will be encountered. It would be nice to be able to define a format that would throw an error to the log if a value was not found.
proc format;
  value sex
    1='F'
    2='M'
    other=_ERROR_ /*pseudo code*/
  ;
run;

data have;
  input sex;
  cards;
1
1
3
;
run;
When the format is used, it would throw an error to the log. So the PROC step and DATA step below would both throw errors, ideally listing the invalid value:
proc freq data=have;
  tables sex;
  format sex sex.;
run;

data want;
  set have;
  sexC=put(sex,sex.);
run;
I know there are alternatives, e.g.,
data want;
  set have;
  sexC=put(sex,sex.);
  if sexC="_ERROR_" then put "ERROR: invalid value " sex=;
run;
But it would be nice to have a way to say when a format is created, "if any value is outside the domain of values defined in the format, throw an error [or warning or note]."
Using EG is like walking on glass shards, so I will not attempt to list all the deep cuts and all the paper cuts one has to continuously deal with. Still, one is annoying and repetitive enough that I must take the time to raise it here:
When deleting a table, one is sometimes told that the table is locked, as seen above.
The table is not visible in the process tree even though it is opened (without being identified) by one of the programs, and the easiest way is then to go back to the process flow view, search for the table somewhere in the flow, delete its shortcut there, and then one can delete the table proper.
Please provide a prompt instead of this ludicrous "unexpected error".
The TERMSTR option is used to define the record delimiter (End of record marker) on flat files.
This option does not accept hex values, which is very odd as record delimiters normally aren't standard characters.
The syntax TERMSTR='0c'x should be supported.
The example above was chosen because '0c'x is often used by Hadoop files.
As an aside, this whole option could do with a cleanup as in the following example,
data t;
  file "%sysfunc(pathname(work))/t.txt" termstr='0c'x;
  put 'A' / 'B' / 'C' ;
run;

data _null_;
  infile "%sysfunc(pathname(work))/t.txt" termstr='0c';
  input;
  putlog _infile_;
run;
1- The x suffix is ignored in the first data step's file statement and the delimiter is set to the string '0c'.
So the file is 9 bytes long and contains:
2- In the second data step, SAS somehow finds these 2 nonsensical records:
I noticed this for the first time the other day, although it must have been happening for ages. I tried importing a dataset that is an extract from our database, but I ran into an issue (one row of corrupt data threw the whole thing off). I had selected "Replace File if it already exists" and realised that when the import process starts, it deletes the already existing dataset before checking the new data set. This seems backwards to me - shouldn't the new file be checked and verified first, then if everything is Ok, delete the previous SAS dataset? I wanted to see if the corrupt row had been previously imported or if the issue was a new one, and I couldn't do that.
I will now be keeping archived copies of my extracts so I can go back in and check, and will also be renaming my previous SAS dataset to a new name before replacing.
(Tagged with Importing and Data Management as it doesn't really fall under the other tags....)
Currently, one can only quote a string using single or double quotes, ' or " .
Sometimes, other characters may be needed, such as ` for variable names in Cloudera's Hadoop.
Let's go wild: while we are at it, the function could also be able to trim the quoted string.
A=quote(B, '`', 't');
Let's go wilder: we could make the function even more useful and allow symmetrical symbols.
This enables different opening and closing characters, to make it easy to encase a string in brackets or parentheses.
A=quote(B, '<', 's');
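Until QUOTE offers something like this, the proposed behaviour can be approximated with ordinary string functions. The following is only a sketch: the sample value is made up, and doubling an embedded backtick is an assumption modelled on how QUOTE doubles embedded quotes, so check what the target system actually expects.

```sas
data _null_;
  length B $20 A1 A2 $24;
  B = '  my`col  ';   /* made-up sample value with blanks around it */

  /* Backtick-quote the trimmed value; embedded backticks are doubled
     (assumption, by analogy with QUOTE). */
  A1 = cats('`', tranwrd(strip(B), '`', '``'), '`');

  /* Different opening and closing characters: wrap in angle brackets. */
  A2 = cats('<', strip(B), '>');

  put A1= A2=;
run;
```

This works, but it has to be repeated (or wrapped in a macro) wherever it is needed, which is why a built-in argument would be nicer.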
In MySQL, you can use the group_concat function to concatenate across rows with a by group. I think this would be a valuable addition to the functions available in proc sql.
create table EXAMPLE as
select ID, STRING, group_concat(STRING) as ALL_STRINGS
from HAVE /* HAVE is a placeholder table name */
group by ID;
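For comparison, the usual workaround today is a sorted DATA step with BY-group processing. This is a minimal sketch, not a full equivalent: HAVE and its sample rows are placeholders, the 2000-character length of ALL_STRINGS is an assumed maximum, and the result has one row per ID rather than one row per input row.

```sas
/* Placeholder input data */
data HAVE;
  input ID STRING $;
  datalines;
1 a
1 b
2 c
;
run;

proc sort data=HAVE out=SORTED;
  by ID;
run;

data EXAMPLE;
  set SORTED;
  by ID;
  length ALL_STRINGS $2000;   /* assumed maximum length */
  retain ALL_STRINGS;
  /* Start a fresh list at the beginning of each ID group */
  if first.ID then ALL_STRINGS = ' ';
  /* CATX skips blank arguments, so no leading comma appears */
  ALL_STRINGS = catx(',', ALL_STRINGS, STRING);
  /* Keep only the completed list for each ID */
  if last.ID then output;
  keep ID ALL_STRINGS;
run;
```

A group_concat function in proc sql would replace the sort and the whole second DATA step with a single select.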
It would be great if users could edit shell scripts within Enterprise Guide, as they can be used to automatically kick off SAS programs. If I double-click one such file in my code folder, SAS opens an import text data window.
This would impact .sh, .ksh, and other files. Perhaps other community members can weigh in on extensions they might use? I could see people using .txt files for notes they want to store separate from code.
The datetime output is identical to that of the E8601DTw.d format except for the T separator.
"yyyy-mm-dd hh:mm:ss.ffffff" instead of "yyyy-mm-ddThh:mm:ss.ffffff"
A format generating the same output for date (and time?) variables would be useful too as some tools like Impala can only use timestamps, not dates.
I know this can be done using a picture format. Doing so makes the code a lot less portable though.
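For reference, the picture-format workaround mentioned above might look like the sketch below. The format name DTSPACE is made up, and this version only covers whole seconds, not the fractional part.

```sas
proc format;
  /* DTSPACE is a made-up name; single quotes keep the % directives
     away from the macro processor */
  picture dtspace (default=19)
    other = '%Y-%0m-%0d %0H:%0M:%0S' (datatype=datetime);
run;

data _null_;
  now = datetime();
  put now= dtspace.;     /* space between date and time */
  put now= e8601dt19.;   /* same value with the T separator */
run;
```

Every program that uses the format needs this PROC FORMAT step (or access to the catalog it is stored in), which is the portability cost a built-in format would remove.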
If I edit a program entry in EG, I have find, replace, and line/column numbers (and perhaps other features).
If I edit the SAS code for a stored process in EG, I get a crippled editor with none of these features.
I have a stored process that's 1000+ lines long. If I get an error in the log, it would be handy to be able to search the code for the offending line. If I'm trying to get the indentation right on a block of code that doesn't fit in the editor, having the column number would help.
In short, have *consistent* editor functionality in both scenarios (as well as any other place in EG where you can edit code).
Currently VA doesn't allow users to use existing aggregated measures when creating new ones (i.e. it cannot perform an aggregation on an aggregation). The only solution is to perform these calculations prior to uploading the data to LASR, which can be prohibitively resource intensive when dealing with a large number of possible variable combinations.
This feature would give designers much more control when creating new measures within VA and provide workarounds for many complex data requests.
It's very helpful to name your graphic images in a consistent way when creating items in SAS/GRAPH with a name statement. Currently we are limited to 8 characters. It would be great to go to 32 characters as for variable and data set names. I'm old enough to remember when we were limited to 5 characters for variable names on some antiquated systems (i.e. UNIVAC) so I'm pretty good at trimming down names but I'd rather not have to!
I think it is long overdue that SAS adds support for at least the .ods file format.
Enterprise Guide for export and import
SAS Studio the same
PROC EXPORT and PROC IMPORT
LIBNAME ODS (working like libname xlsx)
Since the core of the ods file format is compressed xml, just as in the modern Microsoft Office formats, this should not be very hard to achieve.
It would be nice to have a feature in EG that allows you to skip the current data step/procedure while SAS is running entire projects/programs.
Reason being, a certain step might be taking long to run and the results of which don't affect the data steps that follow, so this feature would allow you to skip the current step/procedure and move on with the project/program.
Currently you would have to stop the entire run process, comment out the irrelevant data step(s) and start the run from the beginning again - especially in cases when datasets are being overwritten constantly, as in these cases you have no choice but to run the entire project again from the start if you decide to stop the run mid-way.
As per this thread, users or organizations housing code and git repositories on a remote server cannot integrate their revision control into EG.
While accessing git via the command-line works well and gives you more power, it would be nice to see committing and viewing commit history built-in to EG.
When selecting cells in a cross-tab, it should be possible to copy the selected cells and paste them elsewhere, e.g. Excel, without having to export the entire table. As it is right now, it is possible for report readers to select cells, thus highlighting the cells in blue, but as far as I know this does not give the user any new options. For me being able to select specific cells gives the reader the impression that it is possible to copy the selected cells, like in so many other programs, only to leave them confused or disappointed.
I am currently on version 7.2
If I assume that my RDBMS supports ISO8601 for date, time, and datetime literals (and if it doesn't, it should), then it would be convenient if SAS did the same.
%let date=%sysfunc(date(),e8601da.);
%let time=%sysfunc(time(),e8601tm.);
%let datetime=%sysfunc(datetime(),e8601dt.);
%put &=date;
%put &=time;
%put &=datetime;
My results as of the time of this post:
27 %put &=date;
DATE=2016-08-16
28 %put &=time;
TIME=10:32:06
29 %put &=datetime;
DATETIME=2016-08-16T10:32:06
I only have SQL Server to test against right now, but these values all "work" for SQL Server:
SELECT CAST('2016-08-16' AS DATE)
      ,CAST('10:32:06' AS TIME)
      ,CAST('2016-08-16T10:32:06' AS DATETIME)
But not for SAS:
data _null_;
  date="&date"d;
  time="&time"t;
  datetime="&datetime"dt;
  put date=date9. time=time. datetime=datetime. ;
run;
If SAS supported the ISO8601 standard, this would make the interchange of date, time, and datetime literals easier when querying data between SAS, RDBMS (implicit pass-through) and RDBMS (explicit pass-through).
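In the meantime, the ISO 8601 strings can be converted with the INPUT function and the E8601 informats instead of date literals. This is a sketch of that workaround on the SAS side only; it does not help where a literal is required.

```sas
%let date=%sysfunc(date(),e8601da.);
%let time=%sysfunc(time(),e8601tm.);
%let datetime=%sysfunc(datetime(),e8601dt.);

data _null_;
  /* Read the ISO 8601 text with the matching informat
     rather than using a d / t / dt literal */
  date     = input("&date", e8601da10.);
  time     = input("&time", e8601tm8.);
  datetime = input("&datetime", e8601dt19.);
  put date=date9. time=time. datetime=datetime.;
run;
```

The same macro variables can then be passed unchanged to the RDBMS in explicit pass-through, but each use in SAS code needs the extra INPUT call, which native ISO 8601 literals would make unnecessary.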
Right now the below code produces CONTENTS output with no variable labels (or, the var names are repeated in the labels, which is about the same). That's a shame because Teradata has a nice field (and even table) commenting feature that seems to me the exact analogue of SAS' variable labels.
It should be possible to have these labels go into & come back out of the server's metadata tables.
libname td teradata &td_goo ;

data with_labels (label = "A nice descriptive label") ;
  x = "Hey" ;
  y = "Ho" ;
  z = "Let's go!" ;
  label
    x = "The first word of the chorus"
    y = "The second word of the chorus"
    z = "The third word of the chorus"
  ;
run ;

data td.labels_gone ;
  set with_labels ;
run ;

proc contents data = td.labels_gone ;
run ;
I was exploring SAS VA and created a crosstab. I found that as soon as a row has no data in any intersection, that row disappears from the crosstab. Our management prefers that all the rows appear in the crosstab whether or not there is data, with either a zero or a missing value. I finally managed to do that by inserting empty rows for every combination, but the file becomes very big. Other software that our office is also exploring (Tableau) has this capability as a simple menu option. So I would suggest that SAS VA incorporate this functionality, so that we can build this kind of crosstab without restructuring the data.
This idea supersedes if not altogether deprecates this previous suggestion:
Basically, we (I, for that matter) need some way to uniquely identify a SAS table (V7). This kind of checksum identification is already available at the operating system level, if not deeper at the storage stack level (modern mass storage systems can sometimes compute file hashes on the fly in order to optimize their cache memory) or even at the filesystem level (see next-gen https://en.wikipedia.org/wiki/Btrfs): think of the md5sum or sha1sum Linux command-line tools, for instance.
The V7 engine writes or recopies data with a slight (random?) variation between source and target. Computing a hash key of the corresponding files is therefore useless: false negatives are likely, with exact copies (SAS-wise) being misjudged as different (system-wise).
Wouldn't it be useful to be able to copy bit-perfect SAS datasets with Proc COPY, exactly like a 'blind' copy command at OS level (cp, copy, TSO COPY etc.)?
This kind of feature, moreover, could be enabled by default with a corresponding System Option (COPYPGM, like SORTPGM for Proc Sort/SyncSort).
Therefore, this could perfectly align SAS tables with filesystem *.sas7bdat members, and possibly even speed up copy creation and duplicates identification; storage could also be optimized somehow.