BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PatMalarkey
Obsidian | Level 7

All -

I've seen here in Communities that this error happens when using PROC EXPORT and when reading from an Excel file using LIBNAME, but I'm trying to write to Excel. 

 

Note that the Excel file was created and that I can open it.  The problem is that the shl script I use to run the program will send a "Job ended in error" email to the user.

 

Any ideas?  Thanks!

 

LOG Snippet:

 

NOTE: Libref XLOUT was successfully assigned as follows:
Engine: XLSX
Physical Name: /fmacdata/utility/npa/dev/Profiling/Monthly/Output/Profiling_1910_1936.xlsm
3217 /*****************************************************************************/
3218 /* Send the profile data to XL */
3219 /*****************************************************************************/
3220
3221 data xlout.NUM_ALL;
3222 length id_bus_execn $9.;
3223 set fat.NUM_ALL_&ID_PRIOR_EXECN
3224 fat.NUM_ALL_&id_bus_execn
3225 fat.rolling_&benchmark.mos;
3226 run;

ERROR: string is duplicated |><t/>|
NOTE: There were 15876 observations read from the data set FAT.NUM_ALL_1910.
NOTE: There were 16200 observations read from the data set FAT.NUM_ALL_1936.
NOTE: There were 16848 observations read from the data set FAT.ROLLING_12MOS.
NOTE: The data set XLOUT.NUM_ALL has . observations and 17 variables.
NOTE: The export data set has 48924 observations and 17 variables.
NOTE: DATA statement used (Total process time):
real time 3.46 seconds
cpu time 3.39 seconds

1 ACCEPTED SOLUTION

Accepted Solutions
PatMalarkey
Obsidian | Level 7

I stumbled onto what I think is the solution and thought I'd share it - very interesting and worth thinking about if you get the same error.

 

Log snippet (from above):

3221 data xlout.NUM_ALL;
3222 length id_bus_execn $9.;
3223 set fat.NUM_ALL_&ID_PRIOR_EXECN
3224 fat.NUM_ALL_&id_bus_execn
3225 fat.rolling_&benchmark.mos;
3226 run;

 

All three datasets have a column, disclosed_element, that is created by putting a column named variable into a format I created $LLD.  put(variable,$LLD.).  For the green datasets, $LLD. translates variable into a Yes/No flag; for the red dataset (from an earlier iteration of development), $LLD. translates variable into verbiage, for example:  NET SALES PROCEEDS.

 

When I went back and updated the red dataset so that it also had disclosed_element as a Yes/No flag, the problem went away.

 

Not sure what's going on here under the covers and I found the solution by accident.  😕

 

 

View solution in original post

12 REPLIES 12
Tom
Super User Tom
Super User

Is the FAT libref using the XLSX libname engine also?

 

Can you tell which of the three datasets is generating the error?  Try reading them individually instead of all at once.  Try writing them individually to the separate sheets.  Try to isolate which record is causing the issue.  Or which variable?  Or perhaps the individual value?

 

What encoding is the data using?

What encoding is your SAS session using?

 

Can you create a minimal example dataset that replicates the error?

PatMalarkey
Obsidian | Level 7

Hi, Tom!

FAT is a SAS library and the LIBNAME points to a Unix directory - it's not using the XLSX engine.

I will try reading the datasets one at a time and see if that tells me anything.

Not sure how to see what encoding either the data or SAS session are using?

 

Thanks!

 

Tom
Super User Tom
Super User

You can open a ticket with SAS support and they might be able to help.

 

You can try using binary search to see if you can isolate the data point that is causing the issue.

Try each dataset individually.  If one produces the issue then concentrate on that.

 

Divide the data in half and try each half. Concentrate on the that produces the error. Repeat until you have the point.

 

You can divide by observations using FIRSTOBS= and OBS= dataset options.

You can divide by variables using the KEEP= or DROP= dataset options.

PatMalarkey
Obsidian | Level 7
But I'm not using EXPORT. I'm writing to an Excel file that I've allocated with a LIBNAME statement. Unless they're the same under the covers...
SASKiwi
PROC Star

If you are using SAS 9.4M3 or M4 then I'd say its the same problem. You can always check this with SAS Tech Support.

 

Try PROC EXPORT as well and see if you get the same error. If so it will be the same bug.

PatMalarkey
Obsidian | Level 7

I stumbled onto what I think is the solution and thought I'd share it - very interesting and worth thinking about if you get the same error.

 

Log snippet (from above):

3221 data xlout.NUM_ALL;
3222 length id_bus_execn $9.;
3223 set fat.NUM_ALL_&ID_PRIOR_EXECN
3224 fat.NUM_ALL_&id_bus_execn
3225 fat.rolling_&benchmark.mos;
3226 run;

 

All three datasets have a column, disclosed_element, that is created by putting a column named variable into a format I created $LLD.  put(variable,$LLD.).  For the green datasets, $LLD. translates variable into a Yes/No flag; for the red dataset (from an earlier iteration of development), $LLD. translates variable into verbiage, for example:  NET SALES PROCEEDS.

 

When I went back and updated the red dataset so that it also had disclosed_element as a Yes/No flag, the problem went away.

 

Not sure what's going on here under the covers and I found the solution by accident.  😕

 

 

Tom
Super User Tom
Super User

So was the fix to the content of the variable?  What content did it have before instead of Yes/No?  Did you try just writing another file with that content to XLSX engine? Did it cause the same issue?

 

Or was the fix to something about the structure of the dataset?  The length of the variable?  The display format attached to it?

 

One thing to look at is to look at the lengths of that variable in the three input datasets and whether that variable has formats attached to it also.   So when you combine multiple datasets like that SAS will set the type and storage length based on how it first appears.  It will attach the first non-empty format that it sees to the variable.

 

Here are some of the problems that can cause.

If the first instance has a shorter length than the others then the values could get truncated.  

If the first format attached is shorter than the first variables length then the printing of the values could be truncated, even if the stored values are not.  I am not sure if the format attached to a variable impacts how the variable is converted into XML to store into the XLSX file when using the XLSX libname engine, but if if causes the engine to generate invalid XML that might be the source of that error message.

 

PatMalarkey
Obsidian | Level 7

The answers to your questions are, for the most part, in my post.

Tom
Super User Tom
Super User

@PatMalarkey wrote:

The answers to your questions are, for the most part, in my post.


Not really.  All you said is that you change the format used in the PUT() statement you used to populate the value of the variable.  Not much about how the actual variable are defined. We can assume they are character, but we don't know the length or if any formats where attached to them.  We don't really know that values changes.

 

Glad you got it to work, but I am not sure if we really know what it is we need to avoid to prevent it from happening to others.

PatMalarkey
Obsidian | Level 7
So was the fix to the content of the variable?
All three datasets have a column, disclosed_element,All three datasets have a column, disclosed_element, that is created by putting a column named variable into a format I created $LLD. put(variable,$LLD.). For the green datasets, $LLD. translates variable into a Yes/No flag; for the red dataset (from an earlier iteration of development), $LLD. translates variable into verbiage, for example: NET SALES PROCEEDS. So, yes, the fix was to the content of the variable.

What content did it have before instead of Yes/No? for example: NET SALES PROCEEDS

Did you try just writing another file with that content to XLSX engine? Did it cause the same issue? Why do this?

Let's peace out on this one.
Tom
Super User Tom
Super User

Like I said I am glad your problem is gone.  But I have not learned anything about how to avoid the issue.  I cannot recreate the issue from your description.  So I was hoping you might have the time to research it more so that others could benefit from what you discover.

828   libname out clear;
NOTE: Libref OUT has been deassigned.

829   data one;
830     x='NET SALES PROCEEDS';
831   run;

NOTE: The data set WORK.ONE has 1 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


832   data two;
833     x='yes';
834   run;

NOTE: The data set WORK.TWO has 1 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds


835   data three;
836     x='no';
837   run;

NOTE: The data set WORK.THREE has 1 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


838
839   libname out xlsx 'c:\downloads\test_strings.xlsx';
NOTE: Libref OUT was successfully assigned as follows:
      Engine:        XLSX
      Physical Name: c:\downloads\test_strings.xlsx
840
841   data out.sheet1;
842     set one two three;
843   run;

NOTE: There were 1 observations read from the data set WORK.ONE.
NOTE: There were 1 observations read from the data set WORK.TWO.
NOTE: There were 1 observations read from the data set WORK.THREE.
NOTE: The data set OUT.sheet1 has 3 observations and 1 variables.
NOTE: The export data set has 3 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.03 seconds


844
845   data out.sheet2;
846     set three two one;
847   run;

WARNING: Multiple lengths were specified for the variable x by input data set(s). This can cause truncation of data.
NOTE: There were 1 observations read from the data set WORK.THREE.
NOTE: There were 1 observations read from the data set WORK.TWO.
NOTE: There were 1 observations read from the data set WORK.ONE.
NOTE: The data set OUT.sheet2 has 3 observations and 1 variables.
NOTE: The export data set has 3 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.05 seconds
      cpu time            0.03 seconds


848
849   libname out clear;
NOTE: Libref OUT has been deassigned.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1011 views
  • 4 likes
  • 3 in conversation