My apologies to fellow Clue fans out there for the lame title, but it perfectly describes a recent exchange I had with a long-time friend. My friend was working with a customer who receives data as pipe-delimited ('|') values, and when they dropped these files into their autoload directory, SAS Visual Analytics (VA) did not import them as expected. With a little sleuthing, we found an easy LASR library configuration change that allowed us to specify the delimiter to use when loading the data, and all was well.
For testing purposes, we created a pipe-delimited version of the infamous SASHELP.CLASS data set, named it PIPE.TXT, and dropped it into our autoload directory. When we examined the SAS data set that resulted from the autoload import, it was clear that the pipe character was not being recognized as a delimiter: the data set had just one character variable, and each line of our file had been stored in it as a single value.
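The symptom is easy to reproduce outside of VA. The following is only a minimal Python sketch of the idea, not what autoload runs internally: it parses a few sample rows shaped like SASHELP.CLASS, first assuming TAB and then using the pipe. The sample text and the `parse` helper are made up for illustration.

```python
import csv
import io

# A few sample rows in the shape of SASHELP.CLASS, separated by '|'.
# (Illustrative data only; this stands in for the PIPE.TXT test file.)
PIPE_TEXT = (
    "Name|Sex|Age|Height|Weight\n"
    "Alfred|M|14|69.0|112.5\n"
    "Alice|F|13|56.5|84.0\n"
)


def parse(text, delimiter):
    """Split delimited text into a list of rows using the given delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


as_tab = parse(PIPE_TEXT, "\t")  # wrong assumption: the file is TAB delimited
as_pipe = parse(PIPE_TEXT, "|")  # correct delimiter for this file

print(len(as_tab[0]))   # fields per row under the TAB assumption
print(len(as_pipe[0]))  # fields per row when '|' is used
```

With the TAB assumption, every line comes back as a single field, which matches the one-variable data set we saw after the first autoload run.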
We reviewed the autoload log, found where the file was being imported, and saw that the step expected the file to be TAB delimited.
A little more digging uncovered that when deciding what to do with external files dropped into the autoload directory, great importance is given to the file name extension. The processing code checks for some obvious values and makes assumptions based on them; for a .TXT file, it assumes the values are TAB delimited.
Clearly VA was using TAB as the default delimiter so we went looking for a place we could set that to use a pipe character instead of a TAB.
It turns out that there is an extended attribute on the LASR library associated with each autoload location that allows the administrator to specify the default delimiter. The attribute is VA.AutoLoad.Import.Delimiter.TXT and, as the name implies, it determines which delimiter to assume for a .TXT file processed during autoload. The default setting was TAB, so we simply edited the value to use a pipe and saved our setting.
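Conceptually, the attribute acts as a per-library override of the default assumed for the .TXT extension. The sketch below models that lookup in Python under some loud assumptions: `delimiter_for` and the `library_attrs` dictionary are hypothetical names, the override is stored as a literal character for simplicity, and none of this is the actual autoload code.

```python
import os

# Default assumed for .txt files, per the behavior described above.
DEFAULT_DELIMITERS = {".txt": "\t"}

# Name of the extended attribute on the LASR library.
TXT_ATTRIBUTE = "VA.AutoLoad.Import.Delimiter.TXT"


def delimiter_for(filename, library_attrs):
    """Return the delimiter to assume for a file.

    library_attrs is a hypothetical dict standing in for the extended
    attributes of the LASR library tied to the autoload location.
    """
    ext = os.path.splitext(filename)[1].lower()
    if ext == ".txt":
        override = library_attrs.get(TXT_ATTRIBUTE)
        if override:
            return override
    return DEFAULT_DELIMITERS.get(ext)


# No override set: a .TXT file is assumed to be TAB delimited.
print(repr(delimiter_for("PIPE.TXT", {})))

# Override set to a pipe: the same file is now split on '|'.
print(repr(delimiter_for("PIPE.TXT", {TXT_ATTRIBUTE: "|"})))
```

Because the lookup is keyed on the library, every .TXT file dropped into that autoload location gets the same delimiter.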
We deleted the metadata and files from our first run and the next time autoload executed, we saw the results we wanted.
Because each autoload location is associated with a single LASR library and each LASR library can be configured with only one default delimiter, administrators of large systems with varying data requirements may have to configure separate autoload libraries for data that is delimited with non-standard separators. Fortunately, the SAS Deployment Manager makes the job of creating additional autoload directories quite easy to do.
Knowing a little more about how names of files dropped in the autoload location affect data import and how to set the default delimiter for .TXT files, administrators have the flexibility to make sure users get what they expect from self-service data loading.