
Viya 3.4: BLOBs in CAS

Started 12-14-2018
Modified 12-17-2018

You're working late, responding to a request for information (RFI). You've just answered 42,893 questions (that's what it feels like, right?) and you look to the next one on the list:

 

RFI Question #42894: Does the data storage solution support binary large objects (BLOBs)?

Short answer: Yes
Long answer: Keep reading...

 

The VARBINARY Data Type

Though "BLOB" is common parlance, the actual CAS data type for large binary data is "VARBINARY." Theoretically, the VARBINARY data type could hold any type of "file" data, e.g. an audio file, an image, a PDF, an archive file, an MP4, whatever. However, in Viya 3.4, you'll only be able to get certain kinds of data into it.

 

Image BLOBs

With the 3.3 release, Viya introduced a set of image processing capabilities, including the loadImages CAS action. This action loads image files (e.g. JPG, PNG, DICOM) from path-based caslibs only, so you'll need to transfer or mount your image files to the CAS controller to get them into CAS.

 

Using the loadImages CAS action looks like this:

 

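proc cas;
   /* Load the images listed in list.txt from the CASUSER caslib */
   image.loadImages /
     caslib="CASUSER"
     path="list.txt"
     pathIsList=TRUE      /* path names a file that lists the images */
     decode=TRUE          /* decode the images as they are loaded */
     casout={caslib="CASUSER", name="imageTable", replication=0, replace=TRUE};
   run;
quit;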

 

The PATH parameter can point to a file that lists the images (as shown) or it can be left blank and CAS will load any image files it finds in the CASLib DataSource location.
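
As a minimal sketch of the second case, the call below omits PATH so that CAS picks up whatever image files it finds in the caslib's data source location. The caslib and the output table name (allImages) are placeholders for illustration, not names from a real deployment.

```sas
proc cas;
   /* No path= given: load every image file found in the caslib location */
   image.loadImages /
     caslib="CASUSER"
     casout={caslib="CASUSER", name="allImages", replace=TRUE};  /* allImages is a hypothetical name */
   run;
quit;
```

A list file gives you control over exactly which images are loaded; omitting PATH is more convenient when the whole directory should come in.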

 

Once loaded, the image blobs are placed into the _image_ field inside the target CAS table:

 

[Screenshot: 1CASImageBlob.png]
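
One way to check this for yourself is to ask CAS for the table's column metadata. The sketch below assumes the imageTable output from the loadImages call shown earlier and uses the table.columnInfo action; the _image_ column should be reported with a binary (VARBINARY) type.

```sas
proc cas;
   /* List column names and types for the loaded image table */
   table.columnInfo /
     table={caslib="CASUSER", name="imageTable"};
   run;
quit;
```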

 

Audio BLOBs

With the 3.4 release, Viya introduced a set of audio processing capabilities. Like the imaging functionality, this audio package contains its own load CAS action, loadAudio. Again, the action works only with path-based caslibs. So, as with image files, you'll need to transfer or mount your files to the CAS controller to get them into CAS.

 

Using the loadAudio CAS action looks similar to the loadImages action:

 

proc cas;
   /* Load the audio files listed in list.txt from the CASUSER caslib */
   audio.loadAudio /
     caslib="CASUSER"
     path="list.txt"
     casout={caslib="CASUSER", name="audioTable", replication=0, replace=TRUE};
   run;
quit;

 

Once loaded, the audio blobs are placed into a VARBINARY field inside the target CAS table, just as with images.

 

Non-Image, Non-Audio BLOBs?

What about other file types, e.g. PDFs, xls, doc, ...? Can we load those?

 

Here, again, the short answer is yes but with some reservations.

 

Again, the longer answer is below.

 

Loaded "Documents" are saved as VARCHAR

In Viya 3.4, the loadTable CAS action can load complex text document formats like PDFs, Word docs, and PPTs when used with the FileType="DOCUMENT" option. (This manifests in the user interface as the Documents Directory Import.)

 

You might think these would come in as BLOBs (VARBINARY fields). However, they are actually converted to text and stored as VARCHAR. This is great for text analytics but if your goal is to keep the document as is, then this will not help you. As any desktop app will tell you, you lose formatting and document metadata when you convert a document to text.

 

[Screenshot: 2SaveAsTextSmall.png]
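
The document load described above can be sketched roughly as follows. This is an illustration under assumptions, not a verified recipe: the folder name (docs) and output table name (docText) are hypothetical, and only the fileType="DOCUMENT" import option comes from this article. Additional import options, for example to filter file extensions or control the conversion, may be required, so check the loadTable documentation for your release.

```sas
proc cas;
   /* Hypothetical names: "docs" is a folder inside the caslib, docText is the output table */
   table.loadTable /
     caslib="CASUSER"
     path="docs"
     importOptions={fileType="DOCUMENT"}   /* further options may be needed; see the docs */
     casout={caslib="CASUSER", name="docText", replace=TRUE};
   run;

   /* Inspect the column types: the extracted text should show up as VARCHAR, not VARBINARY */
   table.columnInfo /
     table={caslib="CASUSER", name="docText"};
   run;
quit;
```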

 

loadImages and loadAudio won't load non-Image and non-Audio files

Also, you can't bring in non-image and non-audio files with the loadImages or loadAudio CAS actions. These actions simply ignore any file types that don't meet their input requirements.

 

No Connector support for BLOBs

In Viya 3.4, no connector supports BLOBs, whether images, audio, or otherwise. So if you want to bring in audio or image files from a database, you'll have to stage them on the CAS controller file system and load them using either the loadImages or loadAudio CAS action.

 

Working with BLOBs in CAS

Only a limited set of CAS actions support VARBINARY columns in Viya 3.4: essentially only the image, audio, and bioMedImage action sets, as well as the aStore action set, and possibly a few more.

 

The majority of CAS actions, procedures, and the DATA step do not support tables with VARBINARY fields and will error in various ways if they encounter them.

 

The Wider Story: Viya 3.4 Support for Binary Data

While CAS offers only limited support for BLOBs, Viya 3.4 as a whole offers considerable functionality for binary data. Taken together, CAS' BLOB capabilities, its document conversion capabilities, and Visual Analytics' embedded content capabilities make Viya 3.4's binary data support robust.

 

Here are just some of the ways Viya 3.4 can utilize binary (file) data, many of which have been mentioned already:

So, while you might not be able to load all BLOB files into CAS, between Visual Analytics and CAS, Viya can meet most business requirements around binary data.

Version history
Last update:
‎12-17-2018 10:53 AM