Does data need to be ingested through SAS Data Loader in order to use all the features listed here (browsing data, DQ profiling, etc.)?
Or can it just be integrated with an existing Hadoop platform that already has data in HDFS? We already have an ingestion/ETL tool for all our data in and out of Hadoop, so we are not looking for another one.
Short answer: yes, it can work with the data already in your Hadoop platform.
More elaborately: ETL is usually a tool for a centrally managed repository, such as a data warehouse, a place where everything is integrated and can be audited. Big Data, on the other hand, is a freer environment, where you should be able to quickly analyze different kinds of data: ad hoc analysis and perhaps even ad hoc loading of data. This is where I think Data Loader is positioned. It's not positioned as an ETL tool (though technically, it kind of is).
Assuming your Hadoop environment meets the specifications for SAS Data Loader, you don't actually need to ingest data through SAS Data Loader to use its features. SAS Data Loader can work with data that has been put in Hadoop by any tool, as long as the data has been registered in Hive. More specifically, the data layout needs to be described in HCatalog in Hive. If you have data in HDFS that has not been described in Hive, you can use tools like Hue to describe it.
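To make "described in Hive" a bit more concrete, here is a minimal sketch of the kind of statement that registers files already sitting in HDFS as a Hive external table. It is only an illustration: the helper name `hive_external_table_ddl`, the table `sales.orders`, its columns, and the path `/data/raw/orders` are all hypothetical, and it assumes the files are comma-delimited text. The resulting HiveQL could be pasted into the Hue query editor or run through another Hive client.

```python
def hive_external_table_ddl(table, columns, location, delimiter=","):
    """Build a HiveQL statement that registers existing HDFS files
    as an external table (assumes delimited text files)."""
    # One "`name` type" entry per column, e.g. "`order_id` BIGINT"
    cols = ",\n  ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        f"  {cols}\n"
        ")\n"
        "ROW FORMAT DELIMITED\n"
        f"FIELDS TERMINATED BY '{delimiter}'\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )


if __name__ == "__main__":
    # Hypothetical table, columns, and HDFS directory, for illustration only
    print(
        hive_external_table_ddl(
            "sales.orders",
            [("order_id", "BIGINT"), ("customer", "STRING"), ("amount", "DOUBLE")],
            "/data/raw/orders",
        )
    )
```

Because the table is declared EXTERNAL, Hive only records the layout and location. The files stay where your existing ingestion tool put them, and dropping the table later does not delete them.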
Ron