* proc delete / proc datasets delete / proc sql drop table: none of them delete the underlying HDFS file *

I think there is an issue when using proc sql drop table, proc delete, or proc datasets with Hive tables. My theory is below.

According to the Hive documentation:
If you drop an EXTERNAL TABLE, the Hive engine drops the table metadata and does not delete the HDFS data.
If you drop a MANAGED TABLE, the Hive engine drops the table metadata and deletes the HDFS data.

According to the SAS documentation:
DBCREATE_TABLE_EXTERNAL=YES -> creates an external table, one that is stored outside of the Hive warehouse.
DBCREATE_TABLE_EXTERNAL=NO -> creates a managed table, one that is managed within the Hive warehouse.
Source : http://support.sas.com/documentation/cdl/en/acreldb/69580/HTML/default/viewer.htm#n0k3b8dw0vz3jxn1jjqouodozcqc.htm

By default DBCREATE_TABLE_EXTERNAL is NO, which means SAS creates a managed table, so deleting the table should drop the metadata and also delete the HDFS data. But that is not what happens (at least in my case): with the default option, the SAS procs drop the Hive table structure but leave the underlying HDFS file in place.

It does work when I use SQL pass-through with the "purge" option.

Note : In the libname, I also have hive.warehouse.data.skipTrash set to true, and I also tried setting DBCREATE_TABLE_EXTERNAL=NO in the data step.
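For reference, here is a minimal sketch of the pass-through workaround mentioned above. The server name, port, schema, and table name are placeholders I made up for illustration, and the connection options will likely differ in your environment (for example, you may need user/password or a Kerberos setup).

```sas
/* Hypothetical connection values: replace server, port, schema, */
/* and table name with the ones for your own cluster.            */
proc sql;
   connect to hadoop (server="hive-host.example.com" port=10000 schema=default);

   /* The statement inside EXECUTE is sent to Hive as-is.          */
   /* PURGE asks Hive to bypass the trash when it removes the data */
   /* of a managed table.                                          */
   execute (drop table if exists my_table purge) by hadoop;

   /* Write the pass-through return code and message to the log */
   %put &=sqlxrc &=sqlxmsg;

   disconnect from hadoop;
quit;
```

Because the DROP TABLE statement is executed by Hive itself rather than generated by the SAS procs, this is the path where I see the HDFS data removed.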