Hi, I have successfully loaded data into HDFS as a pipe-delimited text file. My only problem is that the file does not get the name I want.
HDFS path: /home/user/data/
The name of the file is:
sasdata-2018-09-28-12-32....dlv
I want it to be:
my_file_info.txt
Every time the code is run, the timestamp part of the name changes, but the sasdata- prefix stays the same. How do I, from SAS, rename the .dlv file to my_file_info.txt while accounting for the changing sasdata- name?
proc sql;
connect to hadoop (server="&serv" port=&portn. user="&user" password="&dbpass"
SCHEMA=lw subprotocol=hive2 DBMAX_TEXT=&mxtext.);
execute(create external table my_file_info(
num_cd string)
row format delimited fields terminated by '|'
LOCATION "/home/user/data/") by hadoop;
disconnect from hadoop;
quit;
libname lw hadoop server="&serv" port=&portn. user="&user" password="&dbpass"
SCHEMA=lw_research subprotocol=hive2 DBMAX_TEXT=&mxtext.
;
proc sql;
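/* Two dots: the first ends the macro variable reference, the second separates libref and table. */
insert into lw.my_file_info select * from &processlib..mytable;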
quit;
Thank you for your help!
I no longer have access to Hadoop, so my help will be limited.
When is the .dlv file created? When you run the INSERT INTO step?
Why do you care what the file name is? This is internal Hadoop sausage-making isn't it? You only access the metadata layer (table my_file_info) don't you?
You are right. A lot of internal sausage-making. If it were up to me, I'd keep it as it is.
Sorry, I can't help further. It is a bit too long ago to remember if I went to look at the actual file names, though I suspect I was never interested. Ask tech support if no one here knows.
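For anyone who finds this later: one possible approach is to shell out from SAS to the hdfs command-line client, list whatever sasdata- file is currently there, and move it to the fixed name. This is only a sketch and has not been run against the poster's cluster. It assumes the SAS session runs on a Unix host with XCMD enabled and the hdfs client on the PATH, that the newest file sorts last in the listing, and that my_file_info.txt does not already exist in the directory (hdfs dfs -mv will not overwrite an existing file). The macro variable names hdfs_dir and target and the fileref dlvlist are made up for the example.

```sas
/* Sketch only: assumes XCMD is enabled and the hdfs client is on the SAS host's PATH. */
%let hdfs_dir = /home/user/data;     /* HDFS path from the question */
%let target   = my_file_info.txt;    /* desired file name (hypothetical macro variable) */

/* List the generated files; the wildcard covers the changing timestamp. */
filename dlvlist pipe "hdfs dfs -ls &hdfs_dir./sasdata-*.dlv";

data _null_;
  infile dlvlist truncover end=last;
  length line $1024 path $512;
  retain path;
  input line $char1024.;

  /* The HDFS path is the last space-delimited token on each listing line. */
  if index(line, 'sasdata-') then path = scan(line, -1, ' ');

  /* After the last line, move the last file seen to the fixed name. */
  if last and not missing(path) then
    call system(catx(' ', 'hdfs dfs -mv', path, "&hdfs_dir./&target"));
run;

filename dlvlist clear;
```

Run it after the insert step. Since the table is external, Hive reads every file under the LOCATION directory, so the renamed file should still be picked up by my_file_info. If more than one sasdata- file is present, only the last one listed is renamed, so adjust to taste.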