<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Reading Parquet file in Sas 9.4m6 locally in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683265#M206938</link>
    <description>I have a locally downloaded copy of a Parquet file on a Linux server. Instead of connecting to the Hadoop cluster, I want to read the local version. Is there any way to do this in SAS 9.4M6? I do have SAS/ACCESS to Hadoop licensed.</description>
    <pubDate>Fri, 11 Sep 2020 15:36:15 GMT</pubDate>
    <dc:creator>vipinj765</dc:creator>
    <dc:date>2020-09-11T15:36:15Z</dc:date>
    <item>
      <title>Reading Parquet file in Sas 9.4m6 locally</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683265#M206938</link>
      <description>I have a locally downloaded copy of a Parquet file on a Linux server. Instead of connecting to the Hadoop cluster, I want to read the local version. Is there any way to do this in SAS 9.4M6? I do have SAS/ACCESS to Hadoop licensed.</description>
      <pubDate>Fri, 11 Sep 2020 15:36:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683265#M206938</guid>
      <dc:creator>vipinj765</dc:creator>
      <dc:date>2020-09-11T15:36:15Z</dc:date>
    </item>
    <item>
      <title>Re: Reading Parquet file in Sas 9.4m6 locally</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683400#M206987</link>
      <description>&lt;P&gt;Parquet is a binary, compressed, columnar data storage format. SAS has no means of reading this format directly; it can only do so via other applications such as Hive or Impala. This is similar to SAS not being able to read a SQL Server file directly: it can only do so by using the SQL Server APIs to communicate with SQL Server.&lt;/P&gt;</description>
      <pubDate>Sat, 12 Sep 2020 03:05:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683400#M206987</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-09-12T03:05:09Z</dc:date>
    </item>
    <item>
      <title>Re: Reading Parquet file in Sas 9.4m6 locally</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683401#M206988</link>
      <description>&lt;P&gt;How did you do the download?&amp;nbsp; Did you download an HDFS file or did you do some kind of export procedure?&amp;nbsp; If you just downloaded an HDFS file, you cannot read it directly with SAS.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However, if the file is small enough to download, why not just read the data using Proc SQL from within SAS and save the data as a SAS dataset?&amp;nbsp; Typically a "Create Table As" (CTAS) style SQL query can be used for this.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For example:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SQL NOPRINT;
  CREATE TABLE MyLib.SAS_Data_from_Hadoop AS
    SELECT * FROM Hdp.Some_Hadoop_Table;
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;You may need to qualify the Select with a Where of course, and you may need to apply Length and Format statements just as you would with any other SAS dataset.&lt;/P&gt;
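&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As a rough sketch of that idea, the query below adds a Where clause and sets lengths and formats inline using Proc SQL column modifiers. The column names (Customer_ID, Customer_Name, Order_Date) and the date cutoff are hypothetical and only for illustration; substitute the columns of your own table.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SQL NOPRINT;
  CREATE TABLE MyLib.SAS_Data_from_Hadoop AS
    SELECT Customer_ID,                               /* hypothetical column */
           Customer_Name LENGTH=50 FORMAT=$50.,       /* hypothetical column */
           Order_Date    FORMAT=DATE9.                /* hypothetical column */
      FROM Hdp.Some_Hadoop_Table
      WHERE Order_Date &gt;= '01JAN2020'D;            /* keep only the rows you need */
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Subsetting in the query keeps the local SAS dataset smaller than a full copy of the table.&lt;/P&gt;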
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The above technique, the CTAS style query, is usually the best way to work with data using SAS if you need a local copy of the data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Jim&lt;/P&gt;</description>
      <pubDate>Sat, 12 Sep 2020 03:18:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683401#M206988</guid>
      <dc:creator>jimbarbour</dc:creator>
      <dc:date>2020-09-12T03:18:56Z</dc:date>
    </item>
    <item>
      <title>Re: Reading Parquet file in Sas 9.4m6 locally</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683571#M207072</link>
      <description>&lt;P&gt;Hi Jim,&lt;/P&gt;&lt;P&gt;Thanks for your quick response.&lt;/P&gt;&lt;P&gt;The file is a processed file: some ETL steps are run and then the .parquet file is created. The .parquet file lands in the landing zone, which is connected to the Linux machine.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In your suggested solution you use HDP as a library, which I think connects to the Hadoop cluster. I don't want to connect to the Hadoop cluster, as I already have the Parquet file on my storage.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks, Vipin&lt;/P&gt;</description>
      <pubDate>Mon, 14 Sep 2020 06:56:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683571#M207072</guid>
      <dc:creator>vipinj765</dc:creator>
      <dc:date>2020-09-14T06:56:08Z</dc:date>
    </item>
    <item>
      <title>Re: Reading Parquet file in Sas 9.4m6 locally</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683574#M207075</link>
      <description>Thanks for your reply!</description>
      <pubDate>Mon, 14 Sep 2020 06:57:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683574#M207075</guid>
      <dc:creator>vipinj765</dc:creator>
      <dc:date>2020-09-14T06:57:53Z</dc:date>
    </item>
    <item>
      <title>Re: Reading Parquet file in Sas 9.4m6 locally</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683575#M207076</link>
      <description>&lt;P&gt;&lt;EM&gt;&amp;gt; I don't want to connect to the Hadoop cluster&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;You have no choice if you use SAS. SAS cannot read that file.&lt;/P&gt;</description>
      <pubDate>Mon, 14 Sep 2020 06:59:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683575#M207076</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-09-14T06:59:48Z</dc:date>
    </item>
    <item>
      <title>Re: Reading Parquet file in Sas 9.4m6 locally</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683619#M207083</link>
      <description>&lt;P&gt;My interpretation is that the OP doesn't have a SAS/ACCESS to Hadoop licence, hence using SAS to make the copy would not be a (simple) option.&lt;/P&gt;
&lt;P&gt;A workaround would be to export to a flat file instead (of course, the benefits of Parquet's efficient storage are then lost):&lt;/P&gt;
&lt;P&gt;&lt;A href="https://stackoverflow.com/questions/39419975/how-to-copy-and-convert-parquet-files-to-csv" target="_blank" rel="noopener"&gt;https://stackoverflow.com/questions/39419975/how-to-copy-and-convert-parquet-files-to-csv&lt;/A&gt;&lt;/P&gt;
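&lt;P&gt;Once the Parquet file has been converted to CSV outside SAS, reading it back in could look roughly like the sketch below. The file path and the output dataset name are hypothetical placeholders.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical path to the CSV produced by the conversion step */
PROC IMPORT DATAFILE="/landing/zone/my_data.csv"
            OUT=Work.My_Data          /* hypothetical output dataset */
            DBMS=CSV
            REPLACE;
  GETNAMES=YES;                       /* first row holds the column names */
  GUESSINGROWS=MAX;                   /* scan all rows before choosing types and lengths */
RUN;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Scanning every row is slower on large files, but it reduces the chance of truncated character values or wrongly guessed column types.&lt;/P&gt;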
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 14 Sep 2020 11:26:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683619#M207083</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2020-09-14T11:26:55Z</dc:date>
    </item>
    <item>
      <title>Re: Reading Parquet file in Sas 9.4m6 locally</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683806#M207154</link>
      <description>&lt;P&gt;In the OP:&amp;nbsp;&lt;EM&gt;I do have SAS/ACCESS to Hadoop licensed&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;Good link though.&lt;/P&gt;</description>
      <pubDate>Tue, 15 Sep 2020 00:01:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/683806#M207154</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-09-15T00:01:22Z</dc:date>
    </item>
    <item>
      <title>Re: Reading Parquet file in Sas 9.4m6 locally</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/684222#M207312</link>
      <description>&lt;P&gt;I jumped to a conclusion, and didn't read properly. Double sins!&lt;/P&gt;
&lt;P&gt;But then it makes no sense to read a local copy when you can access the Hadoop file directly...?&lt;/P&gt;</description>
      <pubDate>Wed, 16 Sep 2020 14:06:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/684222#M207312</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2020-09-16T14:06:32Z</dc:date>
    </item>
    <item>
      <title>Re: Reading Parquet file in Sas 9.4m6 locally</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/684229#M207316</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13674"&gt;@LinusH&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If Hadoop's performance is slow, then it might make sense to have a local copy of some sort.&amp;nbsp; However, HDFS data cannot be read by SAS, so it does not make sense, in my opinion, to just copy the HDFS file to one's local machine.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Some ideas that might make sense:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Use SAS and the Hadoop Libname engine to copy the Hadoop table into a local SAS dataset or table.&amp;nbsp; &lt;STRONG&gt;This is the best option&lt;/STRONG&gt; in terms of performance and convenience with SAS.&lt;/LI&gt;
&lt;LI&gt;Export the HDFS data from Hadoop into a csv or other delimited file and then copy the csv file to one's local machine.&amp;nbsp; I think this makes less sense because one then has to re-import the data into SAS, but this would at least work.&amp;nbsp; Trying to read raw HDFS data locally without a local Hadoop instance will not work at all.&lt;/LI&gt;
&lt;/OL&gt;
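&lt;P&gt;A minimal sketch of option 1, assuming SAS/ACCESS Interface to Hadoop is licensed and the Hadoop JAR and configuration file paths are already set up for the SAS session. The server name, port, schema, credentials and local directory are hypothetical placeholders, not values from this thread.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical connection details; replace with your site's values */
LIBNAME Hdp HADOOP SERVER="hive.example.com" PORT=10000 SCHEMA=default
        USER=myuser PASSWORD="XXXXXXXX";

/* Hypothetical local directory for the SAS copy */
LIBNAME MyLib "/data/sas/local";

/* Copy the Hadoop table into a local SAS dataset */
DATA MyLib.SAS_Data_from_Hadoop;
  SET Hdp.Some_Hadoop_Table;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;After the copy, later steps can read MyLib.SAS_Data_from_Hadoop without going back to the cluster.&lt;/P&gt;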
&lt;P&gt;Jim&lt;/P&gt;</description>
      <pubDate>Wed, 16 Sep 2020 14:17:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-Parquet-file-in-Sas-9-4m6-locally/m-p/684229#M207316</guid>
      <dc:creator>jimbarbour</dc:creator>
      <dc:date>2020-09-16T14:17:04Z</dc:date>
    </item>
  </channel>
</rss>

