<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Parsing of a fix length File in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Parsing-of-a-fix-length-File/m-p/30492#M5826</link>
    <description>Hi All.&lt;BR /&gt;
&lt;BR /&gt;
I want to create a dataset by parsing a fix length file and where each dataline contains the value for specific variables and not for all always.&lt;BR /&gt;
&lt;BR /&gt;
e.g.&lt;BR /&gt;
The target dataset column structure is:&lt;BR /&gt;
&lt;BR /&gt;
Row# Type Name1 Name2 Name3 Amount1 Amount2 Amount3&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
The corresponding datalines in the file appears as: &lt;BR /&gt;
&lt;BR /&gt;
A,John,1000&lt;BR /&gt;
B,Mac,1200&lt;BR /&gt;
C,Amar,1100&lt;BR /&gt;
B,Jack,1300&lt;BR /&gt;
B,Maria,1200&lt;BR /&gt;
A,Meg,1400&lt;BR /&gt;
C,Rohan,100&lt;BR /&gt;
.&lt;BR /&gt;
.&lt;BR /&gt;
.&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
Rules:&lt;BR /&gt;
* Each line contains one record for a dataset.&lt;BR /&gt;
* Each line contains the data for only one Name variable and one Amount variable at the fix position.&lt;BR /&gt;
* Based on the value of first variable(Type), the next values should get assigned to the corresponding variables as follow -&lt;BR /&gt;
&lt;BR /&gt;
Type A --&amp;gt; Name1, Amout1 (rest all the variables in the dataset will be null)&lt;BR /&gt;
Type B --&amp;gt; Name2, Amout2 (rest all the variables in the dataset will be null)&lt;BR /&gt;
Type C --&amp;gt; Name3, Amout3 (rest all the variables in the dataset will be null)&lt;BR /&gt;
&lt;BR /&gt;
* The datalines are shuffled and not in sequence of A,B,c compulsarily.&lt;BR /&gt;
* The no. of dataliens are in millions.&lt;BR /&gt;
&lt;BR /&gt;
The final dataset must look like - &lt;BR /&gt;
&lt;BR /&gt;
Dataset:&lt;BR /&gt;
&lt;BR /&gt;
Row#	Type	Name1	Name2	Name3	Amount1	Amount2	Amount3&lt;BR /&gt;
1	A	John	&lt;NULL&gt;	&lt;NULL&gt;	1000	&lt;NULL&gt;	&lt;NULL&gt;&lt;BR /&gt;
2	B	&lt;NULL&gt;	Mac	&lt;NULL&gt;	&lt;NULL&gt;	1200	&lt;NULL&gt;&lt;BR /&gt;
3	C	&lt;NULL&gt;	&lt;NULL&gt;	Amar	&lt;NULL&gt;	&lt;NULL&gt;	1100&lt;BR /&gt;
4	B	&lt;NULL&gt;	Jack	&lt;NULL&gt;	&lt;NULL&gt;	1300	&lt;NULL&gt;&lt;BR /&gt;
5	B	&lt;NULL&gt;	Maria	&lt;NULL&gt;	&lt;NULL&gt;	1200	&lt;NULL&gt;&lt;BR /&gt;
6	A	Meg	&lt;NULL&gt;	&lt;NULL&gt;	1400	&lt;NULL&gt;	&lt;NULL&gt;&lt;BR /&gt;
7	C	&lt;NULL&gt;	&lt;NULL&gt;	Rohan	&lt;NULL&gt;	&lt;NULL&gt;	1000&lt;BR /&gt;
.&lt;BR /&gt;
.&lt;BR /&gt;
.&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
Solution:&lt;BR /&gt;
In order to achieve above requirement, I have used the trailing "@" feature in the following code which is working quite ok.&lt;BR /&gt;
&lt;BR /&gt;
   Data &lt;LIBNAME&gt;.myDataset;&lt;BR /&gt;
      Infile &lt;FILENAME&gt;;&lt;BR /&gt;
      Input  Type $1 @;&lt;BR /&gt;
         If Type = "A" then&lt;BR /&gt;
            Input Name1 $ Amount1;&lt;BR /&gt;
         If Type = "B" then&lt;BR /&gt;
            Input Name2 $ Amount2;&lt;BR /&gt;
         If Type = "C" then&lt;BR /&gt;
            Input Name3 $ Amount3;&lt;BR /&gt;
   run;&lt;BR /&gt;
&lt;BR /&gt;
Query:&lt;BR /&gt;
Just want to confirm it it is the only option to deal with such scenario. Considering the processing of million records, is there any other better alternative to achieve the above task. &lt;BR /&gt;
&lt;BR /&gt;
Any help regarding this will be highly appreciated.&lt;BR /&gt;
&lt;BR /&gt;
regards&lt;BR /&gt;
Kapil Agrawal&lt;/FILENAME&gt;&lt;/LIBNAME&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;</description>
    <pubDate>Thu, 10 Jul 2008 14:37:58 GMT</pubDate>
    <dc:creator>deleted_user</dc:creator>
    <dc:date>2008-07-10T14:37:58Z</dc:date>
    <item>
      <title>Parsing of a fix length File</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Parsing-of-a-fix-length-File/m-p/30492#M5826</link>
      <description>Hi All.&lt;BR /&gt;
&lt;BR /&gt;
I want to create a dataset by parsing a fix length file and where each dataline contains the value for specific variables and not for all always.&lt;BR /&gt;
&lt;BR /&gt;
e.g.&lt;BR /&gt;
The target dataset column structure is:&lt;BR /&gt;
&lt;BR /&gt;
Row# Type Name1 Name2 Name3 Amount1 Amount2 Amount3&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
The corresponding datalines in the file appears as: &lt;BR /&gt;
&lt;BR /&gt;
A,John,1000&lt;BR /&gt;
B,Mac,1200&lt;BR /&gt;
C,Amar,1100&lt;BR /&gt;
B,Jack,1300&lt;BR /&gt;
B,Maria,1200&lt;BR /&gt;
A,Meg,1400&lt;BR /&gt;
C,Rohan,100&lt;BR /&gt;
.&lt;BR /&gt;
.&lt;BR /&gt;
.&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
Rules:&lt;BR /&gt;
* Each line contains one record for a dataset.&lt;BR /&gt;
* Each line contains the data for only one Name variable and one Amount variable at the fix position.&lt;BR /&gt;
* Based on the value of first variable(Type), the next values should get assigned to the corresponding variables as follow -&lt;BR /&gt;
&lt;BR /&gt;
Type A --&amp;gt; Name1, Amout1 (rest all the variables in the dataset will be null)&lt;BR /&gt;
Type B --&amp;gt; Name2, Amout2 (rest all the variables in the dataset will be null)&lt;BR /&gt;
Type C --&amp;gt; Name3, Amout3 (rest all the variables in the dataset will be null)&lt;BR /&gt;
&lt;BR /&gt;
* The datalines are shuffled and not in sequence of A,B,c compulsarily.&lt;BR /&gt;
* The no. of dataliens are in millions.&lt;BR /&gt;
&lt;BR /&gt;
The final dataset must look like - &lt;BR /&gt;
&lt;BR /&gt;
Dataset:&lt;BR /&gt;
&lt;BR /&gt;
Row#	Type	Name1	Name2	Name3	Amount1	Amount2	Amount3&lt;BR /&gt;
1	A	John	&lt;NULL&gt;	&lt;NULL&gt;	1000	&lt;NULL&gt;	&lt;NULL&gt;&lt;BR /&gt;
2	B	&lt;NULL&gt;	Mac	&lt;NULL&gt;	&lt;NULL&gt;	1200	&lt;NULL&gt;&lt;BR /&gt;
3	C	&lt;NULL&gt;	&lt;NULL&gt;	Amar	&lt;NULL&gt;	&lt;NULL&gt;	1100&lt;BR /&gt;
4	B	&lt;NULL&gt;	Jack	&lt;NULL&gt;	&lt;NULL&gt;	1300	&lt;NULL&gt;&lt;BR /&gt;
5	B	&lt;NULL&gt;	Maria	&lt;NULL&gt;	&lt;NULL&gt;	1200	&lt;NULL&gt;&lt;BR /&gt;
6	A	Meg	&lt;NULL&gt;	&lt;NULL&gt;	1400	&lt;NULL&gt;	&lt;NULL&gt;&lt;BR /&gt;
7	C	&lt;NULL&gt;	&lt;NULL&gt;	Rohan	&lt;NULL&gt;	&lt;NULL&gt;	1000&lt;BR /&gt;
.&lt;BR /&gt;
.&lt;BR /&gt;
.&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
Solution:&lt;BR /&gt;
In order to achieve above requirement, I have used the trailing "@" feature in the following code which is working quite ok.&lt;BR /&gt;
&lt;BR /&gt;
   Data &lt;LIBNAME&gt;.myDataset;&lt;BR /&gt;
      Infile &lt;FILENAME&gt;;&lt;BR /&gt;
      Input  Type $1 @;&lt;BR /&gt;
         If Type = "A" then&lt;BR /&gt;
            Input Name1 $ Amount1;&lt;BR /&gt;
         If Type = "B" then&lt;BR /&gt;
            Input Name2 $ Amount2;&lt;BR /&gt;
         If Type = "C" then&lt;BR /&gt;
            Input Name3 $ Amount3;&lt;BR /&gt;
   run;&lt;BR /&gt;
&lt;BR /&gt;
Query:&lt;BR /&gt;
Just want to confirm it it is the only option to deal with such scenario. Considering the processing of million records, is there any other better alternative to achieve the above task. &lt;BR /&gt;
&lt;BR /&gt;
Any help regarding this will be highly appreciated.&lt;BR /&gt;
&lt;BR /&gt;
regards&lt;BR /&gt;
Kapil Agrawal&lt;/FILENAME&gt;&lt;/LIBNAME&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;&lt;/NULL&gt;</description>
      <pubDate>Thu, 10 Jul 2008 14:37:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Parsing-of-a-fix-length-File/m-p/30492#M5826</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-07-10T14:37:58Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing of a fix length File</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Parsing-of-a-fix-length-File/m-p/30493#M5827</link>
      <description>Hi Kapil.&lt;BR /&gt;
This chaning-style row structure can only be imported in the way you discribe. I can just add a little change to your program, saving a few seconds on running time : change the IF to a SELECT statement.&lt;BR /&gt;
[pre]&lt;BR /&gt;
Data &lt;LIBNAME&gt;.myDataset;&lt;BR /&gt;
Infile &lt;FILENAME&gt;;&lt;BR /&gt;
Input Type $1 @;&lt;BR /&gt;
SELECT (type) ;&lt;BR /&gt;
WHEN("A") Input Name1 $ Amount1;&lt;BR /&gt;
WHEN("B") Input Name2 $ Amount2;&lt;BR /&gt;
WHEN("C") Input Name3 $ Amount3;&lt;BR /&gt;
OTHERWISE INPUT ;&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
Regards.&lt;BR /&gt;
Olivier&lt;/FILENAME&gt;&lt;/LIBNAME&gt;</description>
      <pubDate>Fri, 11 Jul 2008 08:09:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Parsing-of-a-fix-length-File/m-p/30493#M5827</guid>
      <dc:creator>Olivier</dc:creator>
      <dc:date>2008-07-11T08:09:45Z</dc:date>
    </item>
  </channel>
</rss>

