<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Importing an unusual csv file - revisited in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Importing-an-unusual-csv-file-revisited/m-p/504892#M135176</link>
    <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/40779"&gt;@MFLoGrasso&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;So, I have a situation that is similar to &lt;A href="https://communities.sas.com/t5/General-SAS-Programming/How-can-I-import-data-from-this-unusual-csv-file/m-p/170472/highlight/true#M11099" target="_blank"&gt;the problem posted here&lt;/A&gt;.&amp;nbsp; The issue that I have is that whereas the OP has three different files that are already grouped together in the original csv, I have multiple "entries" of the same four files stacked on top of each other, and each time the next group of data for a particular file arises, the n for that particular group may be different.&amp;nbsp; I have attached one data file that is typical of what I am working with.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Within this csv, the first three variables (UNITID, SURVSECT, PART) are always the same.&amp;nbsp; &lt;FONT color="#ff0000"&gt;&lt;STRONG&gt;I need to create a separate file for each unique combination of SURVSECT and PART (in this case, COM A, COM B, COM C, and COM D).&amp;nbsp;&lt;/STRONG&gt; &lt;/FONT&gt;The file is set up to give all&amp;nbsp;four parts&amp;nbsp;for each UNITID before looping back to part A for the next UNITID.&amp;nbsp; I have a sense of how I THINK this needs to be approached, but I can't fully wrap my head around the code I would need to accomplish this task.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;What do you gain by creating separate "files" (you do mean SAS data sets don't you?).&lt;/P&gt;
&lt;P&gt;BY group processing with a single dataset would likely be easier.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;N of what? The records in the data set or a variable in the data set?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It appears that what you need to do is parse the incoming line to see which read pattern you need to follow up with.&lt;/P&gt;
&lt;P&gt;You can do that with&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/187967"&gt;@input&lt;/a&gt; @;&lt;/P&gt;
&lt;P&gt;If _infile_ =:&amp;nbsp; &amp;lt;a string of values from the data that uniquely identifies a read pattern&amp;gt; then&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp; input &amp;lt;the variables of that list&amp;gt;;&lt;/P&gt;
&lt;P&gt;Else if _infile_ =: &amp;lt;another string of values&amp;gt; then&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp; input &amp;lt;the other variables&amp;gt;;&lt;/P&gt;
&lt;P&gt;&amp;lt;repeat as needed&amp;gt;.&lt;/P&gt;
&lt;P&gt;The strings to compare would be easy to copy from the imbedded header lines using a &lt;STRONG&gt;TEXT EDITOR. &lt;/STRONG&gt;Under no circumstances attempt to copy from the csv file using Excel or other spreadsheet.&lt;/P&gt;</description>
    <pubDate>Tue, 16 Oct 2018 20:57:50 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2018-10-16T20:57:50Z</dc:date>
    <item>
      <title>Importing an unusual csv file - revisited</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-an-unusual-csv-file-revisited/m-p/504871#M135167</link>
      <description>&lt;P&gt;So, I have a situation that is similar to &lt;A href="https://communities.sas.com/t5/General-SAS-Programming/How-can-I-import-data-from-this-unusual-csv-file/m-p/170472/highlight/true#M11099" target="_blank"&gt;the problem posted here&lt;/A&gt;.&amp;nbsp; The issue that I have is that whereas the OP has three different files that are already grouped together in the original csv, I have multiple "entries" of the same four files stacked on top of each other, and each time the next group of data for a particular file arises, the n for that particular group may be different.&amp;nbsp; I have attached one data file that is typical of what I am working with.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Within this csv, the first three variables (UNITID, SURVSECT, PART) are always the same.&amp;nbsp; I need to create a separate file for each unique combination of SURVSECT and PART (in this case, COM A, COM B, COM C, and COM D).&amp;nbsp; The file is set up to give all&amp;nbsp;four parts&amp;nbsp;for each UNITID before looping back to part A for the next UNITID.&amp;nbsp; I have a sense of how I THINK this needs to be approached, but I can't fully wrap my head around the code I would need to accomplish this task.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Read the first line as the list of variables for a temp data set (named temp)&lt;/LI&gt;
&lt;LI&gt;Add lines to temp until the next instance of a line beginning with UNITID&lt;/LI&gt;
&lt;LI&gt;Assign the SURVSECT and PART values (which will be the same for every observation in temp) to macro variables, e.g. &amp;amp;surv and &amp;amp;part (%IF _N_ = 1 %THEN %DO; %LET SURV = SURVSECT; %LET PART = PART; %END;)&lt;/LI&gt;
&lt;LI&gt;Append the temp set to a table named &amp;amp;surv_&amp;amp;part (PROC APPEND BASE = &amp;amp;SURV_&amp;amp;PART DATA = TEMP; RUN;)&lt;/LI&gt;
&lt;LI&gt;Delete the temp set (PROC DELETE DATA = TEMP; RUN;)&lt;/LI&gt;
&lt;LI&gt;Repeat steps 1 - 5, beginning with the line that caused step 2 to end, until the end of the csv file&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;I am pretty comfortable that I have steps 3 - 5 set up properly.&amp;nbsp; I figure steps 1 and 2 begin by initializing some counter as 1 (%LET COUNTER = 1;) then bringing in my entire csv starting from the next&amp;nbsp;using PROC IMPORT and&amp;nbsp;DATAROW&amp;nbsp;= &amp;amp;COUNTER + 1, then updating &amp;amp;COUNTER to be the observation number that caused step 2 to end.&amp;nbsp; This seems to be the sticking point, though.&amp;nbsp; Using the attached data, I can't see it being overly difficult to create COM_A by using row 1 as the variable names and rows 2 - 17 as the data.&amp;nbsp; But now I need to create COM_B using the variable names in row 18 and rows 19 - 34 as the data, COM_C using row 35 for names and row 36 for data, COM_D using row 37 for names and rows 38 - 40 for data, etc.&lt;/P&gt;</description>
      <pubDate>Tue, 16 Oct 2018 20:25:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-an-unusual-csv-file-revisited/m-p/504871#M135167</guid>
      <dc:creator>MFLoGrasso</dc:creator>
      <dc:date>2018-10-16T20:25:54Z</dc:date>
    </item>
    <item>
      <title>Re: Importing an unusual csv file - revisited</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-an-unusual-csv-file-revisited/m-p/504892#M135176</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/40779"&gt;@MFLoGrasso&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;So, I have a situation that is similar to &lt;A href="https://communities.sas.com/t5/General-SAS-Programming/How-can-I-import-data-from-this-unusual-csv-file/m-p/170472/highlight/true#M11099" target="_blank"&gt;the problem posted here&lt;/A&gt;.&amp;nbsp; The issue that I have is that whereas the OP has three different files that are already grouped together in the original csv, I have multiple "entries" of the same four files stacked on top of each other, and each time the next group of data for a particular file arises, the n for that particular group may be different.&amp;nbsp; I have attached one data file that is typical of what I am working with.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Within this csv, the first three variables (UNITID, SURVSECT, PART) are always the same.&amp;nbsp; &lt;FONT color="#ff0000"&gt;&lt;STRONG&gt;I need to create a separate file for each unique combination of SURVSECT and PART (in this case, COM A, COM B, COM C, and COM D).&amp;nbsp;&lt;/STRONG&gt; &lt;/FONT&gt;The file is set up to give all&amp;nbsp;four parts&amp;nbsp;for each UNITID before looping back to part A for the next UNITID.&amp;nbsp; I have a sense of how I THINK this needs to be approached, but I can't fully wrap my head around the code I would need to accomplish this task.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;What do you gain by creating separate "files" (you do mean SAS data sets don't you?).&lt;/P&gt;
&lt;P&gt;BY group processing with a single dataset would likely be easier.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;N of what? The records in the data set or a variable in the data set?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It appears that what you need to do is parse the incoming line to see which read pattern you need to follow up with.&lt;/P&gt;
&lt;P&gt;You can do that with&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/187967"&gt;@input&lt;/a&gt; @;&lt;/P&gt;
&lt;P&gt;If _infile_ =:&amp;nbsp; &amp;lt;a string of values from the data that uniquely identifies a read pattern&amp;gt; then&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp; input &amp;lt;the variables of that list&amp;gt;;&lt;/P&gt;
&lt;P&gt;Else if _infile_ =: &amp;lt;another string of values&amp;gt; then&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp; input &amp;lt;the other variables&amp;gt;;&lt;/P&gt;
&lt;P&gt;&amp;lt;repeat as needed&amp;gt;.&lt;/P&gt;
&lt;P&gt;The strings to compare would be easy to copy from the imbedded header lines using a &lt;STRONG&gt;TEXT EDITOR. &lt;/STRONG&gt;Under no circumstances attempt to copy from the csv file using Excel or other spreadsheet.&lt;/P&gt;</description>
      <pubDate>Tue, 16 Oct 2018 20:57:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-an-unusual-csv-file-revisited/m-p/504892#M135176</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2018-10-16T20:57:50Z</dc:date>
    </item>
    <item>
      <title>Re: Importing an unusual csv file - revisited</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-an-unusual-csv-file-revisited/m-p/504917#M135185</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884"&gt;@ballardw&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/40779"&gt;@MFLoGrasso&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;So, I have a situation that is similar to &lt;A href="https://communities.sas.com/t5/General-SAS-Programming/How-can-I-import-data-from-this-unusual-csv-file/m-p/170472/highlight/true#M11099" target="_blank"&gt;the problem posted here&lt;/A&gt;.&amp;nbsp; The issue that I have is that whereas the OP has three different files that are already grouped together in the original csv, I have multiple "entries" of the same four files stacked on top of each other, and each time the next group of data for a particular file arises, the n for that particular group may be different.&amp;nbsp; I have attached one data file that is typical of what I am working with.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Within this csv, the first three variables (UNITID, SURVSECT, PART) are always the same.&amp;nbsp; &lt;FONT color="#ff0000"&gt;&lt;STRONG&gt;I need to create a separate file for each unique combination of SURVSECT and PART (in this case, COM A, COM B, COM C, and COM D).&amp;nbsp;&lt;/STRONG&gt; &lt;/FONT&gt;The file is set up to give all&amp;nbsp;four parts&amp;nbsp;for each UNITID before looping back to part A for the next UNITID.&amp;nbsp; I have a sense of how I THINK this needs to be approached, but I can't fully wrap my head around the code I would need to accomplish this task.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;What do you gain by creating separate "files"&amp;nbsp;&lt;STRONG&gt;&lt;FONT color="#3366FF"&gt;I need to compare each SURVSECT/PART with a series of files created from a different source to verify matches and reconcile any discrepancies between them.&amp;nbsp;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;(you do mean SAS data sets don't you?).&amp;nbsp;&amp;nbsp;&lt;STRONG&gt;&lt;FONT color="#3366FF"&gt;Yes.&amp;nbsp; Typing faster than I'm thinking. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;BY group processing with a single dataset would likely be easier.&amp;nbsp;&lt;STRONG&gt;&lt;FONT color="#3366FF"&gt;I had considered that, but I wasn't sure how I would extract the variable names.&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;N of what? The records in the data set or a variable in the data set?&amp;nbsp;&lt;STRONG&gt;&lt;FONT color="#3366FF"&gt;N is records in&amp;nbsp;a part (see for example, the first part D has 3 observations while the second part D has only 2).&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It appears that what you need to do is parse the incoming line to see which read pattern you need to follow up with.&lt;/P&gt;
&lt;P&gt;You can do that with&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/187967"&gt;@input&lt;/a&gt; @;&lt;/P&gt;
&lt;P&gt;If _infile_ =:&amp;nbsp; &amp;lt;a string of values from the data that uniquely identifies a read pattern&amp;gt; then&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp; input &amp;lt;the variables of that list&amp;gt;;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;&lt;FONT color="#3366FF"&gt;Since the unique combinations are based on the second and third variables of the list, would I need to do something like the following?&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;&lt;FONT color="#3366FF"&gt;IF SCAN(_INFILE_,2,',',M) = 'COM' AND&amp;nbsp;SCAN(_INFILE_,3,',',M) = 'B' THEN&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;&lt;FONT color="#3366FF"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;INPUT&amp;nbsp;UNITID SURVSECT PART MAJORNUM CIPCODE AWLEVEL DISTANCEED;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Else if _infile_ =: &amp;lt;another string of values&amp;gt; then&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp; input &amp;lt;the other variables&amp;gt;;&lt;/P&gt;
&lt;P&gt;&amp;lt;repeat as needed&amp;gt;.&lt;/P&gt;
&lt;P&gt;The strings to compare would be easy to copy from the imbedded header lines using a &lt;STRONG&gt;TEXT EDITOR. &lt;/STRONG&gt;Under no circumstances attempt to copy from the csv file using Excel or other spreadsheet.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 16 Oct 2018 21:51:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-an-unusual-csv-file-revisited/m-p/504917#M135185</guid>
      <dc:creator>MFLoGrasso</dc:creator>
      <dc:date>2018-10-16T21:51:30Z</dc:date>
    </item>
    <item>
      <title>Re: Importing an unusual csv file - revisited</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-an-unusual-csv-file-revisited/m-p/504961#M135205</link>
      <description>&lt;P&gt;The file you attached is very easy to read.&lt;/P&gt;
&lt;PRE&gt;UNITID,SURVSECT,PART,MAJORNUM,CIPCODE,AWLEVEL,CRACE01,CRACE02,CRACE25,CRACE26,CRACE27,CRACE28,CRACE29,CRACE30,CRACE31,CRACE32,CRACE33,CRACE34,CRACE35,CRACE36,CRACE37,CRACE38,CRACE13,CRACE14,CRACE15,CRACE16
188517,COM,A,1,11.1002,02,,,,,,,,,1,,,,,,,,,,1,0
188517,COM,A,1,11.1002,03,0,0,1,0,0,0,0,0,1,0,0,0,3,0,0,1,0,0,5,1
UNITID,SURVSECT,PART,MAJORNUM,CIPCODE,AWLEVEL,DISTANCEED
188517,COM,B,1,11.1002,02,2
188517,COM,B,1,11.1002,03,2
UNITID,SURVSECT,PART,CRACE01,CRACE02,CRACE25,CRACE26,CRACE27,CRACE28,CRACE29,CRACE30,CRACE31,CRACE32,CRACE33,CRACE34,CRACE35,CRACE36,CRACE37,CRACE38,CRACE13,CRACE14,CRACE15,CRACE16
188517,COM,C,,,10,11,,3,,3,9,25,,,12,54,1,7,1,,33,103
UNITID,SURVSECT,PART,CTLEVEL,CRACE15,CRACE16,CRACE24,CRACE17,CRACE41,CRACE42,CRACE43,CRACE44,CRACE45,CRACE46,CRACE47,CRACE23,CRACE48,AGE1,AGE2,AGE3,AGE4,AGE5,AGETOTAL
188517,COM,D,2,13,13,26,0,6,0,0,3,0,14,2,1,26,0,15,5,6,0,26
188517,COM,D,3,24,88,112,0,17,3,3,29,0,54,6,0,112,0,36,64,12,0,112&lt;/PRE&gt;
&lt;P&gt;The number of data rows in a group doesn't matter.&lt;/P&gt;
&lt;P&gt;You can read the first column into a character variable (looks like it is a character variable since it is an ID) and throw away the header rows. Then read the next two column&amp;nbsp;and decide based on the value of the third column what other variables to read from the line and which output dataset to write into.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is the start of a program for you.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data
  partA(keep=UNITID SURVSECT PART MAJORNUM CIPCODE AWLEVEL CRACE01 CRACE02 CRACE25-CRACE38 CRACE13-CRACE16)
  partB(keep=UNITID SURVSECT PART MAJORNUM CIPCODE AWLEVEL DISTANCEED)
  partC(keep=UNITID SURVSECT PART CRACE01 CRACE02 CRACE25-CRACE38 CRACE13-CRACE16)
  partD(keep=UNITID SURVSECT PART CTLEVEL CRACE15 CRACE16 CRACE24 CRACE17 CRACE41-CRACE47 CRACE23 CRACE48 AGE1-AGE5 AGETOTAL)
;
  infile 'filename' dsd truncover ;
  length UNITID $20 SURVSECT $5 PART $1 MAJORNUM 8 ..... ;
  input unitid @;
  if unitid='UNITID' then delete ;
  input SURVSECT PART @ ;
  select (part);
    when ('A') do;
      input MAJORNUM CIPCODE AWLEVEL CRACE01 CRACE02 CRACE25-CRACE38 CRACE13-CRACE16;
      ouput partA;
      end;
    when ('B') do;
      input MAJORNUM CIPCODE AWLEVEL DISTANCEED ;
      ouput partB;
      end;
    when ('C') do;
      input CRACE01 CRACE02 CRACE25-CRACE38 CRACE13-CRACE16;
      ouput partC;
      end;
    when ('D') do;
      input CTLEVEL CRACE15 CRACE16 CRACE24 CRACE17 CRACE41-CRACE47 CRACE23 CRACE48 AGE1-AGE5 AGETOTAL ;
      ouput partD;
      end;
    otherwise put 'Illegal part ' part= / _infile_;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Fill in the details about the variables definitions and any additional "PART"s if needed.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 17 Oct 2018 04:05:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-an-unusual-csv-file-revisited/m-p/504961#M135205</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2018-10-17T04:05:57Z</dc:date>
    </item>
  </channel>
</rss>

