<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Irregular format of csv files and combing large number of data in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798967#M314104</link>
    <description>&lt;P&gt;You did not show us the&amp;nbsp;&lt;EM&gt;real&lt;/EM&gt; contents of the file. Add the DSD option to the INFILE statement.&lt;/P&gt;</description>
    <pubDate>Sun, 27 Feb 2022 14:19:40 GMT</pubDate>
    <dc:creator>Kurt_Bremser</dc:creator>
    <dc:date>2022-02-27T14:19:40Z</dc:date>
    <item>
      <title>Irregular format of csv files and combing large number of data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798931#M314091</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A blood&amp;nbsp; test was done on 120 patients. The output is in csv format and individually (120 spreadsheets). The output starts with few lines describing the procedure then gives&amp;nbsp; table with values.&amp;nbsp; They are 20 columns. I need only column1 (X) and column4 (Normal). &amp;nbsp;Column1 has numeric&amp;nbsp; values but actually these are the variables and they the same for all patients and in the same order. The real data has more than 200 variables (let’s say &amp;nbsp;200). When keeping only column1 and column4 the individual data is 200x2.&lt;/P&gt;
&lt;P&gt;Column1 describes the peaks in ppm I want to change it &amp;nbsp;to var001 to var200.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The few lines describing procedure are not needed. The final data I need to get should have &amp;nbsp;dimension of&amp;nbsp; 120 (patients) x 200 (variables).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have these spreadsheets on a file and I usually use this code to import data in sas:&lt;/P&gt;
&lt;P&gt;proc import datafile="C:\Users\hp\Desktop\data\T1.002.csv"&lt;BR /&gt;dbms=csv out=out;&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But this is a situation where all data needs to be imported and rearranged it together.&lt;/P&gt;
&lt;P&gt;I generated same structure data for 2 patients.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you so much&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;patient1:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;Peak Pick / Integral Database,,,,,,,,,,,,,,,,,,,&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;Extracted from ANNOTATION on 15-FEB-2022 8:43:16,,,,,,,,,,,,,,,,,,,&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;,,,,,,,,,,,,,,,,,,,&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;Peak Pick / Integral Database,,,,,,,,,,,,,,,,,,,&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;Extracted from ANNOTATION on 15-FEB-2022 8:43:16,,,,,,,,,,,,,,,,,,,&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;,,,,,,,,,,,,,,,,,,,&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;X,X Fold,Class,Normal,Integral,Intensity,Slope,Offset,G/L Mix,Protons,Molarity,X Diam/J,Color,Spin X,Group X,Residue X,Seq1 X,Seq2 X,Seq3 X,Type X&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;17.3[ppm],0,MI,2.74775,0.0001,0,0.000[abn/ppm],0,0,1,2.25,0.017[ppm],2,0,,,0,0,0,&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;15.3[ppm],0,MI,6.5584,0.0002,0,0.000[abn/ppm],0,0,1,3.54,0.0015[ppm],2,0,,,0,0,0,&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;12[ppm],0,MI,44.64217,0.00003,0,0.000[abn/ppm],0,0,1,17.3,0.225[ppm],2,0,,,0,0,0,&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;10.49[ppm],0,MI,8.38014,0.00005,0,0.000[abn/ppm],0,0,1,2.6,0.040[ppm],2,0,,,0,0,0,&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;8.16[ppm],0,MI,8.48936,0.0006,0,0.000[abn/ppm],0,0,1,3.2,0.030[ppm],2,0,,,0,0,0,&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;5.34[ppm],0,MI,11.64487,0.00015,0,0.000[abn/ppm],0,0,1,7.7,0.030[ppm],2,0,,,0,0,0,&lt;/EM&gt;&lt;STRONG&gt;&lt;BR /&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;patient2:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Peak Pick / Integral Database,,,,,,,,,,,,,,,,,,,&lt;BR /&gt;Extracted from ANNOTATION on 15-FEB-2022 15:27:43,,,,,,,,,,,,,,,,,,,&lt;BR /&gt;,,,,,,,,,,,,,,,,,,,&lt;BR /&gt;Peak Pick / Integral Database,,,,,,,,,,,,,,,,,,,&lt;BR /&gt;Extracted from ANNOTATION on 15-FEB-2022 15:27:43,,,,,,,,,,,,,,,,,,,&lt;BR /&gt;,,,,,,,,,,,,,,,,,,,&lt;BR /&gt;X,X Fold,Class,Normal,Integral,Intensity,Slope,Offset,G/L Mix,Protons,Molarity,X Diam/J,Color,Spin X,Group X,Residue X,Seq1 X,Seq2 X,Seq3 X,Type X&lt;BR /&gt;17.3[ppm],0,MI,2.98273,0.0001,0,0.000[abn/ppm],0,0,1,2.25,0.017[ppm],2,0,,,0,0,0,&lt;BR /&gt;15.3[ppm],0,MI,6.73352,0.0002,0,0.000[abn/ppm],0,0,1,3.54,0.0015[ppm],2,0,,,0,0,0,&lt;BR /&gt;12[ppm],0,MI,48.3615,0.00003,0,0.000[abn/ppm],0,0,1,17.3,0.225[ppm],2,0,,,0,0,0,&lt;BR /&gt;10.49[ppm],0,MI,8.39814,0.00005,0,0.000[abn/ppm],0,0,1,2.6,0.040[ppm],2,0,,,0,0,0,&lt;BR /&gt;8.16[ppm],0,MI,8.19936,0.0006,0,0.000[abn/ppm],0,0,1,3.2,0.030[ppm],2,0,,,0,0,0,&lt;BR /&gt;5.34[ppm],0,MI,12.04487,0.00015,0,0.000[abn/ppm],0,0,1,7.7,0.030[ppm],2,0,,,0,0,0,&lt;/P&gt;</description>
      <pubDate>Sun, 27 Feb 2022 08:00:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798931#M314091</guid>
      <dc:creator>Job04</dc:creator>
      <dc:date>2022-02-27T08:00:08Z</dc:date>
    </item>
    <item>
      <title>Re: Irregular format of csv files and combing large number of data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798941#M314092</link>
      <description>&lt;P&gt;First of all, this is by far not a task for PROC IMPORT. Instead you will need to write a DATA step that handles everything in one step.&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Do the filenames have a certain structure, so we can extract a patient id from themß&lt;/LI&gt;
&lt;LI&gt;Does the data always start on the same line, or must we determine that from context?&lt;/LI&gt;
&lt;LI&gt;Is the order of variables consistent, and do we always take columns 1 and 4?&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Sun, 27 Feb 2022 09:02:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798941#M314092</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2022-02-27T09:02:54Z</dc:date>
    </item>
    <item>
      <title>Re: Irregular format of csv files and combing large number of data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798943#M314093</link>
      <description>&lt;P&gt;The name of the spreadsheet is long and consists of:&lt;/P&gt;
&lt;P&gt;location-study's name-patient's number- time of measurement-date of measurement-buffer:&lt;/P&gt;
&lt;P&gt;7-PROKIDN-001-T1-150222_buf12102020&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;7-PROKIDN-001-T1: This part is the same for all patients with exception of patient's number (&lt;/P&gt;
&lt;P&gt;for patient 17:&amp;nbsp;7-PROKIDN-017-T1). Everything else&amp;nbsp;can change because of date of measurement and type of buffer.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Patients are 120 but not in sequence numbering.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Tables has identical structure, all variables are in the same order and only column 1 and 4 are needed .&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 27 Feb 2022 09:26:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798943#M314093</guid>
      <dc:creator>Job04</dc:creator>
      <dc:date>2022-02-27T09:26:38Z</dc:date>
    </item>
    <item>
      <title>Re: Irregular format of csv files and combing large number of data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798948#M314094</link>
      <description>&lt;P&gt;This reads all data into a single dataset:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
length fname $200 pat_id $3;
retain filname;
infile "C:\Users\hp\Desktop\data\7-PROKIDN-*-T1*.csv" filename=fname dlm=",";
input @; /* this is there so that a change in fname can be detected */
if filname ne fname
then do i = 1 to 7; /* skips header when a new file is read */
  input;
end;
filname = fname;
input x :$10. x_fold class $ normal;
pat_id = substr(scan(fname,-1,"\"),11,3); /* extracts patient from filename */
drop x_fold class i filname;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sun, 27 Feb 2022 11:39:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798948#M314094</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2022-02-27T11:39:02Z</dc:date>
    </item>
    <item>
      <title>Re: Irregular format of csv files and combing large number of data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798964#M314101</link>
      <description>&lt;P&gt;This imports&amp;nbsp; all variables and&amp;nbsp; patient's number&amp;nbsp; and not the column "normal".&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Many messages I get of this type:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;NOTE: Invalid data for x_fold in line 8 14-16.&lt;/P&gt;
&lt;P&gt;NOTE: Invalid data for normal in line 8 23-31.&lt;/P&gt;
&lt;P&gt;RULE:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9--&lt;/P&gt;
&lt;P&gt;8&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "8.472[ppm]","0","MI","3.58542","0.00113","0","0.000[abn/ppm]","0","0","1","3.58542","0.024[&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 93&amp;nbsp; ppm]","1","0","","","0","0","0","" 126&lt;/P&gt;
&lt;P&gt;fname=C:\Users\hp\Desktop\data\7-PROKIDN-002-T1_141020_buf121020.csv pat_id=002&lt;/P&gt;
&lt;P&gt;filname=C:\Users\hp\Desktop\data\7-PROKIDN -002-T1_141020_buf121020.csv i=8 x="8.472[ppm x_fold=.&lt;/P&gt;
&lt;P&gt;class="MI" normal=. _ERROR_=1 _N_=1&lt;/P&gt;
&lt;P&gt;NOTE: Invalid data for x_fold in line 9 14-16.&lt;/P&gt;
&lt;P&gt;NOTE: Invalid data for normal in line 9 23-31.&lt;/P&gt;
&lt;P&gt;9&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "8.215[ppm]","0","MI","6.38263","0.00202","0","0.000[abn/ppm]","0","0","1","6.38263","0.050[&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 93&amp;nbsp; ppm]","1","0","","","0","0","0","" 126&lt;/P&gt;
&lt;P&gt;fname=C:\Users\hp\Desktop\data\7-PROKIDN -002-T1_141020_buf121020.csv pat_id=002&lt;/P&gt;
&lt;P&gt;filname=C:\Users\hp\Desktop\data\7-PROKIDN -002-T1_141020_buf121020.csv i=. x="8.215[ppm x_fold=.&lt;/P&gt;
&lt;P&gt;class="MI" normal=. _ERROR_=1 _N_=2&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is it possible to make the variables &lt;SPAN&gt;horizontally&lt;/SPAN&gt;?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 27 Feb 2022 13:31:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798964#M314101</guid>
      <dc:creator>Job04</dc:creator>
      <dc:date>2022-02-27T13:31:34Z</dc:date>
    </item>
    <item>
      <title>Re: Irregular format of csv files and combing large number of data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798965#M314102</link>
      <description>&lt;P&gt;One more thing please.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;the variables in x column appears&amp;nbsp; in quotation ( "6.231[ppm]"). Is it possible to not include the quotation ?&lt;/P&gt;</description>
      <pubDate>Sun, 27 Feb 2022 13:57:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798965#M314102</guid>
      <dc:creator>Job04</dc:creator>
      <dc:date>2022-02-27T13:57:16Z</dc:date>
    </item>
    <item>
      <title>Re: Irregular format of csv files and combing large number of data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798967#M314104</link>
      <description>&lt;P&gt;You did not show us the&amp;nbsp;&lt;EM&gt;real&lt;/EM&gt; contents of the file. Add the DSD option to the INFILE statement.&lt;/P&gt;</description>
      <pubDate>Sun, 27 Feb 2022 14:19:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798967#M314104</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2022-02-27T14:19:40Z</dc:date>
    </item>
    <item>
      <title>Re: Irregular format of csv files and combing large number of data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798968#M314105</link>
      <description>&lt;P&gt;Thank you so much&lt;/P&gt;</description>
      <pubDate>Sun, 27 Feb 2022 14:25:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/798968#M314105</guid>
      <dc:creator>Job04</dc:creator>
      <dc:date>2022-02-27T14:25:40Z</dc:date>
    </item>
    <item>
      <title>Re: Irregular format of csv files and combing large number of data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/799000#M314123</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/324234"&gt;@Job04&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;The name of the spreadsheet is long and consists of:&lt;/P&gt;
&lt;P&gt;location-study's name-patient's number- time of measurement-date of measurement-buffer:&lt;/P&gt;
&lt;P&gt;7-PROKIDN-001-T1-150222_buf12102020&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;An aside to the main topic: &lt;STRONG&gt;thinking&lt;/STRONG&gt; of CSV files as "spreadsheets" can lead to all sorts of problems. CSV are simple text flat-files, spreadsheets are not. To make things potentially even worse, opening CSV in spreadsheet programs will cause the spreadsheet software to "interpret" values from the CSV and can display things that are quite different than the text. And if you SAVE the file will looking at it in a spreadsheet can permanently change values.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You mention one issue related to quotes around values, which &lt;STRONG&gt;do not appear&lt;/STRONG&gt; in your example in the first post. I have to assume the pasted text is from looking at it in the spreadsheet. Dates and times a particularly in danger and some values, such as 5-15 (possibly a code entry or part number) can become a date value.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also, using Proc Import with multiple files of the same structure can result in variables with different names, types and lengths making it difficult to combine data later. Each call to Proc Import can make different choices on examining the file. So different contents can lead to different results even though your desire is to have the variable names the same, lengths of character variables the same and variables as either numeric or character for the same column.&lt;/P&gt;</description>
      <pubDate>Sun, 27 Feb 2022 19:55:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Irregular-format-of-csv-files-and-combing-large-number-of-data/m-p/799000#M314123</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2022-02-27T19:55:35Z</dc:date>
    </item>
  </channel>
</rss>

