<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Cleaning data within DATA/INPUT step to remove new line feed and potentially other characters, in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560890#M156930</link>
    <description>&lt;P&gt;If you are running in a Windows environment and the problem is strictly a line feed you might try adding to your INFILE statement: TERMSTR=CRLF&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;That would mean that a single line feed is not considered the end of a record.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also, you might try a simple $275. informat instead of $CHAR275. If the data is a proper CSV file then usually that will work.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If that doesn't work then you may need to share the code you are using and a few lines of example data that have the problem. Paste the code and the data into separate code boxes opened using the forum's {I} or "running man" icon so the message windows don't reformat the text, which can seriously degrade the ability to read data.&lt;/P&gt;
&lt;P&gt;Your example data and code really only need to have a few variables, not all 200+, as long as the code and data behave the same.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Wed, 22 May 2019 16:45:45 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2019-05-22T16:45:45Z</dc:date>
    <item>
      <title>Cleaning data within DATA/INPUT step to remove new line feed and potentially other characters,</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560887#M156929</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a data step with LENGTH/FORMAT/INFORMAT/INPUT components that imports a CSV file.&lt;/P&gt;&lt;P&gt;It's a big piece of code as it has close to 200 fields and yes, I copied it from the import wizard.&lt;/P&gt;&lt;P&gt;I've been using this for a while and it's been working fine.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, I just needed to add another variable to import, Comment, and now I'm having all kinds of issues.&lt;/P&gt;&lt;P&gt;The Comment variable can and does contain all kinds of characters, and the ones that I think are causing me a headache are new line characters and commas.&lt;/P&gt;&lt;P&gt;What happens when I run my code is that only part of the comment is displayed in the Comment field and then all the following fields get messed up (for example, Comment is supposed to say '$10,000 was paid to the customer' and the CommentDate field that follows should say 3/5/2019 - instead the Comment field is showing $10 and the CommentDate field is showing .).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My question is, is there a way to clean the data coming in somewhere in the INPUT section?&lt;/P&gt;&lt;P&gt;For example, where I have:&lt;/P&gt;&lt;P&gt;INPUT&lt;/P&gt;&lt;P&gt;Comment&amp;nbsp; &amp;nbsp;: $CHAR275.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I could have something like&lt;/P&gt;&lt;P&gt;Comment = TRANWRD(Comment, ".", "")&amp;nbsp; :$CHAR275.&amp;nbsp;&lt;/P&gt;&lt;P&gt;and/or something similar to remove new line characters.&lt;/P&gt;&lt;P&gt;Btw, I tried this and it's not working.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any ideas?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Wed, 22 May 2019 16:30:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560887#M156929</guid>
      <dc:creator>SasDewd</dc:creator>
      <dc:date>2019-05-22T16:30:12Z</dc:date>
    </item>
    <item>
      <title>Re: Cleaning data within DATA/INPUT step to remove new line feed and potentially other characters,</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560890#M156930</link>
      <description>&lt;P&gt;If you are running in a Windows environment and the problem is strictly a line feed you might try adding to your INFILE statement: TERMSTR=CRLF&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;That would mean that a single line feed is not considered the end of a record.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
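&lt;P&gt;As a minimal sketch of that suggestion (the file path and the variable names below are placeholders, not taken from the original program), the option goes on the INFILE statement alongside DSD:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  /* TERMSTR=CRLF: only a CR+LF pair ends a record, so a lone LF stays inside the field */
  infile 'C:\downloads\example.csv' dsd dlm=',' termstr=crlf truncover firstobs=2 lrecl=32767;
  /* id, comment and commentdate are hypothetical variables for illustration */
  input id comment :$275. commentdate :mmddyy10.;
  format commentdate mmddyy10.;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This only helps if every real record in the file ends with CR+LF; if the file uses a bare LF as its record terminator, the step will not split it into records as intended.&lt;/P&gt;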
&lt;P&gt;Also, you might try a simple $275. informat instead of $CHAR275. If the data is a proper CSV file then usually that will work.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If that doesn't work then you may need to share the code you are using and a few lines of example data that have the problem. Paste the code and the data into separate code boxes opened using the forum's {I} or "running man" icon so the message windows don't reformat the text, which can seriously degrade the ability to read data.&lt;/P&gt;
&lt;P&gt;Your example data and code really only need to have a few variables, not all 200+, as long as the code and data behave the same.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 22 May 2019 16:45:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560890#M156930</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2019-05-22T16:45:45Z</dc:date>
    </item>
    <item>
      <title>Re: Cleaning data within DATA/INPUT step to remove new line feed and potentially other characters,</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560894#M156932</link>
      <description>&lt;P&gt;Thank you for your reply&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884"&gt;@ballardw&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried adding&amp;nbsp;&lt;SPAN&gt;TERMSTR=CRLF but that resulted in no records returned at all&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;So I have&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;INFILE "&amp;amp;path/&amp;amp;&amp;amp;File&amp;amp;i"&lt;BR /&gt;DELIMITER = ','&lt;BR /&gt;DSD&lt;BR /&gt;LRECL=32767&lt;BR /&gt;FIRSTOBS=2;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;and I changed it to&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;INFILE "&amp;amp;path/&amp;amp;&amp;amp;File&amp;amp;i"&lt;BR /&gt;DELIMITER = ','&lt;BR /&gt;DSD &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;TERMSTR=CRLF&lt;BR /&gt;LRECL=32767&lt;BR /&gt;FIRSTOBS=2;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;but, like I said, I get back an empty data set.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I also forgot to mention, and I think it's important, that some comments have multiple commas (I believe having one comma is OK since there is the DSD option, but having another one would throw it off?).&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;So, I think it's both multiple commas and new line feeds (which there could also be multiple of) that's causing the issue.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Here's a sample of my code (just the variable that's causing the issue)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;DATA WORK.DAILY_DATA_FEED;&lt;BR /&gt;LENGTH&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Comment $ 275&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;...&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;FORMAT&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Comment 
$CHAR275.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;...&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;INFORMAT&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Comment $CHAR275.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;....&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;INFILE "&amp;amp;path/&amp;amp;&amp;amp;File&amp;amp;i"&lt;BR /&gt;DELIMITER = ','&lt;BR /&gt;DSD &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;TERMSTR=CRLF&lt;BR /&gt;LRECL=32767&lt;BR /&gt;FIRSTOBS=2;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;INPUT&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Comment : $CHAR275.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;One of the comments that is causing an issue is:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;'$10,034 paid to customer, $12000 max&lt;BR /&gt;Thank you. John M.'&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;As you can see there is a new line feed after max.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Thanks!&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 22 May 2019 17:01:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560894#M156932</guid>
      <dc:creator>SasDewd</dc:creator>
      <dc:date>2019-05-22T17:01:38Z</dc:date>
    </item>
    <item>
      <title>Re: Cleaning data within DATA/INPUT step to remove new line feed and potentially other characters,</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560903#M156937</link>
      <description>&lt;P&gt;If possible, please attach a TXT file with a few examples (don't use a CSV extension, as spreadsheets opening the file can corrupt the download). I'm afraid anything pasted&amp;nbsp;on the forum&amp;nbsp;will not be the same as your actual file, i.e. the line feeds and/or carriage returns may differ.&lt;/P&gt;</description>
      <pubDate>Wed, 22 May 2019 17:26:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560903#M156937</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2019-05-22T17:26:30Z</dc:date>
    </item>
    <item>
      <title>Re: Cleaning data within DATA/INPUT step to remove new line feed and potentially other characters,</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560916#M156943</link>
      <description>&lt;P&gt;First thing is HOW are you generating the file?&amp;nbsp; If you are writing it with SAS then the commas will NOT be a problem because the values that contain commas will have quotes added around them.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then the problem is just embedded line-breaks.&amp;nbsp; Again if you are careful in how you create the file then those can also be read.&amp;nbsp; The trick is to make sure your actual data never contains CR ('0D'x) followed immediately by LF ('0A'X) and that all of your lines do end with CR LF.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For example:&lt;/P&gt;
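&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Temporary file to hold the CSV (FILENAME, not FILE, outside a DATA step) */
filename mycsv temp;

/* Write one record whose comment contains a comma, a lone CR and a lone LF */
data _null_;
  x=1;
  y=2;
  comment='This has commas, and CR ' || '0d'x || 'and LF'|| '0A'x ||'but not both';
  file mycsv termstr=crlf dsd dlm=',';
  put x y comment;
run;

/* Read it back: only CR+LF ends a record, so the lone CR and LF stay in the value */
data want;
  infile mycsv termstr=crlf dsd truncover ;
  input x y comment :$200. ;
run;&lt;/CODE&gt;&lt;/PRE&gt;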
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="image.png" style="width: 472px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/29696i9AD24680C61AA3D3/image-size/large?v=v2&amp;amp;px=999" role="button" title="image.png" alt="image.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;PRE&gt;33    data _null_;
34     infile mycsv termstr=crlf;
35     input;
36     list;
37    run;

NOTE: The infile MYCSV is:
      Filename=C:\downloads\mycsv.dat,
      RECFM=V,LRECL=32767,File Size (bytes)=52,
      Last Modified=22May2019:13:51:09,
      Create Time=22May2019:13:51:09

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+--

1   CHAR  1,2,"This has commas, and CR .and LF.but not both" 50
    ZONE  32322566726672666667226662452066624406772667266762
    NUMR  1C2C24893081303FDD13C01E40320D1E40C6A2540EF402F482
NOTE: 1 record was read from the infile MYCSV.
      The minimum record length was 50.
      The maximum record length was 50.
&lt;/PRE&gt;</description>
      <pubDate>Wed, 22 May 2019 17:54:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560916#M156943</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2019-05-22T17:54:10Z</dc:date>
    </item>
    <item>
      <title>Re: Cleaning data within DATA/INPUT step to remove new line feed and potentially other characters,</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560926#M156947</link>
      <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884"&gt;@ballardw&lt;/a&gt;&amp;nbsp;!&lt;/P&gt;&lt;P&gt;I'll need to do some work and clean the file before I upload it here as most of it is our customers' data.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Let me see if I can put something together.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom&lt;/a&gt;!&lt;/P&gt;&lt;P&gt;The CSV file is generated by a separate entity and I have no control over it.&lt;/P&gt;&lt;P&gt;It would have been much easier to just get rid of all these characters and new lines before creating the file but, again, it is just dropped in a folder every morning and all I do is pick it up from there and run it through the DATA step I posted above.&lt;/P&gt;&lt;P&gt;It could include every possible character as it comes from a comment section.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks again for your answers!&lt;/P&gt;</description>
      <pubDate>Wed, 22 May 2019 18:23:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560926#M156947</guid>
      <dc:creator>SasDewd</dc:creator>
      <dc:date>2019-05-22T18:23:03Z</dc:date>
    </item>
    <item>
      <title>Re: Cleaning data within DATA/INPUT step to remove new line feed and potentially other characters,</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560943#M156951</link>
      <description>&lt;P&gt;Tell whoever is generating the files to follow the standard.&lt;/P&gt;
&lt;P&gt;&lt;A href="https://tools.ietf.org/html/rfc4180" target="_blank"&gt;https://tools.ietf.org/html/rfc4180&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But also tell them that you would prefer that they replace any actual CRLF values in the data with just a single CR value instead.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then you will get files that SAS can read.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
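&lt;P&gt;If the producer cannot change the file, one common workaround is to pre-process it before the real DATA step. The sketch below assumes that values containing line breaks are enclosed in double quotes, as RFC 4180 requires; the two file paths are placeholders.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;filename raw   'C:\downloads\raw.csv';    /* placeholder: file as delivered */
filename clean 'C:\downloads\clean.csv';  /* placeholder: repaired copy     */

data _null_;
  /* RECFM=N treats the file as a stream of bytes with no record boundaries */
  infile raw recfm=n;
  file clean recfm=n;
  retain inquote 0;
  input ch $char1.;
  /* every double quote toggles the "inside a quoted value" flag */
  if ch = '"' then inquote = not inquote;
  /* CR or LF inside quotes becomes a blank; every other byte is copied as is */
  if inquote and ch in ('0D'x, '0A'x) then put ' ';
  else put ch $char1.;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The repaired copy can then be read with the usual INFILE options (DSD, DLM=',', FIRSTOBS=2).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;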
&lt;P&gt;If you really do get embedded CRLF in the middle of the files, there are a number of examples of how to fix those types of files on this forum.&lt;/P&gt;</description>
      <pubDate>Wed, 22 May 2019 19:04:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560943#M156951</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2019-05-22T19:04:54Z</dc:date>
    </item>
    <item>
      <title>Re: Cleaning data within DATA/INPUT step to remove new line feed and potentially other characters,</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560964#M156959</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884"&gt;@ballardw&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I just created a small CSV file with only 5-6 columns (comment field included) and 2 records, both of which are giving me trouble.&lt;/P&gt;&lt;P&gt;I was going to send it to you, but before I did I ran it through the slimmed-down code (all the other columns removed) with the TERMSTR=CRLF option added... and it worked fine!&lt;/P&gt;&lt;P&gt;So, if I add the TERMSTR option to the original code and run it against all the records and all the columns, I get no records returned, but when I run it against the newly created sample file with only a couple of rows and a few columns it works just fine.&lt;/P&gt;&lt;P&gt;I get no errors and no warnings, but for some reason it just doesn't return any records when I use&amp;nbsp;TERMSTR=CRLF on the full file...&lt;/P&gt;</description>
      <pubDate>Wed, 22 May 2019 19:58:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Cleaning-data-within-DATA-INPUT-step-to-remove-new-line-feed-and/m-p/560964#M156959</guid>
      <dc:creator>SasDewd</dc:creator>
      <dc:date>2019-05-22T19:58:40Z</dc:date>
    </item>
  </channel>
</rss>

