Find out if a CSV file is in PC or UNIX format?

Contributor
Posts: 51

Hi

I'm reading in CSV files on UNIX and using the TERMSTR= option in the FILENAME statement (see SAS(R) 9.2 Companion for UNIX Environments).

filename csvfile "/projx/file1.csv" termstr=CRLF;

data obs_in_ds;
    infile csvfile firstobs=2 dlm=',' dsd missover;
    input projectid :$100.;
run;

TERMSTR=

controls the end-of-line or record delimiters in PC- and UNIX-formatted files. This option enables the sharing of UNIX- and PC-formatted files between the two hosts. The following are values for the TERMSTR= option:

CRLF

Carriage Return Line Feed. This parameter is used to create PC format files.

NL

Newline. This parameter is used to create UNIX format files. NL is the default format.

Use TERMSTR=CRLF when you are writing to a file that you want to read on a PC. If you use this option when creating the file, then you do not need to use TERMSTR=NL when reading the file on the PC.

This works well, and we do not need to convert the files from PC to UNIX or vice versa when reading them in. But to avoid changing the code if we receive files in the other format next time, it would be good if the SAS program could work out which environment the file is from and then use the option or not.

Thanks

Steve

Respected Advisor
Posts: 3,777

Did you consider just running DOS2UNIX on the files?

Otherwise, the step below detects the terminator from the first record. It will not work for TERMSTR=CR.

data _null_;
   /* Read with TERMSTR=NL so that a CR before the LF stays in the record. */
   infile FT52F001 lrecl=1000000 termstr=NL length=l eof=eof;
   /* Read the last byte of the first record. */
   input @l byte $1.;
   select(byte);
      when('0d'x) TERMSTR='CRLF';
      otherwise   TERMSTR='NL';
   end;
   put TERMSTR=;
   call symputX('TERMSTR',termstr);
   eof:
   stop;
run;
%put NOTE: TERMSTR=&termstr;

Super Contributor
Posts: 376

What data _null_ said.

As the saying goes "If the only tool you own is a hammer, everything starts to look like a nail".  Other than as an academic exercise in SAS, why don't you just ensure that your input files are in Unix format?

Google "sed convert dos to unix".  This was the first hit:  HowTo: UNIX / Linux Convert DOS Newlines CR-LF to Unix/Linux Format

Also, if you're FTPing the files from Windows to Unix and vice versa, make sure you transfer text files as TEXT not BINARY.  The FTP protocol will handle converting the line terminators to the correct value for the given operating system.  If you're using an FTP client, add .CSV, .SAS, .LOG, .LST, etc to the list of filename extensions that are considered text files.

Hope this helps,

Scott

Valued Guide
Posts: 2,175

A solution that works with both UNIX and Windows line endings is to extend the DLM= option to include '0D'x.

The normal line ending on UNIX is just '0A'x and on Windows it is '0D0A'x, so the only problem is the extra '0D'x.

Contributor
Posts: 51

Hi Peter

Thanks for this.   So do I need to change the DLM= value each time the file changes or can I use a DLM value which will work with both formats?

Regards

Steve

Valued Guide
Posts: 2,175

For any INFILE statement that refers to files from either environment, you can specify DLM='0D20'x

(assuming space is '20'x).

It seems very unlikely that DLM-type input would ever have a '0D'x that must be treated as data.

On Windows the environment would eat that '0D'x, and on UNIX it would be treated like a space delimiter (at the end of a line this would have no impact).

So on both UNIX and Windows platforms, DLM='0D20'x will achieve what you need.

Give it a try.
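
For the comma-separated file in the original post, the same idea might look like the sketch below. Including the comma ('2C'x in ASCII) in the delimiter list is an assumption on my part, since the suggestion above only shows '0D20'x; the path and variable name are taken from the original post.

```sas
/* Sketch only, assuming an ASCII-based session.                        */
/* Both the comma ('2C'x) and the carriage return ('0D'x) are treated   */
/* as delimiters, so a trailing CR from a Windows file is not read as   */
/* part of the last field. A UNIX file has no CR and is unaffected.     */
data obs_in_ds;
    infile "/projx/file1.csv" firstobs=2 dlm='2C0D'x dsd missover;
    /* The colon modifier stops the read at the next delimiter. */
    input projectid :$100.;
run;
```

With this approach no TERMSTR= option is needed, and the same step can read files from either environment without changes.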

Contributor
Posts: 51

Hi Scott

Being a traditionalist... I did not want to change the files we receive from the client. Sure, I could convert the files, but I would rather keep them in their original state. I thought there may be a way to tell if a file was Windows or UNIX based.

Thanks

Steve

Respected Advisor
Posts: 3,777

slolay wrote:

I thought there may be a way to tell if a file was windows or unix based.

Hello, I showed you the code you need above.  It reads one record and creates a TERMSTR macro variable.
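
To act on the result, the macro variable can be passed to the TERMSTR= option of the INFILE statement. This is a minimal sketch, assuming the detection step has already run against the same file and that &termstr resolved to either CRLF or NL. The FT52F001 fileref used in the detection step is not defined in the thread, so the path here is taken from the original post.

```sas
/* Sketch only: assumes the earlier DATA _NULL_ step has set &termstr. */
data obs_in_ds;
    /* TERMSTR= is supplied at run time, so the code does not change   */
    /* when the file arrives in the other format.                      */
    infile "/projx/file1.csv" termstr=&termstr
           firstobs=2 dlm=',' dsd missover;
    /* The colon modifier stops the read at the next delimiter. */
    input projectid :$100.;
run;
```

The client's file is left untouched; only the way SAS splits it into records changes.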
