Solved: Re: Read CSV file conditionally

mnjtrana · Posted 06-19-2017 07:37 AM

Hi guys;

I have the below CSV file(DLM=';'). I want to read this file in sas datasets. The file contains lot of useless records, so we only want to read the records following the file - Detailed Report:;; till total rows returned:5;;

also the header will be the first record after the row- Detailed Report:;; .

for below file, the header is Contract_Id;HOSTID;HOSTEMAIL

Can anybody help? Thanks in advance;

Input file:

This is a test file;created on19062017;Created by Mnjtrana

country-India;;

Report name:Testing;;

email:test@test.com;;

Detailed Report:;;

Contract_Id;HOSTID;HOSTEMAIL

1245;bdzasd;dd@ff.dk

3454;bdiojk;xx@dd.com

3133;bdgegg;df23@dd.com

489;bdui3;u87@ff.com

2435;bdyu9k;fsf2@gh.com

total rows returned:5;;

Cheers from India!

Manjeet

Tom · Posted 06-19-2017 08:18 AM

If there are always the same number of header lines then just skip them using the FIRSTOBS= option on the INFILE statement.

To skip the trailer line you can try just pre-reading the line and checking it.

data want ;
  infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
  input @;
  if _infile_=: 'total rows' then delete;
  length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
  input Contract_Id HOSTID HOSTEMAIL;
run;

If the header part is variable length then just use similar method to skip over it.

data want ;
  infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
  input @;
  if _n_=1 then do while (_infile_^=: 'Contract_Id;') ;
    input / @;
  end;
  if _infile_=: 'total rows' then delete;
  if _infile_=: 'Contract_Id;' then delete;
  length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
  input Contract_Id HOSTID HOSTEMAIL;
run;

View solution in original post

Kurt_Bremser · Posted 06-19-2017 07:45 AM

RETAIN a flag variable in the data step, and initialize it to 0.

Do a simple

input@;

This holds the line pointer.

If _infile_ contains "Detailed Report", set the flag to 1 and do

input;

to skip to the next line.

If _infile_ contains "total rows returned", set flag to 0 again.

If flag is set, read the line as intended. Use @1 to start from the beginning of the line.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

RW9 · Posted 06-19-2017 07:54 AM

Sorry, have to say it, that is not a CSV (Comma Separated Varible file) it is a delimited file. Its not even a real delimited file, you have multiple delimited files in one. As read is sequential, most techniques will still have to read the data, wether they output it to the dataset or not is a different matter. Personally I would go back to the author and ask then to fix it, either by providing correctly formatted files for each section, or in a format which can handle multiple datasets such as XML. Alternatively you can write your own file processor (i.e. you write maintain, validate etc.) following the advice given by @Kurt_Bremser

mnjtrana · Posted 06-19-2017 08:08 AM

sorry my bad, this is a delimited file and yes it contains multiple type of data.
Changing the content of this file is not possible, already checked with the author. I will try th solution given by @Kurt_Bremser

Cheers from India!

Manjeet

Tom · Posted 06-19-2017 08:18 AM

If there are always the same number of header lines then just skip them using the FIRSTOBS= option on the INFILE statement.

To skip the trailer line you can try just pre-reading the line and checking it.

data want ;
  infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
  input @;
  if _infile_=: 'total rows' then delete;
  length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
  input Contract_Id HOSTID HOSTEMAIL;
run;

If the header part is variable length then just use similar method to skip over it.

data want ;
  infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
  input @;
  if _n_=1 then do while (_infile_^=: 'Contract_Id;') ;
    input / @;
  end;
  if _infile_=: 'total rows' then delete;
  if _infile_=: 'Contract_Id;' then delete;
  length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
  input Contract_Id HOSTID HOSTEMAIL;
run;

Read CSV file conditionally

Re: Read CSV file conditionally

Re: Read CSV file conditionally

Re: Read CSV file conditionally

Re: Read CSV file conditionally

Re: Read CSV file conditionally

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away