DATA Step, Macro, Functions and more

Read CSV file conditionally

Accepted Solution Solved
Reply
Contributor
Posts: 52
Accepted Solution

Read CSV file conditionally

Hi guys;

 

I have the below CSV file(DLM=';'). I want to read this file in sas datasets. The file contains lot of useless records, so we only want to read the records following the file - Detailed Report:;; till total rows returned:5;;

 

also the header will be the first record after the row- Detailed Report:;; .

for below file, the header is Contract_Id;HOSTID;HOSTEMAIL

 

Can anybody help? Thanks in advance;

 

Input file:

This is a test file;created on19062017;Created by Mnjtrana

country-India;;

Report name:Testing;;

email:test@test.com;;

Detailed Report:;;

Contract_Id;HOSTID;HOSTEMAIL

1245;bdzasd;dd@ff.dk

3454;bdiojk;xx@dd.com

3133;bdgegg;df23@dd.com

489;bdui3;u87@ff.com

2435;bdyu9k;fsf2@gh.com

total rows returned:5;;

 

required output


Cheers from India!

Manjeet

Accepted Solutions
Solution
‎06-21-2017 05:51 AM
Super User
Super User
Posts: 7,076

Re: Read CSV file conditionally

[ Edited ]

If there are always the same number of header lines then just skip them using the FIRSTOBS= option on the INFILE statement.

To skip the trailer line you can try just pre-reading the line and checking it.

data want ;
  infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
  input @;
  if _infile_=: 'total rows' then delete;
  length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
  input Contract_Id HOSTID HOSTEMAIL;
run;

If the header part is variable length then just use similar method to skip over it.

data want ;
  infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
  input @;
  if _n_=1 then do while (_infile_^=: 'Contract_Id;') ;
    input / @;
  end;
  if _infile_=: 'total rows' then delete;
  if _infile_=: 'Contract_Id;' then delete;
  length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
  input Contract_Id HOSTID HOSTEMAIL;
run;

 

View solution in original post


All Replies
Super User
Posts: 7,854

Re: Read CSV file conditionally

[ Edited ]

RETAIN a flag variable in the data step, and initialize it to 0.

Do a simple

input@;

This holds the line pointer.

If _infile_ contains "Detailed Report", set the flag to 1 and do

input;

to skip to the next line.

If _infile_ contains "total rows returned", set flag to 0 again.

If flag is set, read the line as intended. Use @1 to start from the beginning of the line.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Super User
Posts: 7,988

Re: Read CSV file conditionally

Sorry, have to say it, that is not a CSV (Comma Separated Varible file) it is a delimited file.  Its not even a real delimited file, you have multiple delimited files in one.  As read is sequential, most techniques will still have to read the data, wether they output it to the dataset or not is a different matter.  Personally I would go back to the author and ask then to fix it, either by providing correctly formatted files for each section, or in a format which can handle multiple datasets such as XML.  Alternatively you can write your own file processor (i.e. you write maintain, validate etc.) following the advice given by @KurtBremser

Contributor
Posts: 52

Re: Read CSV file conditionally

sorry my bad, this is a delimited file and yes it contains multiple type of data.
Changing the content of this file is not possible, already checked with the author. I will try th solution given by @KurtBremser

Cheers from India!

Manjeet
Solution
‎06-21-2017 05:51 AM
Super User
Super User
Posts: 7,076

Re: Read CSV file conditionally

[ Edited ]

If there are always the same number of header lines then just skip them using the FIRSTOBS= option on the INFILE statement.

To skip the trailer line you can try just pre-reading the line and checking it.

data want ;
  infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
  input @;
  if _infile_=: 'total rows' then delete;
  length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
  input Contract_Id HOSTID HOSTEMAIL;
run;

If the header part is variable length then just use similar method to skip over it.

data want ;
  infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
  input @;
  if _n_=1 then do while (_infile_^=: 'Contract_Id;') ;
    input / @;
  end;
  if _infile_=: 'total rows' then delete;
  if _infile_=: 'Contract_Id;' then delete;
  length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
  input Contract_Id HOSTID HOSTEMAIL;
run;

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 151 views
  • 3 likes
  • 4 in conversation