BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mnjtrana
Pyrite | Level 9

Hi guys;

 

I have the below CSV file(DLM=';'). I want to read this file in sas datasets. The file contains lot of useless records, so we only want to read the records following the file - Detailed Report:;; till total rows returned:5;;

 

also the header will be the first record after the row- Detailed Report:;; .

for below file, the header is Contract_Id;HOSTID;HOSTEMAIL

 

Can anybody help? Thanks in advance;

 

Input file:

This is a test file;created on19062017;Created by Mnjtrana

country-India;;

Report name:Testing;;

email:test@test.com;;

Detailed Report:;;

Contract_Id;HOSTID;HOSTEMAIL

1245;bdzasd;dd@ff.dk

3454;bdiojk;xx@dd.com

3133;bdgegg;df23@dd.com

489;bdui3;u87@ff.com

2435;bdyu9k;fsf2@gh.com

total rows returned:5;;

 

required output


Cheers from India!

Manjeet
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

If there are always the same number of header lines then just skip them using the FIRSTOBS= option on the INFILE statement.

To skip the trailer line you can try just pre-reading the line and checking it.

data want ;
  infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
  input @;
  if _infile_=: 'total rows' then delete;
  length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
  input Contract_Id HOSTID HOSTEMAIL;
run;

If the header part is variable length then just use similar method to skip over it.

data want ;
  infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
  input @;
  if _n_=1 then do while (_infile_^=: 'Contract_Id;') ;
    input / @;
  end;
  if _infile_=: 'total rows' then delete;
  if _infile_=: 'Contract_Id;' then delete;
  length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
  input Contract_Id HOSTID HOSTEMAIL;
run;

 

View solution in original post

4 REPLIES 4
Kurt_Bremser
Super User

RETAIN a flag variable in the data step, and initialize it to 0.

Do a simple

input@;

This holds the line pointer.

If _infile_ contains "Detailed Report", set the flag to 1 and do

input;

to skip to the next line.

If _infile_ contains "total rows returned", set flag to 0 again.

If flag is set, read the line as intended. Use @1 to start from the beginning of the line.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Sorry, have to say it, that is not a CSV (Comma Separated Varible file) it is a delimited file.  Its not even a real delimited file, you have multiple delimited files in one.  As read is sequential, most techniques will still have to read the data, wether they output it to the dataset or not is a different matter.  Personally I would go back to the author and ask then to fix it, either by providing correctly formatted files for each section, or in a format which can handle multiple datasets such as XML.  Alternatively you can write your own file processor (i.e. you write maintain, validate etc.) following the advice given by @Kurt_Bremser

mnjtrana
Pyrite | Level 9
sorry my bad, this is a delimited file and yes it contains multiple type of data.
Changing the content of this file is not possible, already checked with the author. I will try th solution given by @Kurt_Bremser

Cheers from India!

Manjeet
Tom
Super User Tom
Super User

If there are always the same number of header lines then just skip them using the FIRSTOBS= option on the INFILE statement.

To skip the trailer line you can try just pre-reading the line and checking it.

data want ;
  infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
  input @;
  if _infile_=: 'total rows' then delete;
  length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
  input Contract_Id HOSTID HOSTEMAIL;
run;

If the header part is variable length then just use similar method to skip over it.

data want ;
  infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
  input @;
  if _n_=1 then do while (_infile_^=: 'Contract_Id;') ;
    input / @;
  end;
  if _infile_=: 'total rows' then delete;
  if _infile_=: 'Contract_Id;' then delete;
  length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
  input Contract_Id HOSTID HOSTEMAIL;
run;

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 2084 views
  • 3 likes
  • 4 in conversation