- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi guys;
I have the below CSV file(DLM=';'). I want to read this file in sas datasets. The file contains lot of useless records, so we only want to read the records following the file - Detailed Report:;; till total rows returned:5;;
also the header will be the first record after the row- Detailed Report:;; .
for below file, the header is Contract_Id;HOSTID;HOSTEMAIL
Can anybody help? Thanks in advance;
Input file:
This is a test file;created on19062017;Created by Mnjtrana
country-India;;
Report name:Testing;;
email:test@test.com;;
Detailed Report:;;
Contract_Id;HOSTID;HOSTEMAIL
1245;bdzasd;dd@ff.dk
3454;bdiojk;xx@dd.com
3133;bdgegg;df23@dd.com
489;bdui3;u87@ff.com
2435;bdyu9k;fsf2@gh.com
total rows returned:5;;
Cheers from India!
Manjeet
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If there are always the same number of header lines then just skip them using the FIRSTOBS= option on the INFILE statement.
To skip the trailer line you can try just pre-reading the line and checking it.
data want ;
infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
input @;
if _infile_=: 'total rows' then delete;
length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
input Contract_Id HOSTID HOSTEMAIL;
run;
If the header part is variable length then just use similar method to skip over it.
data want ;
infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
input @;
if _n_=1 then do while (_infile_^=: 'Contract_Id;') ;
input / @;
end;
if _infile_=: 'total rows' then delete;
if _infile_=: 'Contract_Id;' then delete;
length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
input Contract_Id HOSTID HOSTEMAIL;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
RETAIN a flag variable in the data step, and initialize it to 0.
Do a simple
input@;
This holds the line pointer.
If _infile_ contains "Detailed Report", set the flag to 1 and do
input;
to skip to the next line.
If _infile_ contains "total rows returned", set flag to 0 again.
If flag is set, read the line as intended. Use @1 to start from the beginning of the line.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Sorry, have to say it, that is not a CSV (Comma Separated Varible file) it is a delimited file. Its not even a real delimited file, you have multiple delimited files in one. As read is sequential, most techniques will still have to read the data, wether they output it to the dataset or not is a different matter. Personally I would go back to the author and ask then to fix it, either by providing correctly formatted files for each section, or in a format which can handle multiple datasets such as XML. Alternatively you can write your own file processor (i.e. you write maintain, validate etc.) following the advice given by @Kurt_Bremser
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Changing the content of this file is not possible, already checked with the author. I will try th solution given by @Kurt_Bremser
Cheers from India!
Manjeet
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If there are always the same number of header lines then just skip them using the FIRSTOBS= option on the INFILE statement.
To skip the trailer line you can try just pre-reading the line and checking it.
data want ;
infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
input @;
if _infile_=: 'total rows' then delete;
length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
input Contract_Id HOSTID HOSTEMAIL;
run;
If the header part is variable length then just use similar method to skip over it.
data want ;
infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
input @;
if _n_=1 then do while (_infile_^=: 'Contract_Id;') ;
input / @;
end;
if _infile_=: 'total rows' then delete;
if _infile_=: 'Contract_Id;' then delete;
length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
input Contract_Id HOSTID HOSTEMAIL;
run;