Hi guys;
I have the below CSV file(DLM=';'). I want to read this file in sas datasets. The file contains lot of useless records, so we only want to read the records following the file - Detailed Report:;; till total rows returned:5;;
also the header will be the first record after the row- Detailed Report:;; .
for below file, the header is Contract_Id;HOSTID;HOSTEMAIL
Can anybody help? Thanks in advance;
Input file:
This is a test file;created on19062017;Created by Mnjtrana
country-India;;
Report name:Testing;;
email:test@test.com;;
Detailed Report:;;
Contract_Id;HOSTID;HOSTEMAIL
1245;bdzasd;dd@ff.dk
3454;bdiojk;xx@dd.com
3133;bdgegg;df23@dd.com
489;bdui3;u87@ff.com
2435;bdyu9k;fsf2@gh.com
total rows returned:5;;
If there are always the same number of header lines then just skip them using the FIRSTOBS= option on the INFILE statement.
To skip the trailer line you can try just pre-reading the line and checking it.
data want ;
infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
input @;
if _infile_=: 'total rows' then delete;
length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
input Contract_Id HOSTID HOSTEMAIL;
run;
If the header part is variable length then just use similar method to skip over it.
data want ;
infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
input @;
if _n_=1 then do while (_infile_^=: 'Contract_Id;') ;
input / @;
end;
if _infile_=: 'total rows' then delete;
if _infile_=: 'Contract_Id;' then delete;
length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
input Contract_Id HOSTID HOSTEMAIL;
run;
RETAIN a flag variable in the data step, and initialize it to 0.
Do a simple
input@;
This holds the line pointer.
If _infile_ contains "Detailed Report", set the flag to 1 and do
input;
to skip to the next line.
If _infile_ contains "total rows returned", set flag to 0 again.
If flag is set, read the line as intended. Use @1 to start from the beginning of the line.
Sorry, have to say it, that is not a CSV (Comma Separated Varible file) it is a delimited file. Its not even a real delimited file, you have multiple delimited files in one. As read is sequential, most techniques will still have to read the data, wether they output it to the dataset or not is a different matter. Personally I would go back to the author and ask then to fix it, either by providing correctly formatted files for each section, or in a format which can handle multiple datasets such as XML. Alternatively you can write your own file processor (i.e. you write maintain, validate etc.) following the advice given by @Kurt_Bremser
If there are always the same number of header lines then just skip them using the FIRSTOBS= option on the INFILE statement.
To skip the trailer line you can try just pre-reading the line and checking it.
data want ;
infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
input @;
if _infile_=: 'total rows' then delete;
length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
input Contract_Id HOSTID HOSTEMAIL;
run;
If the header part is variable length then just use similar method to skip over it.
data want ;
infile 'myfile.dlm' dsd dlm=';' firstobs=7 truncover ;
input @;
if _n_=1 then do while (_infile_^=: 'Contract_Id;') ;
input / @;
end;
if _infile_=: 'total rows' then delete;
if _infile_=: 'Contract_Id;' then delete;
length Contract_Id $10 HOSTID $20 HOSTEMAIL $60 ;
input Contract_Id HOSTID HOSTEMAIL;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.