read .dat file

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 9
Accepted Solution

read .dat file

I have a test.dat file with a header and trailer row and the below columns are in the test file with the datatypes below.

The header row is of the format -> HEADER|20120325|20120324

The trailer row is -> TEST|20120325|20120324|10      (10 is the number of rows)

Filename:test.dat

Column    -   Datatype and size on the test.dat file

----------         -----------------------------------------------------------------

acct                varchar(10)

eff_date           datetime

code               char(3)

value_amt         decimal(27,6)

id_product         int

Data in test.dat file:

--------------------------------------------

HEADER|20120325|20120324

1234|2012-02-24|XTY|2988.65|3

TEST|20120325|20120324|10

I need to read the load above test.dat file and create a SAS dataset excluding the header and trailer rows. How do I do the data type conversions and load the .dat file into a sas dataset. please advise.


Accepted Solutions
Solution
‎04-19-2012 03:32 PM
Super User
Super User
Posts: 6,502

Re: read .dat file

Here is code that basically ignores the first and last lines.  You could also have it try to pull the dates and numbers from those records.

data want ;

  length acct $10 eff_date 8 code $3 value_amt 8 id_product 8 ;

  informat eff_date yymmdd.;

  format eff_date yymmdd10.;

  infile 'test.dat' end=eof truncover dsd dlm='|';

  input @;

  if _n_=1 then put _infile_;

  else if eof then put _infile_;

  else do;

    input acct--id_product ;

    output;

  end;

run;

View solution in original post


All Replies
Contributor
Posts: 42

Re: read .dat file

You should be able to skip the header row by setting firstobs=2 on your infile statement.  You can check for the value of Acct and throw away the trailer row b y not outputting it to the dataset.  Use input statements with the appropriate formats to handle data type conversions.

Solution
‎04-19-2012 03:32 PM
Super User
Super User
Posts: 6,502

Re: read .dat file

Here is code that basically ignores the first and last lines.  You could also have it try to pull the dates and numbers from those records.

data want ;

  length acct $10 eff_date 8 code $3 value_amt 8 id_product 8 ;

  informat eff_date yymmdd.;

  format eff_date yymmdd10.;

  infile 'test.dat' end=eof truncover dsd dlm='|';

  input @;

  if _n_=1 then put _infile_;

  else if eof then put _infile_;

  else do;

    input acct--id_product ;

    output;

  end;

run;

Super Contributor
Posts: 1,636

Re: read .dat file

Hi Tom,

Why did you use "dsd" in " infile 'test.dat' end=eof truncover dsd dlm='|';"

Thanks!

Super User
Super User
Posts: 6,502

Re: read .dat file

DSD option on INFILE will treat adjacent delimiters as an indication of a missing value.

So a line like this:

1234||XTY|2988.65|3

Will get a missing value for the date variable.  Without the DSD option SAS would try to read XTY for the date variable.


It seems like a strange default to NOT treat adjacent delimiters as indicating a missing value.  But think about normal list mode input statements where multiple spaces between words are treated the same as a single space.  That is the same as setting DLM=' '.

Super Contributor
Posts: 1,636

Re: read .dat file

Thank you Tom!

for OP's sample data:

HEADER|20120325|20120324

1234|2012-02-24|XTY|2988.65|3

TEST|20120325|20120324|10

both with and without 'dsd' should work.

Valued Guide
Posts: 2,175

Re: read .dat file

LInlin

although it works on this occasion it is good practise and helpful reminder, to always use DSD with delimiters except only when reading pre formatted reports where there is variable width 'white space' between the data you want to read.

Super Contributor
Posts: 1,636

Re: read .dat file

Thank you Peter! It is good to know.  - Linlin

Respected Advisor
Posts: 4,660

Re: read .dat file

I prefer something a little more robust for reading a data file with some internal validation (with pieces stolen from Tom).

data want(drop=_Smiley Happy;

  retain _hLine _tLine (2*%sysfunc(constant(EXACTINT)));

  length _kwd $10 _kDate1 8 _kDate2 $8 _nbLines 8;

  length acct $10 eff_date 8 code $3 value_amt 8 id_product 8 ;

  informat eff_date _kDate1 yymmdd.;

  format eff_date _kDate1 yymmdd10.;

  infile "test.dat" truncover dsd dlm='|';

  input _kwd--_nbLines @;

  select (_kwd);

  when ("HEADER") do;

   If _n_ > _hLine then put "Double header line: " _n_= _infile_;

_hLine = _n_;

  end;

  when ("TEST") do;

   if _n_ < _hLine then put "Missing header: " _n_= _infile_;

/* Adjust condition according to exact meaning of line count */

else if _n_ - _hLine + 1 ne _nbLines then put "Invalid line count" _n_= _infile_;

_tLine = _n_;

  end;

  otherwise do;

    if _hLine < _n_ < _tLine then do;

  input @1 acct--id_product;

  if missing(id_product) then put "Line skipped: " _n_= _infile_;

  else output;

end;

else if _infile_ ne " " then put "Line out of sequence: " _n_= _infile_;

  end;

end;

run;

PG

PG
Super User
Posts: 9,691

Re: read .dat file

It looks like the data is from some dbms ORACLE DB2?

I think OP don't want to input these variables by hand and want to generate some code dynamically.

I coded somthing before, But not exactly like yours. hope to help you a little bit.

data temp;
infile datalines expandtabs;
input ( col1 - col3 ) (: $40.);
datalines;
TALDTE     会计年月     CHAR(7)     7
TALDAY     会计日     CHAR(2)     2
BSTNUM     账套号     CHAR(2)     2
PROCOD     省市     CHAR(2)     2
BDTBR     行号     CHAR(4)     4
CCY     币种     CHAR(2)     2
SUBNUM     主科目号     CHAR(5)     5
RPTSUBNUM     报告科目号     CHAR(4)     4
RESMSG1     预留段1     CHAR(8)     8
RESMSG2     预留段2     CHAR(8)     8
RESMSG3     预留段3     CHAR(8)     8
RESMSG4     预留段4     CHAR(8)     8
RESMSG5     预留段5     CHAR(8)     8
DTYACCDTE     销户时间     CHAR(8)     8
ACCSTA     帐户状态     CHAR(2)     2
CURDBTTRSNUM     当日借方发生笔数     INTEGER     
CURDBTTRSAMT     当日借方发生额     DEC(19, 2)     19
CURCDTTRSNUM     当日贷方发生笔数     INTEGER     
CURCDTTRSAMT     当日贷方发生额     DEC(19, 2)     19
CURTDPDBTTRSNUM     当旬借方发生笔数     INTEGER     
CURTDPDBTTRSAMT     当旬借方发生额     DEC(19, 2)     19
CURTDPCDTTRSNUM     当旬贷方发生笔数     INTEGER     
CURTDPCDTTRSAMT     当旬贷方发生额     DEC(19, 2)     19
CURMONDBTTRSNUM     当月借方发生笔数     INTEGER     
CURMONDBTTRSAMT     当月借方发生额     DEC(19, 2)     19
CURMONCDTTRSNUM     当月贷方发生笔数     INTEGER     
CURMONCDTTRSAMT     当月贷方发生额     DEC(19, 2)     19
CURQUTDBTTRSNUM     当季借方发生笔数     INTEGER     
CURQUTDBTTRSAMT     当季借方发生额     DEC(19, 2)     19
CURQUTCDTTRSNUM     当季贷方发生笔数     INTEGER     
CURQUTCDTTRSAMT     当季贷方发生额     DEC(19, 2)     19
CURYERDBTTRSNUM     当年借方发生笔数     INTEGER     
CURYERDBTTRSAMT     当年借方发生额     DEC(19, 2)     19
CURYERCDTTRSNUM     当年贷方发生笔数     INTEGER     
CURYERCDTTRSAMT     当年贷方发生额     DEC(19, 2)     19
CURDBTBAL     当前借方余额     DEC(19, 2)     19
CURCDTBAL     当前贷方余额     DEC(19, 2)     19
PREDBTTRSNUM     上日借方发生笔数     INTEGER     
PREDBTTRSAMT     上日借方发生额     DEC(19, 2)     19
PRECDTTRSNUM     上日贷方发生笔数     INTEGER     
PRECDTTRSAMT     上日贷方发生额     DEC(19, 2)     19
PREDAYDBTBAL     上日借方余额     DEC(19, 2)     19
PREDAYCDTBAL     上日贷方余额     DEC(19, 2)     19
PRETDPDBTTRSNUM     上旬借方发生笔数     INTEGER     
PRETDPDBTTRSAMT     上旬借方发生额     DEC(19, 2)     19
PRETDPCDTTRSNUM     上旬贷方发生笔数     INTEGER     
PRETDPCDTTRSAMT     上旬贷方发生额     DEC(19, 2)     19
PRETDPDBTBAL     上旬借方余额     DEC(19, 2)     19
PRETDPCDTBAL     上旬贷方余额     DEC(19, 2)     19
PREMONDBTTRSNUM     上月借方发生笔数     INTEGER     
PREMONDBTTRSAMT     上月借方发生额     DEC(19, 2)     19
PREMONCDTTRSNUM     上月贷方发生笔数     INTEGER     
PREMONCDTTRSAMT     上月贷方发生额     DEC(19, 2)     19
PREMONDBTBAL     上月借方余额     DEC(19, 2)     19
PREMONCDTBAL     上月贷方余额     DEC(19, 2)     19
PREQUTDBTTRSNUM     上季借方发生笔数     INTEGER     
PREQUTDBTTRSAMT     上季借方发生额     DEC(19, 2)     19
PREQUTCDTTRSNUM     上季贷方发生笔数     INTEGER     
PREQUTCDTTRSAMT     上季贷方发生额     DEC(19, 2)     19
PREQUTDBTBAL     上季借方余额     DEC(19, 2)     19
PREQUTCDTBAL     上季贷方余额     DEC(19, 2)     19
PREYERDBTTRSNUM     上年借方发生笔数     INTEGER     
PREYERDBTTRSAMT     上年借方发生额     DEC(19, 2)     19
PREYERCDTTRSNUM     上年贷方发生笔数     INTEGER     
PREYERCDTTRSAMT     上年贷方发生额     DEC(19, 2)     19
PREYERDBTBAL     上年借方余额     DEC(19, 2)     19
PREYERCDTBAL     上年贷方余额     DEC(19, 2)     19
CURTDPDAYDBTBAL     本旬日平均借方余额     DEC(19, 2)     19
CURTDPDAYCDTBAL     本旬日平均贷方余额     DEC(19, 2)     19
CURMONDAYDBTBAL     本月日平均借方余额     DEC(19, 2)     19
CURMONDAYCDTBAL     本月日平均贷方余额     DEC(19, 2)     19
CURQUTDAYDBTBAL     本季日平均借方余额     DEC(19, 2)     19
CURQUTDAYCDTBAL     本季日平均贷方余额     DEC(19, 2)     19
CURYERDAYDBTBAL     本年日平均借方余额     DEC(19, 2)     19
CURYERDAYCDTBAL     本年日平均贷方余额     DEC(19, 2)     19
PRETDPDAYDBTBAL     上旬日平均借方余额     DEC(19, 2)     19
PRETDPDAYCDTBAL     上旬日平均贷方余额     DEC(19, 2)     19
PREMONDAYDBTBAL     上月日平均借方余额     DEC(19, 2)     19
PREMONDAYCDTBAL     上月日平均贷方余额     DEC(19, 2)     19
PREQUTDAYDBTBAL     上季日平均借方余额     DEC(19, 2)     19
PREQUTDAYCDTBAL     上季日平均贷方余额     DEC(19, 2)     19
PREYERDAYDBTBAL     上年日平均借方余额     DEC(19, 2)     19
PREYERDAYCDTBAL     上年日平均贷方余额     DEC(19, 2)     19
GTLLGRDBTBAL     分户汇总借方余额     DEC(19, 2)     19
GTLLGRCDTBAL     分户汇总贷方余额     DEC(19, 2)     19
ACCBLDDTE     建账时间     CHAR(26)     26
BNSTYP     类型     CHAR(1)     1
;
run;


data temp;
 set temp;
 length f $ 40;
 select ;
 when(col3 eq: 'CHAR') f=cats(': $',scan(col3,-1,'() '),'.');
 otherwise f=': best32.';
 end;
run;

proc sql noprint;
 select catx(' ',col1,f) as a,cats(col1,'="',col2,'"') as b
into : informat separated by ' ', : label separated by ' '
  from temp;

quit;
%put &informat &label ;

data qkm4.xx;
input fname : $200. ;
fname=cats('/spds_data11/temp/ABC/GL_CURGLGHIS/',fname);
infile dummy filevar=fname end=last delimiter='|!' truncover 
 lrecl=32767 ;
do until(last);
 input &informat;
 output;         

label &label ;
end;
datalines;
18GL_CURGLGHIS220100831
18GL_CURGLGHIS220100930
18GL_CURGLGHIS220101031
19GL_CURGLGHIS220100831
19GL_CURGLGHIS220100930
19GL_CURGLGHIS220101031
20GL_CURGLGHIS220100831
20GL_CURGLGHIS220100930
20GL_CURGLGHIS220101031
21GL_CURGLGHIS220100831
21GL_CURGLGHIS220100930
21GL_CURGLGHIS220101031
22GL_CURGLGHIS220100831
22GL_CURGLGHIS220100930
22GL_CURGLGHIS220101031
;
run;


Ksharp

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 881 views
  • 6 likes
  • 7 in conversation