SAS data step infile delimiter questions

I came across a data step like this in my program:

data test;
   infile '/sasdata/REF.txt'
          lrecl = 256
          delimiter = '    '
          dsd
          missover
          firstobs = 2;

   attrib MODEL_NM length = $11
          format = $11.
          informat = $11.;
   attrib GRADE_ID length = $3
          format = $3.
          informat = $3.;

   input MODEL_NM GRADE_ID;
run;

My questions are:

1. Why do we need to specify the length explicitly when it is already implied by the format and informat?

2. The delimiter is '  '. If the column MODEL_NM has a value like 'ABC  123', will it be read correctly as one column, or as separate columns because of the spaces?


Accepted Solution
08-20-2014 11:45 AM
Super User

Re: SAS data step infile delimiter questions

Your first question is backward. The FORMAT attached to a variable only defines how its values are displayed, and the INFORMAT only defines how they are read. Either one affects the defined length of the variable only as a side effect, when it is the first place SAS sees the variable. The LENGTH is what actually defines how the variable is stored. The real problem with this example is the attachment of permanent FORMAT and INFORMAT to simple character variables. These can cause trouble for later users: if they inadvertently get attached to a longer variable, the values will appear as if they have been truncated.
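A minimal sketch of both effects, in case it helps. The data set names `demo` and `demo2` and the variable `long_nm` are made up for illustration and do not come from the original program.

```sas
/* Assumption: demo, demo2 and long_nm are hypothetical names. */
data demo;
   long_nm = 'ABCDEFGHIJKLMNOP';   /* first mention: length is 16 from the literal */
   format long_nm $11.;            /* display only: stored value is unchanged */
run;

proc print data=demo; run;         /* value is shown cut off at 11 characters */
proc contents data=demo; run;      /* the variable still has length 16 */

data demo2;
   format grade_id $3.;            /* first mention is the FORMAT statement, */
                                   /* so length 3 is set as a side effect    */
   grade_id = 'ABCDE';             /* only 'ABC' fits in the variable */
run;
```

Defining the length with a LENGTH statement (or `length=` in ATTRIB) and leaving plain character variables without a permanent format avoids both surprises.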

If you tell SAS to use the DSD option, then two adjacent delimiters are treated as an indication of a missing value. So if in your example the line had two spaces between 'ABC' and '123', MODEL_NM would be 'ABC', GRADE_ID would be assigned a missing value, and the '123' would be ignored.
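One way to check this yourself is to read a few in-stream lines with the same options. This is only a sketch: the sample data lines are invented, and it assumes the delimiter really is a blank.

```sas
/* Assumption: the data lines below are invented test input. */
data check;
   infile datalines delimiter = ' ' dsd missover;
   length MODEL_NM $11 GRADE_ID $3;
   input MODEL_NM GRADE_ID;
   datalines;
ABC 123
ABC  123
"ABC  123" X1
;
run;

proc print data=check; run;
```

The first line has a single blank, so both variables should be filled. The second has two adjacent blanks, which is the case described above. The third wraps the value in double quotes: with DSD, delimiters inside a quoted value are treated as data and the quotes are removed, so 'ABC  123' can only arrive as one column if the file quotes it that way.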
