Desktop productivity for business analysts and programmers

LRECL in infile statment informat don't work

Accepted Solution Solved
Reply
Contributor
Posts: 24
Accepted Solution

LRECL in infile statment informat don't work

Hi!
 
I'm want to import a lot of CSV(DLM=';') files and  because I had problems with numeric and data variables, I did a Data step and  INFILE to can change the informat and the format.
To know which informat choose in every variable, I did before an import with the sas's assistant for one file and I copied the program from the log and I modified thant a little. 
The problem is: in this code there is a parameter 'LRECL' with a determinate value choosed by Sas but if I set the same value (or the default value) for all files the informat and format choosed do not work in the others files, and is very difficult to know which value set for every file because I have a lot of files.
Do you know another way to make that? 
I don't understand why the format and informat works in this situation and do not work in another.
 
I copy here the structure of my program.
If you know a way to choose always the right informat I will be very grateful, I have often problem with that.
 
Thanks you very much!

Accepted Solutions
Solution
3 weeks ago
Contributor
Posts: 24

Re: LRECL in infile statment informat don't work

Thank you very much for your reponse. I paste below the code structure.

 

If i understood correctly, I must look for the LRECL's value which work ok for the biggest file and then set this value in the LRECL for all the files ? 

This does not generate the problems of disk space? I have to import several big files.

 

%MACRO import (fichier=,table_name=);
DATA WORK.&table_name;
INFILE "/data/XXX/Lignes/&fichier"
LRECL=603
ENCODING="LATIN9"
TERMSTR=CRLF
DLM=';' FIRSTOBS=2
MISSOVER
DSD ;
INPUT
'VARIABLE A'n : $27.
vARIABLE B: $34.

FORMAT
'VARIABLE A'n $27.
vARIABLE B $34.

INFORMAT
'VARIABLE A'n $27.
vARIABLE B $34.

RUN;

%MEND;

%import (fichier=file_AB.csv, table_name=table_sas_AB);

View solution in original post


All Replies
Respected Advisor
Posts: 4,679

Re: LRECL in infile statment informat don't work

Posted in reply to luciacossaro

@luciacossaro

I believe you forgot to post your code.

 

In general:

LRECL tells SAS how long a source record can be - but the source record doesn't have to be that long. If you want to be on the safe side then just use a value for LRECL which is for sure always a bit bigger than the longest source record you can reasonably expect. 

Solution
3 weeks ago
Contributor
Posts: 24

Re: LRECL in infile statment informat don't work

Thank you very much for your reponse. I paste below the code structure.

 

If i understood correctly, I must look for the LRECL's value which work ok for the biggest file and then set this value in the LRECL for all the files ? 

This does not generate the problems of disk space? I have to import several big files.

 

%MACRO import (fichier=,table_name=);
DATA WORK.&table_name;
INFILE "/data/XXX/Lignes/&fichier"
LRECL=603
ENCODING="LATIN9"
TERMSTR=CRLF
DLM=';' FIRSTOBS=2
MISSOVER
DSD ;
INPUT
'VARIABLE A'n : $27.
vARIABLE B: $34.

FORMAT
'VARIABLE A'n $27.
vARIABLE B $34.

INFORMAT
'VARIABLE A'n $27.
vARIABLE B $34.

RUN;

%MEND;

%import (fichier=file_AB.csv, table_name=table_sas_AB);
Super User
Posts: 9,913

Re: LRECL in infile statment informat don't work

Posted in reply to luciacossaro

First of all, your data step is missing several semicolons:

DATA WORK.&table_name;
INFILE "/data/XXX/Lignes/&fichier"
LRECL=603
ENCODING="LATIN9"
TERMSTR=CRLF
DLM=';' FIRSTOBS=2
MISSOVER
DSD ;
INPUT
'VARIABLE A'n : $27.
vARIABLE B: $34.
;
FORMAT
'VARIABLE A'n $27.
vARIABLE B $34.
;
INFORMAT
'VARIABLE A'n $27.
vARIABLE B $34.
;
RUN;

second, I STRONGLY advise you to not use the named literal construct:

'VARIABLE A'n

Use a proper SAS name instead:

variable_a

and put nicely formatted text into the label, where it belongs.

Third, I have a strong suspicion that this input statement

INPUT
'VARIABLE A'n : $27.
vARIABLE B: $34.
;

will not do what you want, as it creates a character variable 'VARIABLE A' of length 27, a numeric variable vARIABLE, and a character variable B of length 34.

Here you already have one of the pitfalls caused by using named literals.

 

regarding your question:

The space required by a SAS dataset is determined (roughly) by observation size * observation number, where observation size is the sum of all lengths of all variables in an observation. So if you read only three numbers from a file with lrecl=32767, the observation size will still be just 24.

lrecl= just determines the size of the buffer set aside for reading a single line from the input file; what goes into the dataset is determined solely by the following data step statements.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Respected Advisor
Posts: 4,679

Re: LRECL in infile statment informat don't work

[ Edited ]
Posted in reply to luciacossaro

@luciacossaro

LRECL is used to read data from source into the SAS buffer (1 line at a time). It has nothing to do with how much disk space gets used when writing the SAS file to disk.

 

Having an unnecessarily overlong LRECL will cause SAS to assign more memory to the buffer and impact a bit on performance and memory usage - but this is it.

 

If you're concerned about disk storage then make sure to use option COMPRESS.

%MACRO import (fichier=,table_name=);
  DATA WORK.&table_name (compress=yes);
 ........

 

Super User
Posts: 9,913

Re: LRECL in infile statment informat don't work

Posted in reply to luciacossaro

Just to clarify, this is how I would write the data step:

data work.&table_name;
infile
  "/data/XXX/Lignes/&fichier"
  lrecl=603
  encoding="LATIN9"
  termstr=CRLF
  dlm=';'
  firstobs=2
  truncover
  dsd
;
input
  variable_a :$27.
  variable_b :$34.
;
label
  variable_a = "VARIABLE A"
  variable_b = "VARIABLE B"
;
run;

informat and format as posted by you are redundant for character variables, so I omitted them.

Note how indendation makes functional blocks visible.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 138 views
  • 0 likes
  • 3 in conversation