BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
luciacossaro
Obsidian | Level 7
Hi!
 
I'm want to import a lot of CSV(DLM=';') files and  because I had problems with numeric and data variables, I did a Data step and  INFILE to can change the informat and the format.
To know which informat choose in every variable, I did before an import with the sas's assistant for one file and I copied the program from the log and I modified thant a little. 
The problem is: in this code there is a parameter 'LRECL' with a determinate value choosed by Sas but if I set the same value (or the default value) for all files the informat and format choosed do not work in the others files, and is very difficult to know which value set for every file because I have a lot of files.
Do you know another way to make that? 
I don't understand why the format and informat works in this situation and do not work in another.
 
I copy here the structure of my program.
If you know a way to choose always the right informat I will be very grateful, I have often problem with that.
 
Thanks you very much!
1 ACCEPTED SOLUTION

Accepted Solutions
luciacossaro
Obsidian | Level 7

Thank you very much for your reponse. I paste below the code structure.

 

If i understood correctly, I must look for the LRECL's value which work ok for the biggest file and then set this value in the LRECL for all the files ? 

This does not generate the problems of disk space? I have to import several big files.

 

%MACRO import (fichier=,table_name=);
DATA WORK.&table_name;
INFILE "/data/XXX/Lignes/&fichier"
LRECL=603
ENCODING="LATIN9"
TERMSTR=CRLF
DLM=';' FIRSTOBS=2
MISSOVER
DSD ;
INPUT
'VARIABLE A'n : $27.
vARIABLE B: $34.

FORMAT
'VARIABLE A'n $27.
vARIABLE B $34.

INFORMAT
'VARIABLE A'n $27.
vARIABLE B $34.

RUN;

%MEND;

%import (fichier=file_AB.csv, table_name=table_sas_AB);

View solution in original post

5 REPLIES 5
Patrick
Opal | Level 21

@luciacossaro

I believe you forgot to post your code.

 

In general:

LRECL tells SAS how long a source record can be - but the source record doesn't have to be that long. If you want to be on the safe side then just use a value for LRECL which is for sure always a bit bigger than the longest source record you can reasonably expect. 

luciacossaro
Obsidian | Level 7

Thank you very much for your reponse. I paste below the code structure.

 

If i understood correctly, I must look for the LRECL's value which work ok for the biggest file and then set this value in the LRECL for all the files ? 

This does not generate the problems of disk space? I have to import several big files.

 

%MACRO import (fichier=,table_name=);
DATA WORK.&table_name;
INFILE "/data/XXX/Lignes/&fichier"
LRECL=603
ENCODING="LATIN9"
TERMSTR=CRLF
DLM=';' FIRSTOBS=2
MISSOVER
DSD ;
INPUT
'VARIABLE A'n : $27.
vARIABLE B: $34.

FORMAT
'VARIABLE A'n $27.
vARIABLE B $34.

INFORMAT
'VARIABLE A'n $27.
vARIABLE B $34.

RUN;

%MEND;

%import (fichier=file_AB.csv, table_name=table_sas_AB);
Kurt_Bremser
Super User

First of all, your data step is missing several semicolons:

DATA WORK.&table_name;
INFILE "/data/XXX/Lignes/&fichier"
LRECL=603
ENCODING="LATIN9"
TERMSTR=CRLF
DLM=';' FIRSTOBS=2
MISSOVER
DSD ;
INPUT
'VARIABLE A'n : $27.
vARIABLE B: $34.
;
FORMAT
'VARIABLE A'n $27.
vARIABLE B $34.
;
INFORMAT
'VARIABLE A'n $27.
vARIABLE B $34.
;
RUN;

second, I STRONGLY advise you to not use the named literal construct:

'VARIABLE A'n

Use a proper SAS name instead:

variable_a

and put nicely formatted text into the label, where it belongs.

Third, I have a strong suspicion that this input statement

INPUT
'VARIABLE A'n : $27.
vARIABLE B: $34.
;

will not do what you want, as it creates a character variable 'VARIABLE A' of length 27, a numeric variable vARIABLE, and a character variable B of length 34.

Here you already have one of the pitfalls caused by using named literals.

 

regarding your question:

The space required by a SAS dataset is determined (roughly) by observation size * observation number, where observation size is the sum of all lengths of all variables in an observation. So if you read only three numbers from a file with lrecl=32767, the observation size will still be just 24.

lrecl= just determines the size of the buffer set aside for reading a single line from the input file; what goes into the dataset is determined solely by the following data step statements.

Patrick
Opal | Level 21

@luciacossaro

LRECL is used to read data from source into the SAS buffer (1 line at a time). It has nothing to do with how much disk space gets used when writing the SAS file to disk.

 

Having an unnecessarily overlong LRECL will cause SAS to assign more memory to the buffer and impact a bit on performance and memory usage - but this is it.

 

If you're concerned about disk storage then make sure to use option COMPRESS.

%MACRO import (fichier=,table_name=);
  DATA WORK.&table_name (compress=yes);
 ........

 

Kurt_Bremser
Super User

Just to clarify, this is how I would write the data step:

data work.&table_name;
infile
  "/data/XXX/Lignes/&fichier"
  lrecl=603
  encoding="LATIN9"
  termstr=CRLF
  dlm=';'
  firstobs=2
  truncover
  dsd
;
input
  variable_a :$27.
  variable_b :$34.
;
label
  variable_a = "VARIABLE A"
  variable_b = "VARIABLE B"
;
run;

informat and format as posted by you are redundant for character variables, so I omitted them.

Note how indendation makes functional blocks visible.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1198 views
  • 0 likes
  • 3 in conversation