Hello,
I have a flat file of a large number of categorical variables and their labels that I need to import into SAS.
Sample lines:
Label Variable Type (categorical or numeric) Category or value
Not reported Sex C NR
Female Sex C F
Male Sex C M
1:2 Ratio C 2
1:3 Ratio C 3
The challenges seem to be that, even though the file is tab delimited, individual values are of:
- Variable length of labels, variable names, and category/value
- Some entries contain spaces
- Some entries contain : symbols
- Some lines are for categorical variables, some for numerical
My code as follows:
data format;
infile "H:\Working data\file.dat" delimiter='05'x;
input label & $ variable $ type $ value &;
run ;
This has not worked, with the most consistent error message " SAS went to a new line when INPUT statement reached past the end of a line. Moreover, the values in the .sas dataset seem to be eliminating spaces.
Two questions for the group:
- Is reading in the file even the right thing to do, if I only need it eventually to label individual variables?
- If I do need to import the file, how can I modify the code to ensure that the file is read in accurately?
Thanks very much,
Emily
Tabs are '09'x, not '05'x.
Define the variables with a proper length in a LENGTH statement before using them in the INPUT. After that, you do not need to use any attributes in INPUT.
Tabs are '09'x, not '05'x.
Define the variables with a proper length in a LENGTH statement before using them in the INPUT. After that, you do not need to use any attributes in INPUT.
Thank you!
Labels are attached to variables. You can modify the way values are displayed by using a FORMAT. So it looks like your file has format definitions, not variable labels.
Your posted file does not look like a delimited file. It looks like someone has tried to align the text into columns. If it does have tab character then perhaps it is because someone edited the file in a word processor that inserts tab characters to replace spaces? If it has tabs they should the '09'x on an ASCII based system, Only on EBCIDIC based systems (IBM mainframes) is '05'x a tab character.
You can use the EXPANDTABS option on the INFILE statement to have the tabs expanded to spaces using stops every 8 columns.
Try this code to read the file and generate the formats.
data ctnlin ;
infile "H:\Working data\file.dat" expandtabs truncover firstobs=2;
length fmtname $32 type $1 start $40 label $200 ;
input label fmtname type start ;
if type='C' then fmtname='$' || fmtname;
if andigit(char(fmtname,length(fmtname)) then fmtname=cats(fmtname,'F');
run;
proc format cntlin=cntlin fmtlib;
run;
Results
---------------------------------------------------------------------------- | FORMAT NAME: $RATIO LENGTH: 3 NUMBER OF VALUES: 2 | | MIN LENGTH: 1 MAX LENGTH: 40 DEFAULT LENGTH: 3 FUZZ: 0 | |--------------------------------------------------------------------------| |START |END |LABEL (VER. V7|V8 01AUG2021:17:34:15)| |----------------+----------------+----------------------------------------| |2 |2 |1:2 | |3 |3 |1:3 | ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- | FORMAT NAME: $SEX LENGTH: 12 NUMBER OF VALUES: 3 | | MIN LENGTH: 1 MAX LENGTH: 40 DEFAULT LENGTH: 12 FUZZ: 0 | |--------------------------------------------------------------------------| |START |END |LABEL (VER. V7|V8 01AUG2021:17:34:15)| |----------------+----------------+----------------------------------------| |F |F |Female | |M |M |Male | |NR |NR |Not reported | ----------------------------------------------------------------------------
Thank you!
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.