BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
EmilyAV
Calcite | Level 5

Hello,

I have a flat file of a large number of categorical variables and their labels that I need to import into SAS.

Sample lines:

Label               Variable   Type (categorical or numeric)     Category or value

Not reported    Sex          C                                                NR

Female            Sex          C                                                F

Male                Sex          C                                                M

1:2                   Ratio        C                                               2

1:3                   Ratio        C                                               3

 

The challenges seem to be that, even though the file is tab delimited, individual values are of:

- Variable length of labels, variable names, and category/value 

- Some entries contain spaces

- Some entries contain : symbols

- Some lines are for categorical variables, some for numerical

 

My code as follows:

data format;
infile "H:\Working data\file.dat" delimiter='05'x;
input label & $ variable $ type $ value &;
run ;

 

This has not worked, with the most consistent error message " SAS went to a new line when INPUT statement reached past the end of a line. Moreover, the values in the .sas dataset seem to be eliminating spaces.

 

Two questions for the group:

- Is reading in the file even the right thing to do, if I only need it eventually to label individual variables?

- If I do need to import the file, how can I modify the code to ensure that the file is read in accurately?

 

 

Thanks very much,

Emily

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Tabs are '09'x, not '05'x.

Define the variables with a proper length in a LENGTH statement before using them in the INPUT. After that, you do not need to use any attributes in INPUT.

View solution in original post

4 REPLIES 4
Kurt_Bremser
Super User

Tabs are '09'x, not '05'x.

Define the variables with a proper length in a LENGTH statement before using them in the INPUT. After that, you do not need to use any attributes in INPUT.

Tom
Super User Tom
Super User

Labels are attached to variables.  You can modify the way values are displayed by using a FORMAT.  So it looks like your file has format definitions, not variable labels.

 

Your posted file does not look like a delimited file.  It looks like someone has tried to align the text into columns. If it does have tab character then perhaps it is because someone edited the file in a word processor that inserts tab characters to replace spaces?   If it has tabs they should the '09'x on an ASCII based system, Only on EBCIDIC based systems (IBM mainframes) is '05'x a tab character.

 

You can use the EXPANDTABS option on the INFILE statement to have the tabs expanded to spaces using stops every 8 columns.

 

Try this code to read the file and generate the formats.

data ctnlin ;
  infile "H:\Working data\file.dat" expandtabs truncover firstobs=2;
  length fmtname $32 type $1 start $40 label $200 ;
  input label fmtname type start ;
  if type='C' then fmtname='$' || fmtname;
  if andigit(char(fmtname,length(fmtname)) then fmtname=cats(fmtname,'F');
run;

proc format cntlin=cntlin fmtlib;
run;

Results

----------------------------------------------------------------------------
|       FORMAT NAME: $RATIO   LENGTH:    3   NUMBER OF VALUES:    2        |
|   MIN LENGTH:   1  MAX LENGTH:  40  DEFAULT LENGTH:   3  FUZZ:        0  |
|--------------------------------------------------------------------------|
|START           |END             |LABEL  (VER. V7|V8   01AUG2021:17:34:15)|
|----------------+----------------+----------------------------------------|
|2               |2               |1:2                                     |
|3               |3               |1:3                                     |
----------------------------------------------------------------------------


----------------------------------------------------------------------------
|       FORMAT NAME: $SEX     LENGTH:   12   NUMBER OF VALUES:    3        |
|   MIN LENGTH:   1  MAX LENGTH:  40  DEFAULT LENGTH:  12  FUZZ:        0  |
|--------------------------------------------------------------------------|
|START           |END             |LABEL  (VER. V7|V8   01AUG2021:17:34:15)|
|----------------+----------------+----------------------------------------|
|F               |F               |Female                                  |
|M               |M               |Male                                    |
|NR              |NR              |Not reported                            |
----------------------------------------------------------------------------

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 732 views
  • 0 likes
  • 3 in conversation