BookmarkSubscribeRSS Feed
Sean_OConnor
Fluorite | Level 6

Good Morning,

 

I have a file which is not delimited so it's difficult to judge when one column ends and other begins. However, I have metadata around the maximum size the columns will be and whether they should be numeric or alphanumeric. Below is a snippet of the information.

 

Can anyone recommend an efficient way to read these and create a SAS dataset, please?

 

VAR NameMax lengthType
Var19Alphanumeric
Var213Numeric
Var33Numeric
Var43Numeric
Var58Alphanumeric
Var68Alphanumeric
Var71Alphanumeric
Var84Alphanumeric
Var913Numeric

 

 

6 REPLIES 6
Sean_OConnor
Fluorite | Level 6
Hi Kurt. Unfortunately I can't do this as it isn't publicly available data. I'm wondering if an infile, informat, input statement work?
Kurt_Bremser
Super User

Without seeing an example for the data, I can give you exactly zero help. I can have no clue if you have some kind of column separator, or fixed width columns, or any other thing that can help in locating column boundaries.

 

Your request is like requesting a diagnosis from a doctor who is made deliberately deaf and blind and is not allowed to touch you.

 

Please post the example as requested, and overwrite sensitive values with X's or 0's (one character for one character, and leave everything else - special characters! - untouched).

Sajid01
Meteorite | Level 14

Hello @Sean_OConnor 
This should not be difficult. Something which one does come across.I used the approach as shown in the example below.

Hope this should solve your issue.

data test;
length line $ 
256;
input line $;
datalines;
AB1234567123456789ABCDEFGH
;;
run;
data test2;
set test;
var1=substr(line,1,9);
var2=substr(line,10,9);
var3=substr(line,19,3);
var4=substr(line,22);
run;
ballardw
Super User

If your source is text and there actually is no delimiter then likely the data is fixed column. This documentation is the basic approach for that type of data:

https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/lestmtsref/n13ejk9swz5vrbn0z34iazfrp0wp.htm

 

Specify the columns to read (hint: Var1 is columns 1 to 9, Var2 is 10 to 22 if your description is accurate and the data is fixed column).

Caution: 13 digits for numbers sounds likely to be odd. If these are things like account numbers or identifiers that are not to be used in numeric calculations likely they should be character. If they have leading zeroes that you want to see then they really should be character. Unfortunately many folks don't make a difference in their documentation and will say anything that contains only digits is "numeric".

Note: if you have date, time or date time values you will want to be very specific about the informat used to read them to get actual SAS date, time or datetime values and assign an appropriate format for human legibility.

FreelanceReinh
Jade | Level 19

Hello @Sean_OConnor,

 

Assuming that your file has fixed field widths, here's a simple code example using formatted input (which I prefer to column input) and five variables:

/* Create test data for demonstration */

filename ft15f001 temp;
parmcards;
B.1.1.52911,222,333.00-12987SOMETEXT
B.1.351 S49,999.95    17 101shorter
;

/* Read test data */

data want;
infile ft15f001 truncover;
input var1 $9.
      var2 comma13.
      var3 3.
      var4 3.
      var5 $upcase8.;
run;

 

The TRUNCOVER option helps to avoid problems if the record length may vary (as in the example). The length specifications (9, 13, 3, ...) in each informat are important as they determine the columns to be read for each variable (i.e., columns 1 - 9 for VAR1, 10 - 22 for VAR2, ...). For the character variables they also define the lengths (if not defined otherwise, e.g., by a LENGTH statement). Use character informats for character variables and numeric informats for numeric variables. (Note that the SAS variable type can differ from the raw data type, e.g., the numeric DATE9. informat can read an "alphanumeric" field with data like 01Jan2022 and vice versa, as mentioned by ballardw.) Of course, your INFILE statement will refer to a different file name or to a physical path.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 952 views
  • 1 like
  • 5 in conversation