BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jen123
Fluorite | Level 6

I have a file called ABC.txt.  It contains columns that are numeric.  When I use PROC IMPORT to import the text file, it changes the numeric variables to character.  Here is my code:

proc import datafile=ABC.txt

out=ABC  dbms=tab replace;

getnames=yes;

guessingrows=10000;

RUN;

When I use EG to import the text file (File/Import Data), the variables/columns come in as numeric.

Any thoughts as to what is happening?  I would like to void having to list EVERY variables and assign the type.

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
kmw
SAS Employee kmw
SAS Employee

Editor’s Note: The Enterprise Guide(EG) Import Tool and PROC IMPORT use 2 different types of guessing algorithms behind the scenes when reading a delimited file. If you compare the 2 SAS logs you will see a difference in the code as well as the options on the INFILE statement. Therefore variables may not be typed the same way when using the EG Import Tool versus PROC IMPORT when using in EG.  

View solution in original post

14 REPLIES 14
jen123
Fluorite | Level 6

I also tried importing the file as .csv instead of .txt - still same issue

Tom
Super User Tom
Super User

Are you missing the quotes around the physical filename in the actual code?

I usually just copy the first line into a program and assign the types myself.  I find it much easier than relying on IMPORT to guess.

data ABC ;

  infile "ABC.txt" dsd dlm='09'x firstobs=2 truncover lrecl=10000 ;

  length

<<< pasted first line of file with tabs converted to spaces and appropriate lengths assigned. >>>

;

<<< any INFORMATs needed for date/time variables >>>

input firstvar -- lastvar ;

<<< any FORMATS needed>>

run;

If you know your variables are all numeric then it is even easier.

data ABC ;

  infile "ABC.txt" dsd dlm='09'x firstobs=2 truncover lrecl=10000 ;

  input

<<< pasted first line of file with tabs converted to spaces >>>

  ;

run;

In fact you could write a program to read the first line and put it into macro variable.

jen123
Fluorite | Level 6

Hi Tom - Thanks for your response.

I am a novice in programming....learning on the fly for work.  Hence why I use EG.  I am sorry but I do not understand the program you posted.  There are 34 columns/variables in the file. 

I forgot to mention another point to my issue.  Using the PROC IMPORT some month it imports the columns as characters and other months as numeric.  This is for the same columns/variables.

Jenne

Tom
Super User Tom
Super User

What I suggested was to open the spreadsheet (or CSV file) and copy the first line with the variable names and paste that into the program.

If your file contains 34 column then you will have 34 names to copy.  Hopefully the file is formatted with column headers that are valid variable names. (no embedded spaces or special characters, etc).  If not then request the sender to re-format it.  If the file is always the same (except for the data lines) then you can make the program once and use it forever (at least until they redesign the datasheet).

data ABC ;

  infile "ABC.txt" dsd dlm='09'x firstobs=2 truncover lrecl=10000 ;

  input name1 name2 name3 ..... name34 ;

run;

Also

  • see response below from Reeza for how to find the code that PROC IMPORT generates. The code IMPORT generates is usually much more complicated than it needs to be. Plus it usually attaches formats and informats to character variables and other unwanted side effects.  But it should be a good starting point for creating a standard program for that file.
  • Just use generic variable names: input col1 - col34 ;
  • Another trick to stop IMPORT from changing its mind about the data types from month to month is to put in a dummy row in the spreadsheet with example data that SAS will interpret as the right data type. Then you set guessing rows to 1 (or is it 2?) and delete the extra row later in the program to eliminate the dummy data.
  • If you cannot read XLSX file then try saving it as XLS file instead.  But I am not sure that solves the problem of variables types changing when using PROC IMPORT.  That is caused by using EXCEL instead of an actual database system to store the data.  In excel you can put any type of data into any cell, but in a database (or a SAS dataset) the variables (columns) need to be of a single type.
BarryDeCicco
Obsidian | Level 7

My method has been to run Proc IMPORT, and then to copy the data step statements from the log into a program editor.

I then edit the statements to get what I want.  

 

Remember that for each variable, there will be one format statement, one informat statement a line in the input statement which might need to be changed.

BarryDeCicco
Obsidian | Level 7
This is what we've found to be neccessary.
Max06
Calcite | Level 5

Hi Jen,

I've had to find a way around this problem for importing from an excel file. So if you can open or paste the .txt into excel and use the "text to columns" option to break the columns out in excel, you can use the following strategy. You have to specify the type of all the variables you are worried might be imported incorrectly, but you won't have to specify the type of every variable you're importing.

LIBNAME mylibname EXCEL "mypath.xlsx";

data dataset_name;

set mylibname.excelrange (dbSasType=(myVar1=NUMERIC myVar2=NUMERIC myVar3=NUMERIC));

run;

LIBNAME mylibname CLEAR;

Basically this makes the excel workbook at "mypath.xlsx" a library (or folder), similar to your work library. It will include SAS datasets for each range in the excel file. By running the above code without running the last line (LIBNAME mylibname CLEAR), you can see the mylibname excel library in your explorer window and check the name of the range that you will want to import (called "excelrange" in the code above).

Hope this helps!

Max

jen123
Fluorite | Level 6

Hi Max,

The original file is .xlsx and I converted it to .txt or .csv because I thought EG 4.2 does not import .xlsx  files.

Unfortunately about 25 of the variables are of concern.

Jenne

ChrisRodrig
Fluorite | Level 6

This is the best answer here. The rest seemed to be worthless.

I am shocked how difficult it is to have more control over the importing of data into SAS

I really like this:

dbSasType=(myVar1=NUMERIC myVar2=NUMERIC myVar3=NUMERIC)

 

1) Unfortunately there is no similar option when using proc import?

2) I tried using LIBNAME mylibname EXCEL "mypath.xlsx"; instead of proc import to import my data but it only pulls in 255 variables. Is there a work around?

 

Thanks!

Chris

SASKiwi
PROC Star

Try the XLSX LIBNAME engine rather than SAS. If you want ultimate control of your input data don't use Excel use CSV instead.

 

It all goes back to the fact that Excel is deliberably designed so you cannot "lock" a column to be a particular data type like you can with databases. If Excel had this type of functionality then much of the rubbish you have to put up when importing would go away.

ChrisRodrig
Fluorite | Level 6

Thanks! I appreciate the response.

 

I tried:

LIBNAME xl XLSX 'Directory\File.csv' ;
proc datasets lib=xl; quit;
 libname xl CLEAR;

 

 

And am getting

ERROR: The XLSX engine cannot be found.
ERROR: Error in the LIBNAME statement.

 

I think the issue is I am using SAS 9.3? If I upgrade to SAS 9.4? does that look like syntax you were suggesting?

 

I checked and I do seem to have valid licenses for "SAS/ACCESS to PC Files"

SAS/ACCESS Interface to DB2
SAS/ACCESS Interface to Oracle
SAS/ACCESS Interface to Sybase
SAS/ACCESS Interface to PC Files
SAS/ACCESS Interface to ODBC
SAS/ACCESS Interface to OLE DB
SAS/ACCESS Interface to Teradata
SAS/ACCESS Interface to MySQL
SAS/ACCESS Interface to Netezza
SAS/ACCESS Interface to Aster nCluster
SAS/ACCESS Interface to Greenplum
SAS/ACCESS Interface to Sybase IQ

 

Thanks in advance.

Chris

SASKiwi
PROC Star

I think you are right. XLSX might require 9.4. BTW if you switch to using CSVs then you need to use PROC IMPORT. There is no CSV engine for the LIBNAME statement.

Reeza
Super User

The best solution is to list every variable and type, but do it once. Then you can be sure every month the file will be imported appropriately.

The easiest way is to use BASE SAS, if you have access, use proc import to import the .txt file, Then go to the log and copy the code. You may be able to do the same in EG, but I'm not sure.

Use that code as your starter and change the type for the fields that import incorrectly.  Then every month you can use that code instead of proc import.

kmw
SAS Employee kmw
SAS Employee

Editor’s Note: The Enterprise Guide(EG) Import Tool and PROC IMPORT use 2 different types of guessing algorithms behind the scenes when reading a delimited file. If you compare the 2 SAS logs you will see a difference in the code as well as the options on the INFILE statement. Therefore variables may not be typed the same way when using the EG Import Tool versus PROC IMPORT when using in EG.  

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 69412 views
  • 3 likes
  • 8 in conversation