04-03-2017 08:23 PM
When I try to import data from my comma delimited .txt file, the variable length is truncated. Can someone tell me why that happen? I am using the following code. Is there any better code other than PROC IMPORT? Thanks.
PROC IMPORT DATAFILE="C:\Have.txt" OUT=Want DBMS=csv replace; GETNAMES=YES; RUN;
04-03-2017 08:39 PM - edited 04-04-2017 10:49 AM
You don't really describe your issue, but perhaps the most likely is that PROC IMPORT didn't look at enought observations to make a good guess at how long the character fields should be.
You could try adding a GUESSINGROWS statement.
PROC IMPORT DATAFILE="C:\Have.txt" OUT=Want replace DBMS=dlm ; delimiter=','; GETNAMES=YES; guessingrows=32000; RUN;
But to really read a CSV file just write the data step yourself. Since it is your data you probably had a better idea of how you want to read it than SAS can guess. You could just make up dummy variable names.
data want ; infile 'have.txt' dsd truncover firstobs=2 ; length var1-var10 $200 ; input var1-var10; run;
Or you could copy the first row of the file and use it to make the variable names. You can set the type and length in a LENGTH statement. Attach any informats that are needed to read the data from the CSV file. Attach any formats that are useful. Avoid using INFORMAT or FORMAT statement as a substitute for defing your variables explicitly with a LENGTH or ATTRIB statement. Avoid attaching $xx formats/informats to character variables as they do not add value and can cause trouble when combining datasets.
Then the INPUT statement is trivial.
data want ; infile 'have.txt' dsd truncover firstobs=2 ; length id $20 date 8 name $50 ; informat date mmddyy.; format date yymmdd10.; input id -- name ; run;
04-03-2017 09:15 PM
/* T1003490 Using SAS passthru to excel to determine type and length before importing If you import the csv to excel you can query the columns for type and length this post https://goo.gl/Ap0A0p https://communities.sas.com/t5/Base-SAS-Programming/Import-from-comma-delimited-txt-file-variable-length-truncated/m-p/346869 Limitations Max number of columns 255 (but you can select a subset) Max character length 255 HAVE d:/xls/class.xlsx SASHELP.CLASS ======================================== d:/xls/class.xlsx (19 obs) +------+-----------+------+------+ | | A | B | C | +------+-----------+------+------+ | | | | | | 1 | NAME | AGE| SEX | | 2 | Alfred | 14 | M | | 3 | Alice | 13 | F | | 4 | Barbara | 13 | F | | 5 | Carol | 14 | F | | 6 | Henry | 14 | M | | ... | ... | ... | ... | +------------------+------+------+ [CLASS} or [sheet1] untested (but I have used it) WANT (create sas dataset(work.class) with correct type and length) =================================================================== TYPE All character All character No Character name_ sex_ age_ character character character ---------------------------------------- 19 19 0 LENGTH name_length sex_length age_length ------------------------------------- 7 1 2 work.class # Variable Type Len 1 NAME Char 7 2 SEX Char 1 3 AGE Num 8 WORKING CODE ============ create table class as select name length=7 ,sex length=1 ,age SOLUTION ======== * _ _ _ _ __ ___ ___| |_ __ _ __| | __ _| |_ __ _ | '_ ` _ \ / _ \ __/ _` |_____ / _` |/ _` | __/ _` | | | | | | | __/ || (_| |_____| (_| | (_| | || (_| | |_| |_| |_|\___|\__\__,_| \__,_|\__,_|\__\__,_| ; /* HOW MANY COLUMNS ARE CHARACTER */ /* HOW MANY COLUMNS HAVE AT LEAST ONE CHARACTER CELL */ proc sql dquote=ansi; connect to excel (Path="d:\xls\class.xlsx"); select * from connection to Excel ( Select count(*) + sum(isnumeric(name)) as name_character ,count(*) + sum(isnumeric(sex)) as sex_character ,count(*) + sum(isnumeric(age)) as age_character from class ); disconnect from Excel; quit; /* Since there are 19 obs name and sex are charcter and age is numeric Up to 40 obs from xel.class total obs=19 name_ sex_ age_ character character character ---------------------------------------- 19 19 0 */ /* LETS GET THE MAX LENGTH OF EACH VARIABLE FIRST */ proc sql dquote=ansi; connect to excel (Path="d:\xls\class.xlsx"); select * from connection to Excel ( Select max(len(name)) as name_length ,max(len(sex)) as sex_length ,max(len(age)) as age_length from class ); disconnect from Excel; quit; /* Up to 40 obs from xel.class total obs=&tob name_length sex_length age_length ------------------------------------- 7 1 2 */ * _ _ _ __ __ _ ___ ___| |_| |__ _ __ _ _ | '_ \ / _` / __/ __| __| '_ \| '__| | | | | |_) | (_| \__ \__ \ |_| | | | | | |_| | | .__/ \__,_|___/___/\__|_| |_|_| \__,_| |_| ; proc sql dquote=ansi; connect to excel (Path="d:\xls\class.xlsx"); create table class as select name length=7 ,sex length=1 ,age from connection to Excel ( Select * from class ); disconnect from Excel; Quit; * _ _ __ ___ ___ ___ __ _| | | '_ ` _ \/ __|_____/ __|/ _` | | | | | | | \__ \_____\__ \ (_| | | |_| |_| |_|___/ |___/\__, |_| |_| ; https://ss64.com/access/ a Abs The absolute value of a number (ignore negative sign). .AddMenu Add a custom menu bar/shortcut bar. .AddNew Add a new record to a recordset. .ApplyFilter Apply a filter clause to a table, form, or report. Array Create an Array. Asc The Ascii code of a character. AscW The Unicode of a character. Atn Display the ArcTan of an angle. Avg (SQL) Average. b .Beep (DoCmd) Sound a tone. .BrowseTo(DoCmd) Navigate between objects. c Call Call a procedure. .CancelEvent (DoCmd) Cancel an event. .CancelUpdate Cancel recordset changes. Case If Then Else. CBool Convert to boolean. CByte Convert to byte. CCur Convert to currency (number) CDate Convert to Date. CVDate Convert to Date. CDbl Convert to Double (number) CDec Convert to Decimal (number) Choose Return a value from a list based on position. ChDir Change the current directory or folder. ChDrive Change the current drive. Chr Return a character based on an ASCII code. .ClearMacroError (DoCmd) Clear MacroError. .Close (DoCmd) Close a form/report/window. .CloseDatabase (DoCmd) Close the database. CInt Convert to Integer (number) CLng Convert to Long (number) Command Return command line option string. .CopyDatabaseFile(DoCmd) Copy to an SQL .mdf file. .CopyObject (DoCmd) Copy an Access database object. Cos Display Cosine of an angle. Count (SQL) Count records. CSng Convert to Single (number.) CStr Convert to String. CurDir Return the current path. CurrentDb Return an object variable for the current database. CurrentUser Return the current user. CVar Convert to a Variant. d Date The current date. DateAdd Add a time interval to a date. DateDiff The time difference between two dates. DatePart Return part of a given date. DateSerial Return a date given a year, month, and day. DateValue Convert a string to a date. DAvg Average from a set of records. Day Return the day of the month. DCount Count the number of records in a table/query. Delete (SQL) Delete records. .DeleteObject (DoCmd) Delete an object. DeleteSetting Delete a value from the users registry .DoMenuItem (DoCmd) Display a menu or toolbar command. DFirst The first value from a set of records. Dir List the files in a folder. DLast The last value from a set of records. DLookup Get the value of a particular field. DMax Return the maximum value from a set of records. DMin Return the minimum value from a set of records. DoEvents Allow the operating system to process other events. DStDev Estimate Standard deviation for domain (subset of records) DStDevP Estimate Standard deviation for population (subset of records) DSum Return the sum of values from a set of records. DVar Estimate variance for domain (subset of records) DVarP Estimate variance for population (subset of records) e .Echo Turn screen updating on or off. Environ Return the value of an OS environment variable. EOF End of file input. Error Return the error message for an error No. Eval Evaluate an expression. Execute(SQL/VBA) Execute a procedure or run SQL. Exp Exponential e raised to the nth power. f FileDateTime Filename last modified date/time. FileLen The size of a file in bytes. .FindFirst/Last/Next/Previous Record. .FindRecord(DoCmd) Find a specific record. First (SQL) Return the first value from a query. Fix Return the integer portion of a number. For Loop. Format Format a Number/Date/Time. FreeFile The next file No. available to open. From Specify the table(s) to be used in an SQL query. FV Future Value of an annuity. g GetAllSettings List the settings saved in the registry. GetAttr Get file/folder attributes. GetObject Return a reference to an ActiveX object GetSetting Retrieve a value from the users registry. form.GoToPage Move to a page on specific form. .GoToRecord (DoCmd)Move to a specific record in a dataset. h Hex Convert a number to Hex. Hour Return the hour of the day. .Hourglass (DoCmd) Display the hourglass icon. HyperlinkPart Return information about data stored as a hyperlink. i If Then Else If-Then-Else IIf If-Then-Else function. Input Return characters from a file. InputBox Prompt for user input. Insert (SQL) Add records to a table (append query). InStr Return the position of one string within another. InstrRev Return the position of one string within another. Int Return the integer portion of a number. IPmt Interest payment for an annuity IsArray Test if an expression is an array IsDate Test if an expression is a date. IsEmpty Test if an expression is Empty (unassigned). IsError Test if an expression is returning an error. IsMissing Test if a missing expression. IsNull Test for a NULL expression or Zero Length string. IsNumeric Test for a valid Number. IsObject Test if an expression is an Object. L Last (SQL) Return the last value from a query. LBound Return the smallest subscript from an array. LCase Convert a string to lower-case. Left Extract a substring from a string. Len Return the length of a string. LoadPicture Load a picture into an ActiveX control. Loc The current position within an open file. .LockNavigationPane(DoCmd) Lock the Navigation Pane. LOF The length of a file opened with Open() Log Return the natural logarithm of a number. LTrim Remove leading spaces from a string. m Max (SQL) Return the maximum value from a query. .Maximize (DoCmd) Enlarge the active window. Mid Extract a substring from a string. Min (SQL) Return the minimum value from a query. .Minimize (DoCmd) Minimise a window. Minute Return the minute of the hour. MkDir Create directory. Month Return the month for a given date. MonthName Return a string representing the month. .Move Move through a Recordset. .MoveFirst/Last/Next/Previous Record .MoveSize (DoCmd) Move or Resize a Window. MsgBox Display a message in a dialogue box. n Next Continue a for loop. Now Return the current date and time. Nz Detect a NULL value or a Zero Length string. o Oct Convert an integer to Octal. OnClick, OnOpen Events. .OpenForm (DoCmd) Open a form. .OpenQuery (DoCmd) Open a query. .OpenRecordset Create a new Recordset. .OpenReport (DoCmd) Open a report. .OutputTo (DoCmd) Export to a Text/CSV/Spreadsheet file. p Partition (SQL) Locate a number within a range. .PrintOut (DoCmd) Print the active object (form/report etc.) q Quit Quit Microsoft Access r .RefreshRecord (DoCmd) Refresh the data in a form. .Rename (DoCmd) Rename an object. .RepaintObject (DoCmd) Complete any pending screen updates. Replace Replace a sequence of characters in a string. .Requery Requery the data in a form or a control. .Restore (DoCmd) Restore a maximized or minimized window. RGB Convert an RGB color to a number. Right Extract a substring from a string. Rnd Generate a random number. Round Round a number to n decimal places. RTrim Remove trailing spaces from a string. .RunCommand Run an Access menu or toolbar command. .RunDataMacro (DoCmd) Run a named data macro. .RunMacro (DoCmd) Run a macro. .RunSavedImportExport (DoCmd) Run a saved import or export specification. .RunSQL (DoCmd) Run an SQL query. s .Save (DoCmd) Save a database object. SaveSetting Store a value in the users registry .SearchForRecord(DoCmd) Search for a specific record. Second Return the seconds of the minute. Seek The position within a file opened with Open. Select (SQL) Retrieve data from one or more tables or queries. Select Into (SQL) Make-table query. Select-Subquery (SQL) SubQuery. .SelectObject (DoCmd) Select a specific database object. .SendObject (DoCmd) Send an email with a database object attached. SendKeys Send keystrokes to the active window. SetAttr Set the attributes of a file. .SetDisplayedCategories (DoCmd) Change Navigation Pane display options. .SetFilter (DoCmd) Apply a filter to the records being displayed. SetFocus Move focus to a specified field or control. .SetMenuItem (DoCmd) Set the state of menubar items (enabled /checked) .SetOrderBy (DoCmd) Apply a sort to the active datasheet, form or report. .SetParameter (DoCmd) Set a parameter before opening a Form or Report. .SetWarnings (DoCmd) Turn system messages on or off. Sgn Return the sign of a number. .ShowAllRecords(DoCmd) Remove any applied filter. .ShowToolbar (DoCmd) Display or hide a custom toolbar. Shell Run an executable program. Sin Display Sine of an angle. SLN Straight Line Depreciation. Space Return a number of spaces. Sqr Return the square root of a number. StDev (SQL) Estimate the standard deviation for a population. Str Return a string representation of a number. StrComp Compare two strings. StrConv Convert a string to Upper/lower case or Unicode. String Repeat a character n times. Sum (SQL) Add up the values in a query result set. Switch Return one of several values. SysCmd Display a progress meter. t Tan Display Tangent of an angle. Time Return the current system time. Timer Return a number (single) of seconds since midnight. TimeSerial Return a time given an hour, minute, and second. TimeValue Convert a string to a Time. .TransferDatabase (DoCmd) Import or export data to/from another database. .TransferSharePointList(DoCmd) Import or link data from a SharePoint Foundation site. .TransferSpreadsheet (DoCmd) Import or export data to/from a spreadsheet file. .TransferSQLDatabase (DoCmd) Copy an entire SQL Server database. .TransferText (DoCmd) Import or export data to/from a text file. Transform (SQL) Create a crosstab query. Trim Remove leading and trailing spaces from a string. TypeName Return the data type of a variable. u UBound Return the largest subscript from an array. UCase Convert a string to upper-case. Undo Undo the last data edit. Union (SQL) Combine the results of two SQL queries. Update (SQL) Update existing field values in a table. .Update Save a recordset. v Val Extract a numeric value from a string. Var (SQL) Estimate variance for sample (all records) VarP (SQL) Estimate variance for population (all records) VarType Return a number indicating the data type of a variable. w Weekday Return the weekday (1-7) from a date. WeekdayName Return the day of the week. y Year Return the year for a given date.
04-04-2017 10:44 AM
Note that Proc Import for CSV will generate data step code to read the file. You can use that a basis for a custom read solution as @Tom suggests. Also you can see by the assigned informat the length of a variable and adjust as desired. Just copy the code from the log and paste into the editor. Change as desired.
If you will be reading multiple files with the same layout then you only have to change the INFILE statement and the Data set name to have multiple datasets that have the same characteristics.
I recommend adjusting the informats to lengths that are reasonable for maximum expected text if reading multiple files as you may have the longest value of a last name in the first CSV file of 10 characters but then the next is 18. Proc Import would guess with each one resulting in different lengths. Also a varible that should be numeric but doesn't have any values for a specific input file will end up as character not numeric. Which leads to another set of problems with combining data later.
You also have an opportunity to use custom informats to normalize data or check on correct data entry.