BookmarkSubscribeRSS Feed
b0guna01
Calcite | Level 5

Hello,

I am trying to import a csv file into SAS, but my code is not working properly. The log window shows the data set was successfully created, but it was not.  The data set has 124 variables. The log window shows it has only 13 variables. Could you help me to import the data?  Here is the log window message. 


271 LIBNAME P_RAKEE "D:\Rakee\Rakee";
NOTE: Libref P_RAKEE was successfully assigned as follows:
Engine: V9
Physical Name: D:\Rakee\Rakee
272 PROC IMPORT OUT=DICAID_2013_1 DATAFILE="D:\Rakee\E-UL-211606-Data
272! Request_2013.csv"
273 DBMS=CSV REPLACE;
274 GETNAMES=YES;
275 guessingrows=max;
276 RUN;

Number of names found is less than number of variables found.
Name
CLAIM_GUID NUM_DTL DTE_PTN DTE_FIRST_SVC DTE_LAST_SVC DTE_ADMISSION DTE_DISCHARGE BEN_ID CE12 CE
12_HE CDE_COUNTY ZIP_CODE ZIP_CODE_3 RUC_CODE ADD_REGION DTE_BIRTH DTE_DEATH AGE_DFS AGE_MONTHS_
DFS AGE_
truncated to CLAIM_GUID NUM_DTL DTE_PTN DTE_F.
Name
CLAIM_GUID NUM_DTL DTE_PTN DTE_FIRST_SVC DTE_LAST_SVC DTE_ADMISSION DTE_DISCHARGE BEN_ID CE12 CE
12_HE CDE_COUNTY ZIP_CODE ZIP_CODE_3 RUC_CODE ADD_REGION DTE_BIRTH DTE_DEATH AGE_DFS AGE_MONTHS_
DFS AGE_
is not a valid SAS name.
Problems were detected with provided names. See LOG.
277 /**********************************************************************
278 * PRODUCT: SAS
279 * VERSION: 9.4
280 * CREATOR: External File Interface
281 * DATE: 14MAY21
282 * DESC: Generated SAS Datastep Code
283 * TEMPLATE SOURCE: (None Specified.)
284 ***********************************************************************/
285 data WORK.DICAID_2013_1 ;
286 %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
287 infile 'D:\Rakee\E-UL-211606-Data Request_2013.csv' delimiter = ','
287! MISSOVER DSD lrecl=32767 firstobs=2 ;
288 informat VAR1 $1485. ;
289 informat VAR2 $1106. ;
290 informat VAR3 $1113. ;
291 informat VAR4 $1067. ;
292 informat VAR5 $1037. ;
293 informat VAR6 $693. ;
294 informat VAR7 $786. ;
295 informat VAR8 $553. ;
296 informat VAR9 $337. ;
297 informat VAR10 $217. ;
298 informat VAR11 $103. ;
299 informat VAR12 $142. ;
300 informat VAR13 $158. ;
301 format VAR1 $1485. ;
302 format VAR2 $1106. ;
303 format VAR3 $1113. ;
304 format VAR4 $1067. ;
305 format VAR5 $1037. ;
306 format VAR6 $693. ;
307 format VAR7 $786. ;
308 format VAR8 $553. ;
309 format VAR9 $337. ;
310 format VAR10 $217. ;
311 format VAR11 $103. ;
312 format VAR12 $142. ;
313 format VAR13 $158. ;
314 input
315 VAR1 $
316 VAR2 $
317 VAR3 $
318 VAR4 $
319 VAR5 $
320 VAR6 $
321 VAR7 $
322 VAR8 $
323 VAR9 $
324 VAR10 $
325 VAR11 $
326 VAR12 $
327 VAR13 $
328 ;
329 if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
330 run;

NOTE: The infile 'D:\Rakee\E-UL-211606-Data Request_2013.csv' is:
Filename=D:\Rakee\E-UL-211606-Data Request_2013.csv,
RECFM=V,LRECL=32767,
File Size (bytes)=4975718400,
Last Modified=29Apr2021:12:06:18,
Create Time=29Apr2021:12:35:46

NOTE: 5522752 records were read from the infile 'D:\Rakee\E-UL-211606-Data
Request_2013.csv'.
The minimum record length was 330.
The maximum record length was 1634.
NOTE: The data set WORK.DICAID_2013_1 has 5522752 observations and 13 variables.
NOTE: DATA statement used (Total process time):
real time 53:32.57
cpu time 34.87 seconds


5522752 rows created in WORK.DICAID_2013_1 from D:\Rakee\E-UL-211606-Data
Request_2013.csv.

 

NOTE: WORK.DICAID_2013_1 data set was successfully created.
NOTE: The data set WORK.DICAID_2013_1 has 5522752 observations and 13 variables.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 1:54:10.89
cpu time 58:44.54

 

2 REPLIES 2
ballardw
Super User

Copy the first 5 lines from you CSV file when it has been opened in a text editor (extremely important) then on the forum open a text box by clicking on the </> icon that appears above the message window and paste the text.

The text box is important as the message windows will reformat text on this forum and change the content so it could be impossible to diagnose anything.

 

Warning: If you have opened that file in Excel and saved it from Excel then you may have already changed the contents.

 

This bit of your log (which should be pasted into a text box)

Number of names found is less than number of variables found.
Name
CLAIM_GUID NUM_DTL DTE_PTN DTE_FIRST_SVC DTE_LAST_SVC DTE_ADMISSION DTE_DISCHARGE BEN_ID CE12 CE
12_HE CDE_COUNTY ZIP_CODE ZIP_CODE_3 RUC_CODE ADD_REGION DTE_BIRTH DTE_DEATH AGE_DFS AGE_MONTHS_
DFS AGE_
truncated to CLAIM_GUID NUM_DTL DTE_PTN DTE_F.

Tells me that the first row of your data file does not have the column headings separated by commas.

Two reasons for that conclusion "number of names found is less than the number of variables". That means the body of the data had more items separated by commas than the header row that would be used to create variable names and that the text was truncated to one string 32 characters long that SAS may have attempted to use as a variable.

 

You also have:

288 informat VAR1 $1485. ;

which means SAS thinks that the first variable should be 1,485 characters long.

 

Possible causes of some of this kind of behavior.

1) the file is not indeed actually separated by commas

2) you have unbalanced quotes in the data so that many of the commas are considered parts of text

 

Which is why you want to copy example data from the text file so we can see what exactly is going on.

 

Once upon a time CSV actually meant "comma separated values". However there has been a tendency for some sources to use it as "character separated values" and you have to actually open the files and verify what character they are using. I have seen "CSV" files delimited with semicolons, pipe character |, TAB characters (which take a bit of sleuthing to find sometimes).

 

Tom
Super User Tom
Super User

Are you sure your file is a comma delimited file like you told PROC IMPORT it was?

Take a look at the file with a text editor.

Or just use a simple data step to look at the file.  For example this data step will read the first two lines and dump them to the LOG.  If any of the characters on the lines are un-printable then it will also show the hexadecimal code for all of the characters in that part of the line.

data _null_;
  infile "D:\Rakee\E-UL-211606-DataRequest_2013.csv" obs=2;
  input;
  list;
run;

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 354 views
  • 0 likes
  • 3 in conversation