Help using Base SAS procedures

Importing CSV file

Reply
Super Contributor
Posts: 272

Importing CSV file

[ Edited ]

Dear,

 

Lase week I posted a question on importing CSV file. Some one posted  some suggestions. But it didnot help. I did some work and have some questions on it.

In the dataset, why it showing only first two variable names and rest as VAR3 to VAR13. The third variable name is shown under second varianle.

 

I am attaching the csv file that I am trying to download. Please help how to get the dataset with all variable names and without any errors

 

The code from log when I downloaded maually:

 

data WORK.dv ;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile 'C:\Users\rr\Desktop\PEC.csv' delimiter = ',' missover DSD lrecl=32767
firstobs=2 ;
informat Deviation__ best32.;
informat Screening $6. ;
informat VAR3 best32. ;
informat VAR4 $171. ;
informat VAR5 $9. ;
informat VAR6 DATE8. ;
informat VAR7 DATE9. ;
informat VAR8 $5. ;
informat VAR9 $32. ;
informat VAR10 $9. ;
informat VAR11 $2. ;
informat VAR12 $2. ;
informat VAR13 $4. ;
format Deviation__ best12. ;
format Screening $6. ;
format VAR3 best12. ;
format VAR4 $171. ;
format VAR5 $9. ;
format VAR6 DATE8. ;
format VAR7 DATE9. ;
format VAR8 $5. ;
format VAR9 $32. ;
format VAR10 $9. ;
format VAR11 $2. ;
format VAR12 $2. ;
format VAR13 $4. ;
input
Deviation__ $
Screening $
VAR3
VAR4 $
VAR5 $
VAR6
VAR7
VAR8 $
VAR9 $
VAR10 $
VAR11 $
VAR12 $
VAR13 $
;
if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
run;

 

I also tried this code, which resulted in error message "Import procedure not successful."

 

proc import datafile="C:\Users\nn\Desktop\pp.csv" DBMS=csv out=mm;

   guessingrows = 100;

run;

Super User
Posts: 17,823

Re: Importing CSV file

[ Edited ]

Attach a CSV file by changing the extension to txt. 

Many people won't open an XLSX file. 

Trusted Advisor
Posts: 1,375

Re: Importing CSV file

I assume you have no column labels in your excel/csv file columnns C to M,

or labels were non valid SAS names.

Super User
Super User
Posts: 6,500

Re: Importing CVS file

Why did you post an Excel workbook if your source is a CSV file?

 

Super Contributor
Posts: 272

Re: Importing CVS file

Sorry. I could not upload the CSV file.

Super User
Super User
Posts: 6,500

Re: Importing CVS file

[ Edited ]

You don't even need to upload a CSV file. Just copy some sample lines and paste them into the editor.  Open the file with WordPad or NotePad or other text editor tool.  Copy a the header row and a couple of data rows and paste them into a code block in your message. Like this:

Id,Var1,Var2
1234,3.4,Great
2345,5.6,Terrible

In reality just going through the process to attempt to do that will probably show you the answer to your original question. 

Super Contributor
Posts: 272

Re: Importing CVS file

Thank you

 

Data

 

Deviation #,"Screening
Number",Subject ID Number,Description of Deviation - Physical Examination,Study Visit Day,Date Deviation Occurred,Date PD Identified by CRA,Classification (Minor or Major),Category,Outcome,Reported to RA,Reported to IRB/IEC,Action required
1,Q-0000,1000,"During the screening on 1JUN11 the alcohol breath test was not performed, however the subject returned to the clinic to perform this assessment on the 10th of June 2006.",Screening,3-Jun-06,23-Jun-06,Minor,alcohol breath test,continued,NA,NA,None

Super User
Super User
Posts: 6,500

Re: Importing CVS file

Looks like your CSV file has a line break in the middle of the header for the second column.  Hence the confusion since it looks like two lines.  If you generated this on Windows then there is a chance that you can use the TERMSTR= option to get SAS to find the lines properly.

 

filename myfile  'myfilename.csv' termstr=crlf ;

proc import datafile=myfile ....  
Super User
Super User
Posts: 6,500

Re: Importing CVS file

[ Edited ]

Much easier to read a XLSX file than a CSV (as long as you are using a recent version of SAS).

But your column headers do not make very good variable names.

 

8     libname x xlsx "&path\PEC1.xlsx";
NOTE: Libref X was successfully assigned as follows:
      Engine:        XLSX
      Physical Name: C:\Users\XXXXXX\Downloads\PEC1.xlsx
9     proc copy inlib=x outlib=work; run;

NOTE: Copying X.PEC to WORK.PEC (memtype=DATA).
NOTE:    Variable Name Change.  Deoooooon # -> Deoooooon__
NOTE:    Variable Name Change.  Sooooning 
N00ber -> Sooooning___N00ber
NOTE:    Variable Name Change.  Sooooot ID                     N -> Sooooot_ID_____________________N
NOTE:    Variable Name Change.  Deoooooooon of Devooooon - Phooo -> Deoooooooon_of_Devooooon___Phooo
NOTE:    Variable Name Change.  Stooy Vioot Doy -> Stooy_Vioot_Doy
NOTE:    Variable Name Change.  Date Devioooon Ocooored -> Date_Devioooon_Ocooored
NOTE:    Variable Name Change.  Dooe PO Ideooooood by COA -> Dooe_PO_Ideooooood_by_COA
NOTE:    Variable Name Change.  Clasooooooooon (Mooor or Mooor) -> Clasooooooooon__Mooor_or_Mooor_
NOTE:    Variable Name Change.  Rooorted to RA -> Rooorted_to_RA
NOTE:    Variable Name Change.  Rooorted to IRB/IEC -> Rooorted_to_IRB_IEC
NOTE:    Variable Name Change.  Aoooon roooired -> Aoooon_roooired
NOTE: BUFSIZE is not cloned when copying across different engines. System Option for BUFSIZE was used.
NOTE: The import data set has 1 observations and 13 variables.
NOTE: There were 1 observations read from the data set X.PEC.
NOTE: The data set WORK.PEC has 1 observations and 13 variables.
NOTE: PROCEDURE COPY used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds


10    proc print; run;

WARNING: Data too long for column "Deoooooooon_of_Devooooon___Phooo"; truncated to 103 characters to fit.
NOTE: There were 1 observations read from the data set WORK.PEC.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
Super User
Posts: 10,500

Re: Importing CSV file

There are 2 main reasons you would get VARx for variables when using proc import for a csv and all are related to the content of the column heading.

1) Missing  if your csv header row has any consecutive commas not enclosed in quotes then there is nothing SAS sees to assign the variable and basically uses VAR plus the column number.  I suspect that your header looked like:

Deviation  ,Screening,,,,,,,

if you open the CSV in Wordpad.

2) The text of the column heading is identical to another column and exceeds 32 characters. Something

"CLIENT_PERSONAL_LANGUAGE_0_FIRST_INTAKE","CLIENT_PERSONAL_LANGUAGE_0_FIRST_UPDATE"

The underscore after FIRST is at position 33 so the first 32 characters are the same. Since SAS has a current limit of 32 characters for a variable name the first one gets truncated to 32 characters. The second gets VARx named as SAS doesn't attempt to guess how to truncate an supply suffix.

Note that if you have columns with the same name in your CSV such as INCOME you can end up with INCOME, INCOME2, INCOME3 because SAS has room to add the counter.

 

You may also have some interesting side effects if your column headings have many characters that are not allowed in SAS variable names as they will all be replaced by underscores.

 

One of the major advantages of writing, or modifying Proc Import generated code, is that YOU can assign meaningful or at least short variable names and add LABEL statements to provide more text for documenting what that variable means.

I routinely deal with exported data that has column headings like "When is this information being collected" and assign a variable name like CollectionDate (if a date variable) or CollectionOccasion if something like Intake or Update values.

 

 

Ask a Question
Discussion stats
  • 9 replies
  • 423 views
  • 2 likes
  • 5 in conversation