Proc import text file with double quoted missing data

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 15
Accepted Solution

Proc import text file with double quoted missing data

Hi,

I used PROC IMPORT to import a comma delimited text file. In the text file, there are some fields with missing values. For these values, they are double quoted as the character values. However because the missing value has no space in between, I can not import the data properly. For example, the data is as following:

19,99,15,"","ABC",3,"008038"

I have to make the space for the missing value to " " or take out double quotes from the data to import data properly.

PROC IMPORT DATAFILE = "C:\test.txt" OUT = mydata dbms=dlm REPLACE; DELIMITER = ","; GETNAMES = no; GUESSINGROWS = 100; RUN;

Does anyone know how to revise above script to solve the issue?

Thanks!

Regards,

Weizhong


Accepted Solutions
Solution
‎06-05-2013 06:36 AM
Super User
Posts: 5,426

Re: Proc import text file with double quoted missing data

Posted in reply to weizhongma

This is a known issue, and unfortunately there is no fix, just an ugly work-around...

45235 - PROC IMPORT, the Import Wizard, and EFI do not import all columns when a missing value is re...

Data never sleeps

View solution in original post


All Replies
Solution
‎06-05-2013 06:36 AM
Super User
Posts: 5,426

Re: Proc import text file with double quoted missing data

Posted in reply to weizhongma

This is a known issue, and unfortunately there is no fix, just an ugly work-around...

45235 - PROC IMPORT, the Import Wizard, and EFI do not import all columns when a missing value is re...

Data never sleeps
Super Contributor
Posts: 644

Re: Proc import text file with double quoted missing data

try (replace 9999 with a number larger than the length of any row in your text file)

Data _Null_ ;

  Infile "C:\test.txt" LRECL = 9999 ;

  Input ;

  NewRec = Transtrn(_Infile_, '"",', ',') ;

  File "C:\test2.txt" LRECL = 9999 ;

  Put NewRec ;

Run ;

PROC IMPORT DATAFILE = "C:\test2.txt" OUT = mydata dbms=dlm REPLACE; DELIMITER = ","; GETNAMES = no; GUESSINGROWS = 100; RUN;

be aware that this solution will have problems with any quoted values enclosed in quotes (inner quotes will be doubled, as in """Testing""", which should be read as the value "Testing".  This can be avoided by using transtrn to change the value of """, to something else and then changing it back before writing to test2.

Richard

Occasional Contributor
Posts: 15

Re: Proc import text file with double quoted missing data

Posted in reply to RichardinOz

Hi Richard,

It works! Thanks a lot for your help!

Regards,

Weizhong

Super User
Super User
Posts: 7,039

Re: Proc import text file with double quoted missing data

Posted in reply to weizhongma

So just don't use the PROC IMPORT wizard.

You will need to make your own decisions on variable type and length. Looks like you need to assign variable names anyway since you are using GETNAMES = no.

data mydata ;

  infile 'c:\test.txt' dsd truncover;

  length var1-var3 8 var4 $10 var5 $20 var6 8 var7 $10;

  input var1--var7 ;

run;


If the columns in the input file don't change then you can define the program once and run it as many times as you need on new data files.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 1803 views
  • 5 likes
  • 4 in conversation