Desktop productivity for business analysts and programmers

When importing a comma delimited file, how do I prevent SAS from getting confused when the value for a character variable contains a comma?

Accepted Solution Solved
Reply
Contributor
Posts: 50
Accepted Solution

When importing a comma delimited file, how do I prevent SAS from getting confused when the value for a character variable contains a comma?

For example, if my data set is:

"name" , " address" , "phone number"

"daffy duck" , "123 pizza lane, ohio" , "555-555-5555"

"ronald mcdonald" , "435 awesome street, michigan" , "555-555-5555"

When importing this data, SAS will assume that daffy duck and ronald mcdonald are names, 123 pizza lane and 435 awesome street are addresses, and that ohio and michigan are phone numbers.

All my data is enclosed in quotations.  Is there a way to get SAS to allow commas as long as they fall within the quotation marks?

Thanks for any help!

HERE'S MY CODING - (This is from the PROC IMPORT)

note - the problem occurs with address1

data WORK.HQI_HOSP    ;

    %let _EFIERR_ = 0; /* set the ERROR detection macro variable */

    infile "C:\Users\mconover\Desktop\COURSE WORK\718 Data\Hospital_flatfiles\Hospital_flatfiles\HQI_Hosp.csv" delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;

       informat Provider_Number  10. ;

       informat Hospital_Name $73. ;

       informat Address1 $51. ;

       informat Address2 $5. ;

       informat Address3 $32. ;

       informat City $4. ;

       informat State $7. ;

       informat ZIP_Code 22. ;

       informat County_Name $38. ;

       informat Phone_Number $38. ;

       informat Hospital_Type $45. ;

       informat Hospital_Ownership $15. ;

    informat Emergency_Service $25. ;

       format Provider_Number 10. ;

       format Hospital_Name $73. ;

       format Address1 $51. ;

       format Address2 $5. ;

       format Address3 $32. ;

       format City $4. ;

       format State $7. ;

       format ZIP_Code 22. ;

       format County_Name $38. ;

       format Phone_Number $38. ;

       format Hospital_Type $45. ;

       format Hospital_Ownership $15. ;

    format Emergency_Service $25. ;

  input @;

  _infile_=compress(compress(_infile_,"'"),'"');

  input

               Provider_Number

                Hospital_Name $

                Address1 $

                Address2 $

                Address3 $

                City $

                State $

                ZIP_Code

                County_Name $

                Phone_Number $

                Hospital_Type $

                Hospital_Ownership $

                 Emergency_Service $  

    ;

  run;


Accepted Solutions
Solution
‎08-27-2012 09:16 AM
Super User
Super User
Posts: 6,364

Re: When importing a comma delimited file, how do I prevent SAS from getting confused when the value for a character variable contains a comma?

Why not only compress out the single quote characters that are causing the trouble instead of removing the double quotes that are protecting the embedded commas?

_infile_=compress(_infile_,"'")


Or  if you have some single quotes inside strings then try just removing those that open and close strings.

_infile_=tranwrd(tranwrd(_infile_,"""'",'"'),"'""",'"');


View solution in original post


All Replies
Grand Advisor
Posts: 17,396

Re: When importing a comma delimited file, how do I prevent SAS from getting confused when the value for a character variable contains a comma?

Why are you removing the quotation marks with the compress statement?

DSD should allow it to read properly, with the quotations and the comma's, that is the ideal format for embedded delimiters.

Assuming you leave the quotation marks in.

Contributor
Posts: 50

Re: When importing a comma delimited file, how do I prevent SAS from getting confused when the value for a character variable contains a comma?

I have to remove quotation marks to make sure that the provider number, which has quotations within a quotation mark (e.g. " ' 10000 ' ") reads as a numeric variable rather than a character variable.  I will be merging with other data sets for which the provider number is a numeric variable without quotations.

Grand Advisor
Posts: 17,396

Re: When importing a comma delimited file, how do I prevent SAS from getting confused when the value for a character variable contains a comma?

Does everything else read properly besides, the phone number being a character string?

If so then I'd change that to a numeric afterwards in the same step.

Contributor
Posts: 50

Re: When importing a comma delimited file, how do I prevent SAS from getting confused when the value for a character variable contains a comma?

Maybe it will be easier if I rephrase my question.

Consider the sample data set with variables X, Y, and Z:

This is what my data looks like:

"X" , "Y" , "Z"

"A" , "3,B" , "C"

"B" , "2,D" , "D"

"C" , "4,F" , "B"

"A" , "4,B" , "C"

Now what I want to import is a table that looks like this

X     Y      Z

A     3,B   C

B     2,D   D

C     4,F   B

A     4,B   C

I don't need the commas in the table but I do need these double data points (3,B and 2,D etc...) to be contained within a single cell.  The problem is that it's assuming anything after the comma falls under the next column.

I'm sorry if I'm not making this very clear.

note: the quotations absolutely must be removed from the provider number in order to make it match the rest of my data (enabling a future merge) so excluding the compress statement is not an option.

Solution
‎08-27-2012 09:16 AM
Super User
Super User
Posts: 6,364

Re: When importing a comma delimited file, how do I prevent SAS from getting confused when the value for a character variable contains a comma?

Why not only compress out the single quote characters that are causing the trouble instead of removing the double quotes that are protecting the embedded commas?

_infile_=compress(_infile_,"'")


Or  if you have some single quotes inside strings then try just removing those that open and close strings.

_infile_=tranwrd(tranwrd(_infile_,"""'",'"'),"'""",'"');


Contributor
Posts: 50

Re: When importing a comma delimited file, how do I prevent SAS from getting confused when the value for a character variable contains a comma?

Thanks guys for all your help.  Tom I went with your suggestion since it seemed to be the most simple and easy to understand.  Thanks to Arthur for helping me a second time and thanks to Reeza for staying up with me trying to figure this out.  I can't tell you how helpful all this help is. 

Esteemed Advisor
Posts: 7,296

Re: When importing a comma delimited file, how do I prevent SAS from getting confused when the value for a character variable contains a comma?

@mconover: You should take a look at DN's comments to the question you initially raised yesterday.  Since provider_number contains leading zeros, not treating it as a number and using the $quote informat will likely be your best and easiest solution.

Regardless, I think the various responses you got should give you ideas that you can apply in the future.

New Contributor
Posts: 3

Re: When importing a comma delimited file, how do I prevent SAS from getting confused when the value for a character variable contains a comma?

I think a mixed of dsd and colon modifier along with LRECL will help you solve the problem.

Here is what I have written for you for the dataset you mentioned.

I think you can read any address line with a embeded comma with this code.

Please let me know if you have any questions.

data mydata;

infile "C:\mydata.csv"

        dlm = ","        /*Specifying the delimeter*/

        dsd                /*reading any missing value within line*/

        missover        /*Reading missing value at the end of the line*/

        LRECL= 32760    /*changing the default length to 32K*/

        firstobs=2         /*reading the data lines only*/

        ;

input    name :$30.         /*specifiying colon modifier for reading data with comma*/

        address :$30.

        phone :$30.;

run;

Contributor
Posts: 50

Re: When importing a comma delimited file, how do I prevent SAS from getting confused when the value for a character variable contains a comma?

hmmm   still didn't work  I really appreciate your help.  Anybody else have any ideas?  This has to be a common problem. 

data WORK.HQI_HOSP    ;

    %let _EFIERR_ = 0; /* set the ERROR detection macro variable */

    infile "C:\Users\mconover\Desktop\COURSE WORK\718 Data\Hospital_flatfiles\Hospital_flatfiles\HQI_Hosp.csv" DLMSTR = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;

       informat Provider_Number  10. ;

       informat Hospital_Name $73. ;

       informat Address1 comma51. ;

       informat Address2 $5. ;

       informat Address3 $32. ;

       informat City $4. ;

       informat State $7. ;

       informat ZIP_Code 22. ;

       informat County_Name $38. ;

       informat Phone_Number $38. ;

       informat Hospital_Type $45. ;

       informat Hospital_Ownership $15. ;

    informat Emergency_Service $25. ;

       format Provider_Number 10. ;

       format Hospital_Name $73. ;

       format Address1 $51. ;

       format Address2 $5. ;

       format Address3 $32. ;

       format City $4. ;

       format State $7. ;

       format ZIP_Code 22. ;

       format County_Name $38. ;

       format Phone_Number $38. ;

       format Hospital_Type $45. ;

       format Hospital_Ownership $15. ;

    format Emergency_Service $25. ;

  input @;

  _infile_=compress(compress(_infile_,"'"),'"');

  input

               Provider_Number

                Hospital_Name $

                Address1 $

                Address2 $

                Address3 $

                City $

                State $

                ZIP_Code

                County_Name $

                Phone_Number $

                Hospital_Type $

                Hospital_Ownership $

                 Emergency_Service $  

    ;

  run;

PROC PRINT DATA=work.HQI_Hosp;

RUN;

Super Contributor
Posts: 349

Re: When importing a comma delimited file, how do I prevent SAS from getting confused when the value for a character variable contains a comma?

Hi,

Just try this...

For external file you can just remove cards and external file path.

data have;

infile cards  dlm = ","  dsd;            

  informat x y z $10.;

  format x y z $10.;

input x y z;

cards;

"A" , "3,B" , "C"

"B" , "2,D" , "D"

"C" , "4,F" , "B"

"A" , "4,B" , "C"

;

run;

Thanks,

Shiva

Esteemed Advisor
Posts: 7,296

Re: When importing a comma delimited file, how do I prevent SAS from getting confused when the value for a character variable contains a comma?

Since this is still the problem you posted yesterday, but with one extra twist, I'd suggest the same approach as I did yesterday, but using a different function.  According to your input statement, the Provider_Number is always as the start of each record and it is the only field that includes single quotes.  As such, why not just rid of the quotes around that one field?  e.g.:

data WORK.HQI_HOSP  (drop=x y) ;

    %let _EFIERR_ = 0; /* set the ERROR detection macro variable */

    infile "C:\Users\mconover\Desktop\COURSE WORK\718 Data\Hospital_flatfiles\Hospital_flatfiles\HQI_Hosp.csv" delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;

    informat Provider_Number  10. ;

    informat Hospital_Name $73. ;

    informat Address1 $51. ;

    informat Address2 $5. ;

    informat Address3 $32. ;

    informat City $4. ;

    informat State $7. ;

    informat ZIP_Code 22. ;

    informat County_Name $38. ;

    informat Phone_Number $38. ;

    informat Hospital_Type $45. ;

    informat Hospital_Ownership $15. ;

    informat Emergency_Service $25. ;

    format Provider_Number 10. ;

    format Hospital_Name $73. ;

    format Address1 $51. ;

    format Address2 $5. ;

    format Address3 $32. ;

    format City $4. ;

    format State $7. ;

    format ZIP_Code 22. ;

    format County_Name $38. ;

    format Phone_Number $38. ;

    format Hospital_Type $45. ;

    format Hospital_Ownership $15. ;

    format Emergency_Service $25. ;

  input @;

    x=find(_infile_,"'",1);

    y=find(_infile_,"'",-255);

  _infile_=catt(substr(_infile_,3,y-3),substr(_infile_,y+2));

  input

                Provider_Number

                Hospital_Name $

                Address1 $

                Address2 $

                Address3 $

                City $

                State $

                ZIP_Code

                County_Name $

                Phone_Number $

                Hospital_Type $

                Hospital_Ownership $

                Emergency_Service $  

    ;

  run;

N/A
Posts: 1

Re: When importing a comma delimited file, how do I prevent SAS from getting confused when the value for a character variable contains a comma?

Hey u can use comma5 or comma10 to remove the comma between them and to put in single cell

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 10020 views
  • 0 likes
  • 7 in conversation