DATA Step, Macro, Functions and more

Ignoring comma within quotes - Reading CSV

Accepted Solution Solved
Reply
Contributor
Posts: 30
Accepted Solution

Ignoring comma within quotes - Reading CSV

I am trying to read a CSV file with commas within quotes, but my code is not reading it correctly. One of the articles suggested that if I removed the DLM="," option, it should work but it didn't do the trick.

 

Below is the my log (includes code and line with error). I have bolded the log to highlight the characters and fields where the error is occuring. I'd appreciate your help with this.

 

14683  data ZipCode_to_City;
14684      %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
14685      
SYMBOLGEN:  Macro variable BASE_DIR_PATH resolves to /export/appl/ao1_data1_r/MAPD
SYMBOLGEN:  Macro variable BOOKING_MONTH resolves to Dec2015
14685!  infile "&base_dir_path./Input/&booking_month./MapBook/zip_code_database.csv" dsd missover firstobs=2;
14686      input
14687          zipcpde : $5.
14688          type : $10.
14689          City : $30.
14690          acceptable_cities : $100.
14691          unacceptable_cities : $100.
14692          state : $2.
14693          county : $30.
14694          timezone : $30.
14695          area_codes : $30.
14696          latitude : comma.
14697          longitude : comma.
14698          world_region : $20.
14699          country_code : $2.
14700          decommissioned : comma.
14701          estimated_population : comma.
14702          notes : $50.
14703          dummy :$1.
14704          ;
14705      if _ERROR_ then call symputx('_EFIERR_',1);  /* set ERROR detection macro variable */
14706      City = propcase( strip(City) );
14707      drop type acceptable_cities    unacceptable_cities    state county_code timezone area_codes
14708          latitude longitude world_region country decommissioned estimated_population    notes dummy;
14709  run;


WARNING: The variable county_code in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable country in the DROP, KEEP, or RENAME list has never been referenced.
NOTE: The infile "/export/appl/ao1_data1_r/MAPD/Input/Dec2015/MapBook/zip_code_database.csv" is:
      Filename=/export/appl/ao1_data1_r/MAPD/Input/Dec2015/MapBook/zip_code_database.csv,
      Owner Name=gauasr,Group Name=cchome091,
      Access Permission=rw-rw----,
      Last Modified=Wed Feb  3 13:17:50 2016,
      File Size (bytes)=4568636

NOTE: Invalid data for latitude in line 5 99-111.
NOTE: Invalid data for longitude in line 5 113-126.
NOTE: Invalid data for decommissioned in line 5 160-177.
NOTE: Invalid data for estimated_population in line 5 179-194.
RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
5         00603,STANDARD,Aguadilla,Ramey,"Bda Caban, Bda Esteves, Bo Borinquen, Bo Ceiba Baja, Ext El Prado, E
     101  xt Marbella, Repto Jimenez, Repto Juan Aguiar, Repto Lopez, Repto Tres Palmas, Sect Las Villas, Urb
     201  Borinquen, Urb El Prado, Urb Esteves, Urb Garcia, Urb La
zipcpde=00603 type=STANDARD City=Aguadilla acceptable_cities=Ramey unacceptable_cities="Bda Caban state=Bd county=Bo Borinquen
timezone=Bo Ceiba Baja area_codes=Ext El Prado latitude=. longitude=. world_region=Repto Juan Aguiar country_code=Re
decommissioned=. estimated_population=. notes=Urb Borinquen dummy=U _ERROR_=1 _N_=4


Accepted Solutions
Solution
‎02-03-2016 02:11 PM
Super User
Posts: 10,500

Re: Ignoring comma within quotes - Reading CSV

It looks like your issue may be reading only part of the line and not finding the ending quote begining at "Bda Caban, Bda Esteves

 

Possibly you need to add  LRECL=10000 (or some other number large enough to hold the longest line in your data) to your INFILE statement to read the entire line.

View solution in original post


All Replies
Solution
‎02-03-2016 02:11 PM
Super User
Posts: 10,500

Re: Ignoring comma within quotes - Reading CSV

It looks like your issue may be reading only part of the line and not finding the ending quote begining at "Bda Caban, Bda Esteves

 

Possibly you need to add  LRECL=10000 (or some other number large enough to hold the longest line in your data) to your INFILE statement to read the entire line.

Contributor
Posts: 30

Re: Ignoring comma within quotes - Reading CSV

Ballardw,

 

You are right. The LRECL option fixed it. Thank you very much!

Super Contributor
Posts: 490

Re: Ignoring comma within quotes - Reading CSV

[ Edited ]

 



WARNING: The variable county_code in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable country in the DROP, KEEP, or RENAME list has never been referenced.

 

You typed your variables names wrong in the drop statment. They are country code and county not as you added. So you are getting two warnnings for that.

 

Regarding adding or deleting the option DELIMITER=',' . You do not have to add it if you used the option DSD as it by default deal with comma as delimiter. And it handle the comma within qoutes also.

 

 

Can you post the complete observation mentioned in the log 

 



5         00603,STANDARD,Aguadilla,Ramey,"Bda Caban, Bda Esteves, Bo Borinquen, Bo Ceiba Baja, Ext El Prado, E

 

 

Contributor
Posts: 30

Re: Ignoring comma within quotes - Reading CSV

mohamed_zaki,

 

Thank you for pointing out the warnings! Smiley Happy I fixed them.

 

I'll make anote about the use of DSD with Delimiter option. As I replied earlier, LRECL option fixed it, as I was not reading the entire line.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 303 views
  • 1 like
  • 3 in conversation