I am trying to read a CSV file with commas within quotes, but my code is not reading it correctly. One of the articles suggested that if I removed the DLM="," option, it should work but it didn't do the trick.
Below is the my log (includes code and line with error). I have bolded the log to highlight the characters and fields where the error is occuring. I'd appreciate your help with this.
14683 data ZipCode_to_City;
14684 %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
14685
SYMBOLGEN: Macro variable BASE_DIR_PATH resolves to /export/appl/ao1_data1_r/MAPD
SYMBOLGEN: Macro variable BOOKING_MONTH resolves to Dec2015
14685! infile "&base_dir_path./Input/&booking_month./MapBook/zip_code_database.csv" dsd missover firstobs=2;
14686 input
14687 zipcpde : $5.
14688 type : $10.
14689 City : $30.
14690 acceptable_cities : $100.
14691 unacceptable_cities : $100.
14692 state : $2.
14693 county : $30.
14694 timezone : $30.
14695 area_codes : $30.
14696 latitude : comma.
14697 longitude : comma.
14698 world_region : $20.
14699 country_code : $2.
14700 decommissioned : comma.
14701 estimated_population : comma.
14702 notes : $50.
14703 dummy :$1.
14704 ;
14705 if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
14706 City = propcase( strip(City) );
14707 drop type acceptable_cities unacceptable_cities state county_code timezone area_codes
14708 latitude longitude world_region country decommissioned estimated_population notes dummy;
14709 run;
WARNING: The variable county_code in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable country in the DROP, KEEP, or RENAME list has never been referenced.
NOTE: The infile "/export/appl/ao1_data1_r/MAPD/Input/Dec2015/MapBook/zip_code_database.csv" is:
Filename=/export/appl/ao1_data1_r/MAPD/Input/Dec2015/MapBook/zip_code_database.csv,
Owner Name=gauasr,Group Name=cchome091,
Access Permission=rw-rw----,
Last Modified=Wed Feb 3 13:17:50 2016,
File Size (bytes)=4568636
NOTE: Invalid data for latitude in line 5 99-111.
NOTE: Invalid data for longitude in line 5 113-126.
NOTE: Invalid data for decommissioned in line 5 160-177.
NOTE: Invalid data for estimated_population in line 5 179-194.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
5 00603,STANDARD,Aguadilla,Ramey,"Bda Caban, Bda Esteves, Bo Borinquen, Bo Ceiba Baja, Ext El Prado, E
101 xt Marbella, Repto Jimenez, Repto Juan Aguiar, Repto Lopez, Repto Tres Palmas, Sect Las Villas, Urb
201 Borinquen, Urb El Prado, Urb Esteves, Urb Garcia, Urb La
zipcpde=00603 type=STANDARD City=Aguadilla acceptable_cities=Ramey unacceptable_cities="Bda Caban state=Bd county=Bo Borinquen
timezone=Bo Ceiba Baja area_codes=Ext El Prado latitude=. longitude=. world_region=Repto Juan Aguiar country_code=Re
decommissioned=. estimated_population=. notes=Urb Borinquen dummy=U _ERROR_=1 _N_=4
It looks like your issue may be reading only part of the line and not finding the ending quote begining at "Bda Caban, Bda Esteves
Possibly you need to add LRECL=10000 (or some other number large enough to hold the longest line in your data) to your INFILE statement to read the entire line.
It looks like your issue may be reading only part of the line and not finding the ending quote begining at "Bda Caban, Bda Esteves
Possibly you need to add LRECL=10000 (or some other number large enough to hold the longest line in your data) to your INFILE statement to read the entire line.
Ballardw,
You are right. The LRECL option fixed it. Thank you very much!
WARNING: The variable county_code in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable country in the DROP, KEEP, or RENAME list has never been referenced.
You typed your variables names wrong in the drop statment. They are country code and county not as you added. So you are getting two warnnings for that.
Regarding adding or deleting the option DELIMITER=',' . You do not have to add it if you used the option DSD as it by default deal with comma as delimiter. And it handle the comma within qoutes also.
Can you post the complete observation mentioned in the log
5 00603,STANDARD,Aguadilla,Ramey,"Bda Caban, Bda Esteves, Bo Borinquen, Bo Ceiba Baja, Ext El Prado, E
mohamed_zaki,
Thank you for pointing out the warnings! 🙂 I fixed them.
I'll make anote about the use of DSD with Delimiter option. As I replied earlier, LRECL option fixed it, as I was not reading the entire line.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.