Hi,

I am working on social media data. My input file is a ';'-delimited file. The Twitter and Facebook data contains lots of symbols, such as smileys. I wrote the following import code:

data WORK.Test;
    %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
    infile 'D:\Rajesh_sun\POCData\SrBachchan.csv' delimiter = ';' MISSOVER DSD lrecl=13106 firstobs=2 TERMSTR=CRLF;
    informat level $3.;
    informat id $3.;
    informat parent_id $3.;
    informat object_id $20.;
    informat object_type $6.;
    informat query_status $15.;
    informat query_time $28.;
    informat query_type $23.;
    informat created_at $32.;
    informat user_screen_name $17.;
    informat favorite_count $3.;
    informat retweet_count $4.;
    informat entities_hashtags___text $12.;
    informat entities_user_mentions___name $43.;
    informat entities_urls___display_url $2.;
    informat in_reply_to_user_id $11.;
    informat in_reply_to_screen_name $12.;
    informat in_reply_to_status_id $2.;
    informat text $142.;
    format level $3.;
    format id $3.;
    format parent_id $3.;
    format object_id $20.;
    format object_type $6.;
    format query_status $15.;
    format query_time $28.;
    format query_type $23.;
    format created_at $32.;
    format user_screen_name $17.;
    format favorite_count $3.;
    format retweet_count $4.;
    format entities_hashtags___text $12.;
    format entities_user_mentions___name $43.;
    format entities_urls___display_url $2.;
    format in_reply_to_user_id $11.;
    format in_reply_to_screen_name $12.;
    format in_reply_to_status_id $2.;
    format text $142.;
    input
        level $
        id $
        parent_id $
        object_id $
        object_type $
        query_status $
        query_time $
        query_type $
        created_at $
        user_screen_name $
        favorite_count $
        retweet_count $
        entities_hashtags___text $
        entities_user_mentions___name $
        entities_urls___display_url $
        in_reply_to_user_id $
        in_reply_to_screen_name $
        in_reply_to_status_id $
        text $
    ;
    if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
run;

When I execute the code, I get the following error:

NOTE: A byte-order mark in the file "D:\Rajesh_sun\POCData\SrBachchan.csv" (for fileref "#LN00023") indicates that the data is encoded in "utf-8". This encoding will be used to process the file.
NOTE: The infile 'D:\Rajesh_sun\POCData\SrBachchan.csv' is:
      Filename=D:\Rajesh_sun\POCData\SrBachchan.csv,
      RECFM=V,LRECL=52424,File Size (bytes)=3390532,
      Last Modified=17Oct2014:21:42:06,
      Create Time=17Oct2014:21:51:09
ERROR: Invalid string.
FATAL: Unrecoverable I/O error detected in the execution of the DATA step program. Aborted during the EXECUTION phase.
NOTE: 9 records were read from the infile 'D:\Rajesh_sun\POCData\SrBachchan.csv'.
      The minimum record length was 72.
      The maximum record length was 364.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.TEST may be incomplete. When this step was stopped there were 9 observations and 19 variables.
WARNING: Data set WORK.TEST was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time 0.03 seconds
      cpu time 0.03 seconds

The file contains various types of symbols. I have attached sample pictures and the file as well. Please help me overcome this issue.