skavli
Calcite | Level 5

Hi, 

I'm working on a dataset that contains French characters such as é, à, ç, etc. In SAS OnDemand for Academics, when I run PROC FREQ on the variables containing these characters, I get this error:

Invalid characters were present in the data

I've tried the solutions proposed to other users, such as changing my browser's language to English and saving my CSV file as an XLSX file, but neither works. I also checked the settings, and I'm already using UTF-8 encoding.

Here is my code:

ods graphics on;
libname brady "/home/u44653218/brady";
options validvarname= v7;

proc import datafile="/home/u44653218/brady/brady.xlsx" 
out=brady.data
dbms=xls;
run;

proc contents data=brady.data;
run;

%let y = Converted;
%let numeric_vars = Nb_Days Total_VisitDuration Newsletter_Subscription 
					Nb_Catalog_Requests Account_Creation Nb_VisitedPages Nb_checkout  
					Nb_Add_To_cart NB_Products Nb_WebCat1 Nb_WebCat2 Nb_WebCat3
					NB_Visits Source_affiliate Source_cpc Source_direct Source_email
					Source_organic Source_referral InquireCount catalogCount
					OrderCount QuoteCount EmailCount;
%let categ_vars = IndustrySector SIC Employee_Size Acquired_Site_Flag Sister_Company_Flag
				  ContactDeptDesc Last_Purchase_Period Type_visitor Top_WebCat1 Top_WebCat2
				  Top_ProductName Top_ProductFamily;
				 
/*Basic summary statistics*/
proc means data=brady.data nmiss min max mean median mode;
run;	

/*Let's look at our taget variable*/
proc freq data=brady.data;
tables &y / plots=freqplot;
run;

/***************************************/
/*       QUALITATIVE VARIABLES        */
/*************************************/

/*crossing the qualitative variables with the target variable*/
proc freq data=brady.data;
table &categ_vars*&y / chisq nocum nocol nopercent;
run;

And here is my log:

 
 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 NOTE: ODS statements in the SAS Studio environment may disable some output features.
 73         
 74         ods graphics on;
 75         libname brady "/home/u44653218/brady";
 NOTE: Libref BRADY refers to the same physical library as _TEMP0.
 NOTE: Libref BRADY was successfully assigned as follows: 
       Engine:        V9 
       Physical Name: /home/u44653218/brady
 76         options validvarname= v7;
 77         
 78         proc import datafile="/home/u44653218/brady/brady.xlsx"
 79         out=brady.data
 80         dbms=xls;
 81         run;
 
 NOTE: Import cancelled.  Output dataset BRADY.DATA already exists.  Specify REPLACE option to overwrite it.
 NOTE: The SAS System stopped processing this step because of errors.
 NOTE: PROCEDURE IMPORT used (Total process time):
       real time           0.00 seconds
       user cpu time       0.01 seconds
       system cpu time     0.00 seconds
       memory              687.78k
       OS Memory           38620.00k
       Timestamp           04/17/2020 07:53:49 AM
       Step Count                        70  Switch Count  0
       Page Faults                       0
       Page Reclaims                     181
       Page Swaps                        0
       Voluntary Context Switches        2
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           0
       
 82         
 
 
 83         proc contents data=brady.data;
 84         run;
 
 NOTE: PROCEDURE CONTENTS used (Total process time):
       real time           0.11 seconds
       user cpu time       0.10 seconds
       system cpu time     0.00 seconds
       memory              4848.46k
       OS Memory           40180.00k
       Timestamp           04/17/2020 07:53:49 AM
       Step Count                        71  Switch Count  0
       Page Faults                       0
       Page Reclaims                     447
       Page Swaps                        0
       Voluntary Context Switches        3
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           40
       
 
 85         
 86         %let y = Converted;
 87         %let numeric_vars = Nb_Days Total_VisitDuration Newsletter_Subscription
 88         Nb_Catalog_Requests Account_Creation Nb_VisitedPages Nb_checkout
 89         Nb_Add_To_cart NB_Products Nb_WebCat1 Nb_WebCat2 Nb_WebCat3
 90         NB_Visits Source_affiliate Source_cpc Source_direct Source_email
 91         Source_organic Source_referral InquireCount catalogCount
 92         OrderCount QuoteCount EmailCount;
 93         %let categ_vars = IndustrySector SIC Employee_Size Acquired_Site_Flag Sister_Company_Flag
 94           ContactDeptDesc Last_Purchase_Period Type_visitor Top_WebCat1 Top_WebCat2
 95           Top_ProductName Top_ProductFamily;
 96         
 97         /*Basic summary statistics*/
 98         proc means data=brady.data nmiss min max mean median mode;
 99         run;
 
 NOTE: There were 45906 observations read from the data set BRADY.DATA.
 NOTE: PROCEDURE MEANS used (Total process time):
       real time           0.17 seconds
       user cpu time       0.21 seconds
       system cpu time     0.01 seconds
       memory              14307.68k
       OS Memory           49692.00k
       Timestamp           04/17/2020 07:53:50 AM
       Step Count                        72  Switch Count  5
       Page Faults                       0
       Page Reclaims                     2672
       Page Swaps                        0
       Voluntary Context Switches        142
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           24
       
 
 99       !     
 100        
 101        /*Let's look at our taget variable*/
 102        proc freq data=brady.data;
 103        tables &y / plots=freqplot;
 104        run;
 
 NOTE: There were 45906 observations read from the data set BRADY.DATA.
 NOTE: PROCEDURE FREQ used (Total process time):
       real time           0.25 seconds
       user cpu time       0.11 seconds
       system cpu time     0.03 seconds
       memory              20747.03k
       OS Memory           55544.00k
       Timestamp           04/17/2020 07:53:50 AM
       Step Count                        73  Switch Count  6
       Page Faults                       0
       Page Reclaims                     4098
       Page Swaps                        0
       Voluntary Context Switches        303
       Involuntary Context Switches      3
       Block Input Operations            0
       Block Output Operations           1232
       
 
 105        
 106        /***************************************/
 107        /*       QUALITATIVE VARIABLES        */
 108        /*************************************/
 109        
 110        /*crossing the qualitative variables with the target variable*/
 111        proc freq data=brady.data;
 112        table &categ_vars*&y / chisq nocum nocol nopercent;
 113        run;
 
 ERROR: Invalid characters were present in the data.
 ERROR: An error occurred while processing text data.
 NOTE: The SAS System stopped processing this step because of errors.
 ERROR: Invalid characters were present in the data.
 NOTE: There were 45906 observations read from the data set BRADY.DATA.
 NOTE: PROCEDURE FREQ used (Total process time):
       real time           0.16 seconds
       user cpu time       0.16 seconds
       system cpu time     0.01 seconds
       memory              12492.10k
       OS Memory           70044.00k
       Timestamp           04/17/2020 07:53:50 AM
       Step Count                        74  Switch Count  8
       Page Faults                       0
       Page Reclaims                     3628
       Page Swaps                        0
       Voluntary Context Switches        33
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           7960
       
 114        
 115        
 116        
 117        OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 129        

If you could help me that would be great. Have a nice day y'all.

2 REPLIES
Jagadishkatam
Amethyst | Level 16

Could you please provide some sample data so that we can see which special characters are involved? Alternatively, we can strip the special characters out, for example with the COMPRESS function.
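
One possible way to find rows worth sharing is sketched below. It flags every observation in which some character variable contains a byte outside the printable ASCII range, which would include accented letters such as é, à and ç. The output dataset name work.non_ascii_rows and the helper variables flagged_var and _i are made up for illustration; this is a diagnostic sketch, not a fix for the error.

```sas
/* Sketch: collect rows where any character variable holds a byte   */
/* outside printable ASCII (hex 20 to 7E).                          */
/* work.non_ascii_rows, flagged_var and _i are hypothetical names.  */
data work.non_ascii_rows;
  set brady.data;
  array cvars {*} _character_;   /* all character variables read from brady.data */
  length flagged_var $32;        /* declared after the array, so not part of it  */
  do _i = 1 to dim(cvars);
    if prxmatch('/[^\x20-\x7E]/', trim(cvars{_i})) then do;
      flagged_var = vname(cvars{_i});  /* name of the first offending variable */
      output;
      leave;                           /* one output row per observation */
    end;
  end;
  drop _i;
run;

/* Look at the first few flagged rows */
proc print data=work.non_ascii_rows (obs=10);
run;
```

Note that the pattern also flags control characters such as tabs, since they fall outside the same range.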

Thanks,
Jag
Kurt_Bremser
Super User

First, you need to use the REPLACE option in your PROC IMPORT, or you won't be able to update your dataset:

 78         proc import datafile="/home/u44653218/brady/brady.xlsx"
 79         out=brady.data
 80         dbms=xls;
 81         run;
 
 NOTE: Import cancelled.  Output dataset BRADY.DATA already exists.  Specify REPLACE option to overwrite it.
 NOTE: The SAS System stopped processing this step because of errors.
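
A minimal sketch of the import step with REPLACE added is shown below. It also changes dbms=xls to dbms=xlsx, on the assumption that the file really is an .xlsx workbook, as its extension suggests; that second change is not part of the advice above.

```sas
/* Re-import the workbook and overwrite the existing BRADY.DATA dataset */
proc import datafile="/home/u44653218/brady/brady.xlsx"
    out=brady.data
    dbms=xlsx    /* assumption: the file is a true .xlsx workbook */
    replace;     /* without REPLACE the import is cancelled when the dataset exists */
run;
```

With REPLACE in place, the "Import cancelled" note should no longer appear, and later steps read the freshly imported data rather than an older copy.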

Next, please provide example data on which we can try our code. Post the data into a window opened with the </> button, so that the forum software does not change anything.
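
As a sketch of how such an example could be produced, the step below writes a few values to the log both as text and in hexadecimal, so the exact bytes stay readable even if the accented characters get altered when the text is copied. Top_ProductName is taken from the variable list in the question and is assumed here to be a character variable containing accented text; any other character variable could be used instead.

```sas
/* Write the first 5 values to the log, as text and as hex bytes.   */
/* Assumption: Top_ProductName is a character variable.             */
data _null_;
  set brady.data (obs=5);
  put Top_ProductName= / Top_ProductName= $hex60.;   /* $hex60. shows the first 30 bytes */
run;
```

The resulting log lines can then be pasted into the </> window as they are.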
