Hi,
I'm working on a dataset that contains French characters such as é, à, ç, etc. In SAS OnDemand for Academics, when I run PROC FREQ on the variables containing these characters, I get this error:
Invalid characters were present in the data
I've tried the solutions proposed to other users, such as changing my browser language to English and saving my CSV file as an XLSX file and using that, but neither works. I also checked the settings: I'm already using UTF-8 encoding.
Here is my code:
ods graphics on;
libname brady "/home/u44653218/brady";
options validvarname= v7;

proc import datafile="/home/u44653218/brady/brady.xlsx"
    out=brady.data
    dbms=xls;
run;

proc contents data=brady.data;
run;

%let y = Converted;
%let numeric_vars = Nb_Days Total_VisitDuration Newsletter_Subscription
    Nb_Catalog_Requests Account_Creation Nb_VisitedPages Nb_checkout
    Nb_Add_To_cart NB_Products Nb_WebCat1 Nb_WebCat2 Nb_WebCat3
    NB_Visits Source_affiliate Source_cpc Source_direct Source_email
    Source_organic Source_referral InquireCount catalogCount
    OrderCount QuoteCount EmailCount;
%let categ_vars = IndustrySector SIC Employee_Size Acquired_Site_Flag Sister_Company_Flag
    ContactDeptDesc Last_Purchase_Period Type_visitor Top_WebCat1 Top_WebCat2
    Top_ProductName Top_ProductFamily;

/*Basic summary statistics*/
proc means data=brady.data nmiss min max mean median mode;
run;

/*Let's look at our taget variable*/
proc freq data=brady.data;
    tables &y / plots=freqplot;
run;

/***************************************/
/* QUALITATIVE VARIABLES */
/*************************************/

/*crossing the qualitative variables with the target variable*/
proc freq data=brady.data;
    table &categ_vars*&y / chisq nocum nocol nopercent;
run;
And here is my log:
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
NOTE: ODS statements in the SAS Studio environment may disable some output features.
73
74 ods graphics on;
75 libname brady "/home/u44653218/brady";
NOTE: Libref BRADY refers to the same physical library as _TEMP0.
NOTE: Libref BRADY was successfully assigned as follows:
Engine: V9
Physical Name: /home/u44653218/brady
76 options validvarname= v7;
77
78 proc import datafile="/home/u44653218/brady/brady.xlsx"
79 out=brady.data
80 dbms=xls;
81 run;
NOTE: Import cancelled. Output dataset BRADY.DATA already exists. Specify REPLACE option to overwrite it.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 0.00 seconds
user cpu time 0.01 seconds
system cpu time 0.00 seconds
memory 687.78k
OS Memory 38620.00k
Timestamp 04/17/2020 07:53:49 AM
Step Count 70 Switch Count 0
Page Faults 0
Page Reclaims 181
Page Swaps 0
Voluntary Context Switches 2
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 0
82
83 proc contents data=brady.data;
84 run;
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.11 seconds
user cpu time 0.10 seconds
system cpu time 0.00 seconds
memory 4848.46k
OS Memory 40180.00k
Timestamp 04/17/2020 07:53:49 AM
Step Count 71 Switch Count 0
Page Faults 0
Page Reclaims 447
Page Swaps 0
Voluntary Context Switches 3
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 40
85
86 %let y = Converted;
87 %let numeric_vars = Nb_Days Total_VisitDuration Newsletter_Subscription
88 Nb_Catalog_Requests Account_Creation Nb_VisitedPages Nb_checkout
89 Nb_Add_To_cart NB_Products Nb_WebCat1 Nb_WebCat2 Nb_WebCat3
90 NB_Visits Source_affiliate Source_cpc Source_direct Source_email
91 Source_organic Source_referral InquireCount catalogCount
92 OrderCount QuoteCount EmailCount;
93 %let categ_vars = IndustrySector SIC Employee_Size Acquired_Site_Flag Sister_Company_Flag
94 ContactDeptDesc Last_Purchase_Period Type_visitor Top_WebCat1 Top_WebCat2
95 Top_ProductName Top_ProductFamily;
96
97 /*Basic summary statistics*/
98 proc means data=brady.data nmiss min max mean median mode;
99 run;
NOTE: There were 45906 observations read from the data set BRADY.DATA.
NOTE: PROCEDURE MEANS used (Total process time):
real time 0.17 seconds
user cpu time 0.21 seconds
system cpu time 0.01 seconds
memory 14307.68k
OS Memory 49692.00k
Timestamp 04/17/2020 07:53:50 AM
Step Count 72 Switch Count 5
Page Faults 0
Page Reclaims 2672
Page Swaps 0
Voluntary Context Switches 142
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 24
99 !
100
101 /*Let's look at our taget variable*/
102 proc freq data=brady.data;
103 tables &y / plots=freqplot;
104 run;
NOTE: There were 45906 observations read from the data set BRADY.DATA.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.25 seconds
user cpu time 0.11 seconds
system cpu time 0.03 seconds
memory 20747.03k
OS Memory 55544.00k
Timestamp 04/17/2020 07:53:50 AM
Step Count 73 Switch Count 6
Page Faults 0
Page Reclaims 4098
Page Swaps 0
Voluntary Context Switches 303
Involuntary Context Switches 3
Block Input Operations 0
Block Output Operations 1232
105
106 /***************************************/
107 /* QUALITATIVE VARIABLES */
108 /*************************************/
109
110 /*crossing the qualitative variables with the target variable*/
111 proc freq data=brady.data;
112 table &categ_vars*&y / chisq nocum nocol nopercent;
113 run;
ERROR: Invalid characters were present in the data.
ERROR: An error occurred while processing text data.
NOTE: The SAS System stopped processing this step because of errors.
ERROR: Invalid characters were present in the data.
NOTE: There were 45906 observations read from the data set BRADY.DATA.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.16 seconds
user cpu time 0.16 seconds
system cpu time 0.01 seconds
memory 12492.10k
OS Memory 70044.00k
Timestamp 04/17/2020 07:53:50 AM
Step Count 74 Switch Count 8
Page Faults 0
Page Reclaims 3628
Page Swaps 0
Voluntary Context Switches 33
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 7960
114
115
116
117 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
129
If you could help me that would be great. Have a nice day y'all.
Could you please provide some sample data so that we can see which special characters are involved? Alternatively, we can strip the special characters out with the COMPRESS function.
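For what it's worth, here is a minimal sketch of what stripping the special characters could look like. It assumes IndustrySector is one of the affected character variables (the log does not say which variable actually triggers the error), and work.data_clean and IndustrySector_clean are hypothetical names. Note that this simply drops accented letters rather than replacing them with unaccented ones, so it changes the category labels.

```sas
/* Sketch only: keep plain ASCII letters, digits, blanks and a few
   punctuation marks. The 'k' modifier keeps the listed characters
   instead of removing them, and 'i' ignores case. Every other byte,
   including accented letters, is removed. */
data work.data_clean;
    set brady.data;
    IndustrySector_clean = compress(IndustrySector,
        'abcdefghijklmnopqrstuvwxyz0123456789 _-.,', 'ki');
run;

/* Re-run the cross-tabulation on the cleaned variable */
proc freq data=work.data_clean;
    tables IndustrySector_clean*Converted / chisq;
run;
```

If the step above runs cleanly, that would suggest the problem is in the character values themselves rather than in the PROC FREQ options.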
First, you need to use the REPLACE option in your PROC IMPORT, or you won't be able to update your dataset:
78 proc import datafile="/home/u44653218/brady/brady.xlsx"
79 out=brady.data
80 dbms=xls;
81 run;
NOTE: Import cancelled. Output dataset BRADY.DATA already exists. Specify REPLACE option to overwrite it.
NOTE: The SAS System stopped processing this step because of errors.
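A minimal sketch of the corrected import step could look like the following. REPLACE is what the log note asks for. Switching to DBMS=XLSX is my own assumption based on the .xlsx file extension; the original code used DBMS=XLS.

```sas
/* REPLACE lets PROC IMPORT overwrite the existing BRADY.DATA dataset.
   DBMS=XLSX is an assumption based on the .xlsx extension. */
proc import datafile="/home/u44653218/brady/brady.xlsx"
    out=brady.data
    dbms=xlsx
    replace;
run;
```

Without REPLACE, every later step keeps reading the dataset created by an earlier run, so any fix made to the source file never reaches BRADY.DATA.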
Next, please provide example data on which we can try our code. Post the data into a window opened with the </> button, so that the forum software does not change anything.
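One way to produce such an example is to write a few values to the log together with their hexadecimal representation, so the actual bytes are visible. This is only a sketch: it assumes IndustrySector is one of the affected character variables, and the choice of 10 rows and the $HEX60. width (the first 30 bytes) is arbitrary.

```sas
/* Report the encoding of the current SAS session */
proc options option=encoding;
run;

/* Write the first 10 values of one suspect variable to the log,
   once as text and once as hexadecimal bytes */
data _null_;
    set brady.data(obs=10 keep=IndustrySector);
    hexval = put(IndustrySector, $hex60.);
    put IndustrySector= / hexval=;
run;
```

The hex values make it possible to see whether an accented letter is stored as one byte or as a multi-byte sequence, which helps in judging whether the data matches the session encoding.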