BLockwood
Calcite | Level 5

Hey, I am trying to run PROC DISCRIM on an Excel file I've imported.

This is the code that I am trying to run:

proc discrim data = CCData2;
	class Sample;
	var pH HCO3 Ca Mg Na SO4 Cl Temp Conductivity Flow;
run;

And this is the error message I get:

188  proc discrim data = CCData2;
189      class Sample;
190      var pH HCO3 Ca Mg Na SO4 Cl Temp Conductivity Flow;
ERROR: Variable pH in list does not match type prescribed for this list.
ERROR: Variable HCO3 in list does not match type prescribed for this list.
ERROR: Variable Ca in list does not match type prescribed for this list.
ERROR: Variable Mg in list does not match type prescribed for this list.
ERROR: Variable Na in list does not match type prescribed for this list.
ERROR: Variable SO4 in list does not match type prescribed for this list.
ERROR: Variable Cl in list does not match type prescribed for this list.
ERROR: Variable Temp in list does not match type prescribed for this list.
ERROR: Variable Conductivity in list does not match type prescribed for this list.
ERROR: Variable Flow in list does not match type prescribed for this list.
191  run;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE DISCRIM used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

From what I've read, I think that SAS thinks my variables are numerical values and is trying to run them as such. But I don't know how to change that as I am quite new to SAS.

Any help would be appreciated,

Thanks!

7 REPLIES
Reeza
Super User

Discriminant analysis does require numeric variables, as does the SAS procedure. 

Is your data numeric? If not, then you need to find the appropriate statistical analysis method. 

PGStats
Opal | Level 21

Your variables likely were imported as character. You could use a data step such as

 

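data CCData3;
set CCData2;
/* Read each character value as a number. */
/* "...." stands for the remaining variables; replace it before running. */
num_ph = input(pH, best.);
num_HCO3 = input(HCO3, best.);
....
/* Drop the character originals and give the numeric copies their names. */
drop ph HCO3 ...;
rename num_ph=pH num_HCO3=HCO3 ....;
run;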

proc discrim data = CCData3;
	class Sample;
	var pH HCO3 Ca Mg Na SO4 Cl Temp Conductivity Flow;
run;
PG
BLockwood
Calcite | Level 5

I ran that and got this:

WARNING: Variable num_ph cannot be renamed to pH because pH already exists.
WARNING: Variable num_HCO3 cannot be renamed to HCO3 because HCO3 already exists.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.CCDATA3 may be incomplete.  When this step was stopped there were 0
         observations and 27 variables.
WARNING: Data set WORK.CCDATA3 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

 

PGStats
Opal | Level 21

Sorry. Make it simpler then...

 

data CCData3;
set CCData2;
/* "...." stands for the remaining variables; replace it before running. */
num_ph = input(pH, best.);
num_HCO3 = input(HCO3, best.);
....
run;

proc discrim data = CCData3;
	class Sample;
	var num_:;
run;
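
For anyone who would rather not type one INPUT line per variable, the same conversion can be sketched with two arrays. This assumes all ten analysis variables are character in CCData2, as PROC CONTENTS reported later in the thread. The `num_` names, the `best32.` width, and the `??` modifier are my own choices, not part of the code above.

```sas
data CCData3;
    set CCData2;
    /* Existing character variables (their type comes from CCData2) */
    array chr_vars {*} pH HCO3 Ca Mg Na SO4 Cl Temp Conductivity Flow;
    /* New numeric copies, one per character variable */
    array num_vars {*} num_pH num_HCO3 num_Ca num_Mg num_Na num_SO4
                       num_Cl num_Temp num_Conductivity num_Flow;
    do i = 1 to dim(chr_vars);
        /* ?? suppresses invalid-data notes; text that is not a number becomes missing */
        num_vars{i} = input(chr_vars{i}, ?? best32.);
    end;
    drop i;
run;
```

With this in place, the `var num_:;` statement above picks up all ten numeric copies.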
PG
Rick_SAS
SAS Super FREQ

I suspect if you run PROC CONTENTS on your data you will discover that the variables are character:

 

proc contents data = CCData2; run;

 

There could be several reasons for this. In Excel, columns have 'types' like "General" or "Text." My guess is that your original spreadsheet is using "Text" or something similar. Try using "General" or explicitly setting the columns to "Number". Then rerun the step where you import the Excel file into SAS (maybe an import wizard? or PROC IMPORT?).
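
If the import was done with PROC IMPORT, the re-import and check might look roughly like the sketch below. The file path is hypothetical, and `dbms=xlsx` assumes an .xlsx workbook and a SAS installation that supports that engine.

```sas
/* Hypothetical path: point DATAFILE= at the real workbook */
proc import datafile="C:\data\CCData.xlsx"
            out=CCData2
            dbms=xlsx
            replace;
    getnames=yes;   /* row 1 holds the column names */
run;

/* Confirm whether the analysis variables are now Num or Char */
proc contents data=CCData2;
run;
```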

 


BLockwood
Calcite | Level 5

Yeah, you're right. When I ran the PROC CONTENTS, it showed that they are all character. However, even when I change the settings to General or Number in Excel and then re-import the file, it still says they're character.


ballardw
Super User

If you are using PROC IMPORT to read XLS or XLSX files, it is a good idea to ensure that column headings are on row 1 and only row 1, and that data starts in row 2 and is not blank. A number of blank values in a column might result in the column being read as character.
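
One way to see which cells keep a column from being read as numeric is to list every non-blank value that does not convert. This is only a sketch: it assumes the ten variables are still character in CCData2, and the names `bad_values`, `variable`, and `value` are made up for illustration.

```sas
data bad_values;
    set CCData2;
    length variable $32 value $200;
    array chr_vars {*} pH HCO3 Ca Mg Na SO4 Cl Temp Conductivity Flow;
    do i = 1 to dim(chr_vars);
        /* Keep values that are not blank but cannot be read as a number */
        if not missing(chr_vars{i})
           and missing(input(chr_vars{i}, ?? best32.)) then do;
            variable = vname(chr_vars{i});
            value    = chr_vars{i};
            output;
        end;
    end;
    keep variable value;
run;

/* Count each offending value per variable */
proc freq data=bad_values;
    tables variable*value / list;
run;
```

An empty `bad_values` data set would suggest the cell contents are fine and the column type is being decided by something else, such as blank rows.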

 

One option is to save the file as CSV and import that. One advantage is that you can use the GUESSINGROWS statement to scan more than the default 20 rows when guessing each column's type, and "blank" data will be read as missing, so it does not influence the choice of numeric or character. Another is that PROC IMPORT writes the DATA step code it generates to the log, where you can examine the choices it made. If you disagree with them, you can copy and paste that code into the editor and change informats, formats and such.
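
A minimal sketch of the CSV route, assuming the workbook was saved as a CSV file at a hypothetical path. The GUESSINGROWS value of 5000 is arbitrary; use any number that covers your data.

```sas
/* Hypothetical path: point DATAFILE= at the saved CSV file */
proc import datafile="C:\data\CCData.csv"
            out=CCData2
            dbms=csv
            replace;
    getnames=yes;        /* row 1 holds the column names */
    guessingrows=5000;   /* scan more rows than the default before choosing types */
run;

/* Check the resulting variable types */
proc contents data=CCData2;
run;
```

After the import runs, the log should contain the generated DATA step, which you can copy and edit if a column is still read with the wrong informat.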
