Programming the statistical procedures from SAS

Discrim variables issue

New Contributor
Posts: 3

Hey, I am trying to run PROC DISCRIM on an Excel file I've imported.

This is the code that I am trying to run:

proc discrim data = CCData2;
	class Sample;
	var pH HCO3 Ca Mg Na SO4 Cl Temp Conductivity Flow;
run;

And this is the error message I get:

188  proc discrim data = CCData2;
189      class Sample;
190      var pH HCO3 Ca Mg Na SO4 Cl Temp Conductivity Flow;
ERROR: Variable pH in list does not match type prescribed for this list.
ERROR: Variable HCO3 in list does not match type prescribed for this list.
ERROR: Variable Ca in list does not match type prescribed for this list.
ERROR: Variable Mg in list does not match type prescribed for this list.
ERROR: Variable Na in list does not match type prescribed for this list.
ERROR: Variable SO4 in list does not match type prescribed for this list.
ERROR: Variable Cl in list does not match type prescribed for this list.
ERROR: Variable Temp in list does not match type prescribed for this list.
ERROR: Variable Conductivity in list does not match type prescribed for this list.
ERROR: Variable Flow in list does not match type prescribed for this list.
191  run;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE DISCRIM used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

From what I've read, I think that SAS thinks my variables are numerical values and is trying to run them as such. But I don't know how to change that, as I am quite new to SAS.

Any help would be appreciated,

Thanks!

Super User
Posts: 18,498

Re: Discrim variables issue

Discriminant analysis does require numeric variables, as does the SAS procedure. 

Is your data numeric? If not, then you need to find the appropriate statistical analysis method. 

Respected Advisor
Posts: 4,742

Re: Discrim variables issue

Your variables likely were imported as character. You could use a data step such as

 

data CCData3;
set CCData2;
num_ph = input(pH, best.);
num_HCO3 = input(HCO3, best.);
....
drop ph HCO3 ...;
rename num_ph=pH num_HCO3=HCO3 ....;
run;

proc discrim data = CCData3;
	class Sample;
	var pH HCO3 Ca Mg Na SO4 Cl Temp Conductivity Flow;
run;
PG
New Contributor
Posts: 3

Re: Discrim variables issue

I ran that and got this:

WARNING: Variable num_ph cannot be renamed to pH because pH already exists.
WARNING: Variable num_HCO3 cannot be renamed to HCO3 because HCO3 already exists.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.CCDATA3 may be incomplete.  When this step was stopped there were 0
         observations and 27 variables.
WARNING: Data set WORK.CCDATA3 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

 

Respected Advisor
Posts: 4,742

Re: Discrim variables issue

Sorry. Make it simpler then...

 

data CCData3;
set CCData2;
num_ph = input(pH, best.);
num_HCO3 = input(HCO3, best.);
....
run;

proc discrim data = CCData3;
	class Sample;
	var num_:;
run;
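
For all ten variables, the same conversion can be written with arrays. This is only a sketch. It assumes every analysis variable in CCData2 is character, that no existing variable name starts with c_, and that _i is not already in use. The c_ prefix and the _i counter are made up for illustration. Renaming on input with the RENAME= data set option frees the original names, which avoids the "cannot be renamed" warnings from the earlier attempt.

```sas
/* Sketch: convert ten character columns to numeric and keep the original names. */
data CCData3;
    /* Rename the character columns as they are read, so pH, HCO3, ... are free. */
    set CCData2(rename=(pH=c_pH HCO3=c_HCO3 Ca=c_Ca Mg=c_Mg Na=c_Na
                        SO4=c_SO4 Cl=c_Cl Temp=c_Temp
                        Conductivity=c_Conductivity Flow=c_Flow));

    /* Character source columns (renamed above). */
    array c_vars {*} $ c_pH c_HCO3 c_Ca c_Mg c_Na c_SO4 c_Cl c_Temp
                       c_Conductivity c_Flow;
    /* New numeric columns that take over the original names. */
    array n_vars {*} pH HCO3 Ca Mg Na SO4 Cl Temp Conductivity Flow;

    do _i = 1 to dim(c_vars);
        /* ?? suppresses invalid-data messages; unreadable text becomes missing. */
        n_vars{_i} = input(c_vars{_i}, ?? best32.);
    end;

    /* Assumes no other variable in the data set starts with c_. */
    drop _i c_:;
run;

proc discrim data = CCData3;
    class Sample;
    var pH HCO3 Ca Mg Na SO4 Cl Temp Conductivity Flow;
run;
```

Any value that cannot be read as a number ends up missing, so it is worth checking the converted columns for unexpected missing values before trusting the analysis.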
PG
SAS Super FREQ
Posts: 3,538

Re: Discrim variables issue

I suspect if you run PROC CONTENTS on your data you will discover that the variables are character:

 

proc contents data = CCData2; run;
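
If the full PROC CONTENTS output is more than you need, a shorter listing of just the names and types can be pulled from the dictionary tables. A minimal sketch, assuming the data set sits in the WORK library (the library and member names must be given in uppercase):

```sas
/* Sketch: list each variable's name, type (char or num) and length. */
proc sql;
    select name, type, length
    from dictionary.columns
    where libname = 'WORK' and memname = 'CCDATA2';
quit;
```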

 

There could be several reasons for this. In Excel, columns have 'types' like "General" or "Text". My guess is that your original spreadsheet is using "Text" or something similar. Try using "General" or explicitly setting the columns to "Number". Then rerun the step where you import the Excel file into SAS (maybe an import wizard? or PROC IMPORT?).

 


New Contributor
Posts: 3

Re: Discrim variables issue

Yeah, you're right. When I ran PROC CONTENTS, it showed that they are all character. However, even when I change the settings to General or Number in Excel and then re-import the file, it still says they're character.


Super User
Posts: 10,819

Re: Discrim variables issue

If you are using PROC IMPORT to read XLS or XLSX files, it is a good idea to ensure that the column headings are on row 1 and only row 1, that the data starts in row 2, and that row 2 is not blank. A number of blank values in a column might result in the variable being read as character.

 

One option is to save the file as CSV and import that. One advantage is that you can use the GUESSINGROWS option to examine more than the default 20 rows of data when guessing each column's type, and "blank" cells are read as missing, so they do not influence the choice of numeric or character. Another is that PROC IMPORT writes the DATA step code it generates to the log, where you can examine the choices it made. If you disagree with them, you can copy and paste that code into the editor and change informats, formats, and so on.
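
A CSV import along those lines might look like the sketch below. The file path is hypothetical, and 1000 is an arbitrary number of rows to scan. Pick a value large enough to cover your data.

```sas
/* Sketch: import a CSV copy of the spreadsheet, scanning more rows for types. */
proc import datafile="C:\path\to\CCData.csv"   /* hypothetical path */
            out=CCData2
            dbms=csv
            replace;
    getnames=yes;        /* first row holds the column names */
    guessingrows=1000;   /* scan 1000 rows instead of the default 20 */
run;

/* Confirm the resulting variable types. */
proc contents data=CCData2;
run;
```

After the step runs, the generated DATA step in the log shows which informat was chosen for each column.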
