BookmarkSubscribeRSS Feed
zsmith93
Calcite | Level 5

For starts, yes my variable is misspelled...

 

Also, I am extremely novice at SAS and SQL so I'm not sure what I'm doing wrong.

 

Can someone enlighten me what I'm doing wrong? I thought that making the input with quotation marks would make it a character type; or is there something like string(n) for SAS?

 

proc sql;
alter table collegepayback
add Classifciation varchar(10);
run;

 

proc sql;
update collegepayback
SET Classifciation =
CASE
When m_major = 'Aerospace Engineering'
or m_major = 'Biology'
or m_major = 'Chemical Engineering'
or m_major = 'Chemistry'
or m_major = 'Computer Engineering'
or m_major = 'Computer Science'
or m_major = 'Geography'
or m_major = 'Geology'
or m_major = 'Industrial Engineering'
or m_major = 'Information Technology'
or m_major = 'Math'
or m_major = 'Mechanical Engineering'
or m_major = 'Physics'
THEN Classifciation = 'STEM'
ELSE Classifciation = 'nonSTEM'
END;
run;

7 REPLIES 7
Reeza
Super User
You're trying to add a new column with an ALTER statement? Are you stuck with using SQL and doing it this way or do you have other options? Which part is giving you an error?
zsmith93
Calcite | Level 5

Oh, I'm using an ALTER statement because I'm more comfortable with running SQL commands than SAS commands, since my knowledge of both is atrocious but slightly better in regards to SQL, or so I believe. 

 

when I run the second query I get the title of the thread as an error:

 

'ERROR: Classification, a character column, can only be updated with a character expression.'

AMSAS
SAS Super FREQ

Remove the classifciation= from the THEN & ELSE statements

Here's an example

data collegepayback ;
	input m_major : $20. ;
cards ;
Physics
Geography
Arts
English
;
run ;


proc sql;
alter table collegepayback
add Classifciation character(10);
quit ;
 

proc sql;
update collegepayback
SET Classifciation = 
	CASE
		WHEN (m_major = 'Physics'
		   or m_major = 'Geography') then "STEM"
		ELSE "nonSTEM"
	END;
quit ;
run;
zsmith93
Calcite | Level 5

I don't understand why your third query executes whereas mine does not, am I not allowed to keep using or conditions? 

Reeza
Super User

VARCHAR(10) is not correct format. Note that @AMSAS  solution uses CHARACTER instead.

 

You should find this message in your log. 

 

 74         
 75         proc sql;
 76         alter table class
 77         add Classifciation varchar(10);
 NOTE: One or more variables were converted because the data type is not supported by the V9 engine. For more details, run with 
       options MSGLEVEL=I.
 NOTE: Table WORK.CLASS has been modified, with 6 columns.

Also, your query starts by setting CLASSIFCATION = so you don't need to specify it in the THEN statements. 

 

And you should end PROC SQL with a QUIT statement.

 

You may also want to use IN so you can avoid multiple IF statements with the OR

 

data class;
set sashelp.class;
run;

proc sql;
alter table class
add Classifciation varchar(10);
quit;

proc sql;
update class
SET Classifciation =
CASE
When (name in ("Alfred", "Jane", "Susan") )
THEN 'STEM'
ELSE  'nonSTEM'
END;
quit;

and the data step solution, which is easier IMO:

 

data want;
set class;

if name in ('Alfred', 'Jane', 'Susan') then classification = 'Stem';
else classification='NonStem';

run;
Tom
Super User Tom
Super User

In your first example the CASE statement is returning a number. The result of one of the two boolean expressions testing what value CLASSIFICATION has.  But the logic of the statement is also wrong.

 

If you don't really know either SAS or SQL probably would be much easier to learn SAS since it has a more natural way of thinking about data.  Also don't overwrite your input data (especially if you are stilling learning the language).

data collegepayback_fixed;
    set collegepayback;
    length Classification $10;
    if m_major in 
        ('Aerospace Engineering'
        ,'Biology'
        ,'Chemical Engineering'
        ,'Chemistry'
        ,'Computer Engineering'
        ,'Computer Science'
        ,'Geography'
        ,'Geology'
        ,'Industrial Engineering'
        ,'Information Technology'
        ,'Math'
        ,'Mechanical Engineering'
        ,'Physics'
        ) 
    THEN Classification = 'STEM' ;
    ELSE Classification = 'nonSTEM';
run;

If did want to keep trying to use the UPDATE statement perhaps you just want to use the WHERE clause to pick which rows to update?

Try this little example:

proc sql;
create table class as select * from sashelp.class;
alter table class add NewVar varchar(10);
update class set NewVar='STEM' where Name='Alfred';
update class set NewVar='nonSTEM' where not (Name='Alfred');
quit;

 

 

 

AMSAS
SAS Super FREQ

@zsmith93 There is another approach you can use in SAS, using a custom FORMAT and the PUTC function

Click on links for documentation.

If for example you just want to produce a report, then there is no need to generate the extra variable (classification) you could just output the m_major variable in a report and apply the custom format ($stem). This saves space in your dataset and processing time, as you don't have to run through the dataset twice to generate a report.

 

/* Create example dataset */
data collegepayback ;
	input m_major : $20. ;
cards ;
Physics
Geography
Arts
English
;
run ;

/* Create a Custom Format */
proc format ;
	value $stem 
		"Physics","Geography" = "STEM" 
		OTHER = "nonSTEM" ;
run ;


data newTable ;
	set collegepayback ;
	/* PUTC function applies the $stem format to m_major, converting it to STEM/nonSTEM */
	classification=putc(m_major,"$stem.") ;
run ;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 2823 views
  • 1 like
  • 4 in conversation