BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
VarunD
Obsidian | Level 7

Hi,

 

I am wondering if it is possible to use case when expression in Proc SQL to replace missing values of a variable based on a single condition without creating a new variable.

I know that I can work around this by creating a new variable and then dropping it later if needed but I wanted to find out if it is possible without creating a new variable.

 

For e.g. If I have to replace missing values of Car_Model variable to Unspecified  from Car_History dataset, can I do something like this ? 

 

 

Proc Sql;

create table New_Cars as

select distinct *,

case when Car_Model is missing then 'Unspecified' 

end

from Car_History;

run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

Yes, it will.

 

If you don't want that, then you have to replace 

 

select distinct *,

with a list of variables that does not include car_model (where I assume that you have variables named VARIABLE1, VARIABLE2, etc.) and the car_model is specifically left out of this list.

 

select distinct variable1,variable2,
--
Paige Miller

View solution in original post

12 REPLIES 12
Amir
PROC Star

Does the following help:

 

/* set all names beginning with 'J' to missing */
data have;
   set sashelp.class;
   
   if name =: 'J' then
      call missing(name);
run;

/* update missing values */
proc sql;
   update
      have
   set
      name = 'N/A'
   where
      missing(name)
   ;
quit;

 

Amir.

VarunD
Obsidian | Level 7
Thanks for your help. I appreciate it. It sound like it should work buy I wanted to find out if case when expression can be used in this case.
PaigeMiller
Diamond | Level 26

@VarunD wrote:

 

For e.g. If I have to replace missing values of Car_Model variable to Unspecified  from Car_History dataset, can I do something like this ? 

 

 

Proc Sql;

create table New_Cars as

select distinct *,

case when Car_Model is missing then 'Unspecified' 

end

from Car_History;

run;

 


Looks like it should run, but maybe you want this:

 

case when missing(Car_Model) then 'Unspecified' else car_model end as car_model
--
Paige Miller
VarunD
Obsidian | Level 7

Code runs but the missing values are still missing.

I get a warning that "Variable Car_Model already exists on file WORK.Car_History" 

PaigeMiller
Diamond | Level 26

Try this:

 

case when missing(Car_Model) then 'Unspecified' else car_model end as car_model1

 

If that doesn't fix the problem, then you need to provide a portion of your actual data set following these instructions: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

--
Paige Miller
VarunD
Obsidian | Level 7
But this will create a new variable car_model1, isn't it ?
PaigeMiller
Diamond | Level 26

Yes, it will.

 

If you don't want that, then you have to replace 

 

select distinct *,

with a list of variables that does not include car_model (where I assume that you have variables named VARIABLE1, VARIABLE2, etc.) and the car_model is specifically left out of this list.

 

select distinct variable1,variable2,
--
Paige Miller
VarunD
Obsidian | Level 7
Thanks a lot. That was the problem. Great catch.
Amir
PROC Star

I think the reason you're getting the 'already exists' message is because you are selecting all columns to start with using '*'.

 

Instead, try listing the columns individually and in place of 'Car_Model' use the logic suggested by others so it won't try to create a duplicate.

 

Amir.

Reeza
Super User

You can, but maybe you want the COALESCE() function instead? It goes through a list of variables and picks the first non-missing, but you can also include a string.

 

select distinct *, coalesce(car_model, 'Unspecified') as Car_Model
from car_history;

@VarunD wrote:

Hi,

 

I am wondering if it is possible to use case when expression in Proc SQL to replace missing values of a variable based on a single condition without creating a new variable.

I know that I can work around this by creating a new variable and then dropping it later if needed but I wanted to find out if it is possible without creating a new variable.

 

For e.g. If I have to replace missing values of Car_Model variable to Unspecified  from Car_History dataset, can I do something like this ? 

 

 

Proc Sql;

create table New_Cars as

select distinct *,

case when Car_Model is missing then 'Unspecified' 

end

from Car_History;

run;

 


 

aaronh
Quartz | Level 8

Hi,

 

If all you want to do here is to update the "Car_model" variable by changing missing values to 'Unspecified', you could consider using a format, such as:

 

**Make a format**;
proc format;
  value $model_miss (default = 50)
  ' ' = 'Unspecified'
  ;
run;


**Apply the format to car_model and save as a new dataset**;
data car_history2;
  set car_history;
  car_model = put(car_model, $model_miss.);
run;

VarunD
Obsidian | Level 7
Very Interesting ! Thanks a lot for your help.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 12 replies
  • 1300 views
  • 3 likes
  • 5 in conversation