Wondered if there is an easy way to recode a variable, and keep the same name and position. For example, using the attached code, I can recode variable Type in sashelp.cars, after that I want to call it Type and keep the same position. Imagine there are hundreds of variables. I don't want to list all of them to keep the order. Is there any easy way to do it with too many coding?
PROC SQL;
CREATE TABLE want AS
SELECT Make,
Model,
case when Type in ("Hybrid", "Wagon")
then "Wagon"
else type
end as type,
Origin,
DriveTrain,
MSRP,
Invoice,
EngineSize,
Cylinders,
Horsepower,
MPG_City,
MPG_Highway,
Weight,
Wheelbase,
Length
FROM SASHELP.CARS;
QUIT;
Why does it have to have the same position? Please explain.
SAS doesn't care. All of your work will be done properly regardless of position.
No specific reason to keep the position. Our dataset is always arranged by different group of variables. Just wanted to keep the recoded var in the same group. like said wondered if there is an easy wat to do it without many additional codes.
@sasecn wrote:
No specific reason to keep the position. Our dataset is always arranged by different group of variables. Just wanted to keep the recoded var in the same group. like said wondered if there is an easy wat to do it without many additional codes.
Then why spend time on it at all?
@PaigeMiller wrote:
@sasecn wrote:
No specific reason to keep the position. Our dataset is always arranged by different group of variables. Just wanted to keep the recoded var in the same group. like said wondered if there is an easy wat to do it without many additional codes.
Then why spend time on it at all?
I don't know about you but I cannot keep the details of the contents of the 100's of thousands of individual datasets I might have to examine. If they are organized rationally then they are a h*** of lot easier to work with. You can open them and look at the first few records. Poke around. See what is going on.
For a vast majority of cases there is no reason to create a new data set for such a simple activity.
When displaying the values of one or more variables with the same value and desired display then often creating a custom format and applying that will do what you want. No new data set. Flexible as a change in the definition of that format is automatically applied at us.
Example:
proc format; value $alttype "Hybrid", "Wagon" = "Wagon" ; run; Proc freq data=sashelp.cars; table type; format type $alttype.; run;
The above format only affects the given values, none of the the other values of a variable would be affected.
Note that the group counts with that format applied combine the two types as one. Most of the graphing, reporting and analysis procedures will honor this behavior.
No need to create a new data set. See the example.
You could use Proc Datasets to change the default format of variables which avoids the time of rebuilding a data set, which can be significant with large sets.
Doing that is trivial if you use normal SAS code.
data want;
set sashelp.cars;
if Type in ("Hybrid", "Wagon") then type="Wagon";
run;
And if you actually had to change the type (or the length for character variables) it is not that much harder. You will need to know the name of the FIRST variable in the existing dataset.
data want;
set sahelp.cars(keep=make--type rename=(type=oldtype));
length type $100;
set sashelp.cars(drop=type);
if oldtype in ("Hybrid", "Wagon") then type="Wagon";
else type=oldtype;
drop oldtype;
run;
Thank you!
The order of variables in a tables shouldn't matter. This is more of an "Excel" thinking. But understand the convenience a defined order can provide.
If you want to create a totally new variable and have it added at a defined location in the target table then below some options.
For a data step (the two dashes in the variable list are important!):
data test_ds;
/* column mapping */
if 0 then
do;
set sashelp.class(keep=name--age);
length new_col $3;
set sashelp.class(keep=height--weight);
end;
call missing(of _all_);
/* read the source data */
set sashelp.class;
new_col='XXX';
run;
It's a bit more involved for a SQL as there the order of variables in the SELECT clause defines the order in the created table so you need to list all the variables.
To reduce typing if there are too many variables in the source table below some "lazy" way to get there.
1. Create list of all columns in source table
options ps=max ls=max;
proc sql feedback noexec;
select *
from sashelp.class as t
;
quit;
2. Copy/paste expanded SQL found in SAS Log into Notepad++ and run plugin "Poor Man's SQL Formatter"
In SAS log:
Result in Notepad++ after running the plugin
SELECT T.Name ,T.Sex ,T.Age ,T.Height ,T.Weight FROM SASHELP.CLASS T;
3. Copy/past the formatted code from Notepad++ back into your SAS Editor and add the new column at the desired location
proc sql;
create table test_sql as
SELECT T.Name
,T.Sex
,T.Age
,'XXX' as new_col length=3
,T.Height
,T.Weight
FROM SASHELP.CLASS T;
quit;
First, if you are using DATA step to recode variables, then variable order will be preserved.
data wnt;
set sashelp.cars;
if type='Hybrid' then type='Hyb'; else
if type='Sedan' then type='Sed'; else
if type='Sports' then type='Spo'; else
if type='Truck' then type='Trk';
run;
But if you MUST use proc sql, then
proc sql noprint;
create view vcars as select
case when type='Hybrid' then 'Hyb'
when type='Sedan' then 'Sed'
when type='Sports' then 'Spo'
when type='Truck' then 'Trk'
else type
end as type
,*
from sashelp.cars;
quit;
data want;
set sashelp.cars (obs=0)
vcars;
run;
The first idea here is that creating a view is just defining a set of rules. The data is not processed until the view is used. So you are not wasting resources writing an intermediate data set file to disk. The recoding doesn't take place until the subsequent DATA step.
The second idea is using the "obs=0" option for the first dataset named in the SET statement. This will read no observations directly from sashelp.cars but the second argument instructs SAS to read all the data from VCARS. Because the SAS compiler will construct the program data vector ordering variables as they are encountered, they will be ordered as in sashelp.cars, since it is the first argument of the SET statement.
Thanks for the details!
In a case like yours, where the type and length of the variable do not change, this will do it:
data want;
set sashelp.cars;
type = ifc(type = 'Hybrid','Wagon`,type);
run;
Thanks.
Thank you for all the help! Great explanation from everyone 😀
@sasecn - Please help us by updating your post as answered if you are happy with the responses.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.