I have dataset with 200 columns. Out of that I need to replace values in original columns with modified columns which always start with _M_.
E.g. out of 200 columns in datasets, Account_ID, ACCOUNT_CLASSIFICATION_CODE ,BS_CENTRE_ID has corresponding columns as _M_ACCOUNT_ID,_M_ACCOUNT_CLASSIFICATION_CODE,_M_BS_CENTRE_ID. However, there could be any number of columns which has such corresponding _M_ pair
So we need to identify how many columns have such pairs and then replace values in original columns with modified columns.
Below is the sample Have dataset
MONTH_KEY | PERIOD_ID | ACCOUNT_ID | ACCOUNT_CLASSIFICATION_CODE | BILLING_CODE | PRODUCT_CODE | BS_CENTRE_ID | _M_ACCOUNT_ID | _M_ACCOUNT_CLASSIFICATION_CODE | _M_BS_CENTRE_ID |
202109 | 20210930 | CC 5163104001923574 | 65031 | 871 | 13334 | 161730 | MAJ12345 | 11433 | 113 |
202109 | 20210930 | CC 5163190010421759 | 66031 | V05 | 13255 | 161730 | MAJ87564 | 67845 | 345 |
202109 | 20210930 | CC 5163230005429159 | 28051 | G02 | 13253 | 161730 | JAK89167 | 98365 | 765 |
202109 | 20210930 | CC 5163202036863805 | 43151 | V05 | 13249 | 161730 | PPT11765 | 41897 | 897 |
202109 | 20210930 | CC 4715726286102320 | 12011 | 105 | 13124 | 161730 | RAM90145 | 34950 | 950 |
202109 | 20210930 | CC 4564729003710930 | 95021 | S02 | 13126 | 161730 | TPT66387 | 12854 | 684 |
And below is what is expected in Want dataset
MONTH_KEY | PERIOD_ID | ACCOUNT_ID | ACCOUNT_CLASSIFICATION_CODE | BILLING_CODE | PRODUCT_CODE | BS_CENTRE_ID |
202109 | 20210930 | MAJ12345 | 11433 | 871 | 13334 | 113 |
202109 | 20210930 | MAJ87564 | 67845 | V05 | 13255 | 345 |
202109 | 20210930 | JAK89167 | 98365 | G02 | 13253 | 765 |
202109 | 20210930 | PPT11765 | 41897 | V05 | 13249 | 897 |
202109 | 20210930 | RAM90145 | 34950 | 105 | 13124 | 950 |
202109 | 20210930 | TPT66387 | 12854 | S02 | 13126 | 684 |
I have tried using coalesce and also using rename , but somehow I am not able to find correct solution.
Any help is really appreciated
Unfortunately you didn't provide data in usable form, so i used a modified version sashelp.class. The subsequent steps extract the information of interest.
data work.class;
set sashelp.class;
length _m_Age _m_Weight 8;
_m_Age = Age + 10;
_m_Weight = round(Weight * 1.5, 0.01);
run;
data work.Variables;
set sashelp.vcolumn(where=(libname = 'WORK' and MemName = 'CLASS' and not ('_m_' =: Name)));
bob = _n_;
keep Name bob;
run;
data work.M_Variables;
set sashelp.vcolumn(where=(libname = 'WORK' and MemName = 'CLASS' and '_m_' =: Name));
length mVar $ 32;
mVar = Name;
Name = substr(Name, 4);
keep Name mVar;
run;
The first solution will not change your dataset, but will have a higher runtime, if the dataset is large.
data _null_;
set work.Variables end= jobDone;
if _n_ = 1 then do;
if 0 then set work.M_Variables;
declare hash h(dataset: 'work.M_Variables');
h.defineKey('Name');
h.defineData('mVar');
h.defineDone();
call execute('proc sql noprint; create table work.fixed as select');
end;
mVar = ' ';
if _n_ > 1 then do;
call execute(',');
end;
if h.find() = 0 then do;
call execute(catx(' ', mVar, 'as', Name));
end;
else do;
call execute(Name);
end;
if jobDone then do;
call execute(' from work.class; quit;');
end;
run;
And here's another solution modifying your dataset, because there is no undo, you should be 100% sure, that you understand what happens.
data _null_;
set work.Variables end= jobDone;
if _n_ = 1 then do;
if 0 then set work.M_Variables;
declare hash h(dataset: 'work.M_Variables');
h.defineKey('Name');
h.defineData('mVar');
h.defineDone();
end;
length
dropList $ 6600 /* 200 Variables, max length 32 chars, +1 to hold the separator */
renameList $ 14000 /* 66 chars for each pair oldname/newname incl. separators */
;
retain dropList renameList;
if h.find() = 0 then do;
dropList = catx(',', dropList, Name);
renameList = catx(' ', renameList, cats(mVar, '=', Name));
end;
if jobDone and not missing(dropList) then do;
call execute(catx(' ', 'proc sql noprint; alter table work.Class drop', dropList, '; quit;'));
call execute(catx(' ', 'proc datasets nolist library=work; modify Class; rename', renameList, ';quit;'));
end;
run;
Hello
In the first para of your question para of your question you are requesting assistance in renaming columns.
Then you want assistance if finding columns to be renamed.
Can you please be more clear?
Renaming columns is a simple process in SAS if that is what you are looking for. I have given two examples below.
proc sql;
create table class1 as
select Name as _M_Name, Sex as _M_Sex , Age as _M_Age
from sashelp.class;
quit;
data class2;
set sashelp.class(Rename=(Name=_M_Name Sex=_M_ Age=_M_Age) );
run;
Hello @Swapnil_21
Once can find the columns of any data set using SAS dictionaty table or proc contents.
I am giveng and example below using sashelp.class.
Here libname is the library (SAS HELP) where the data set memname (CLASS in the example )resides.
proc sql;
select name from dictionary.columns
where libname="SASHELP" and memname="CLASS" and memtype="DATA";
quit;
Thus once you have the column names, you can decide what columns to rename ans shown earlier.
@Swapnil_21 wrote:
So precisely my question is
1. Out of 200 column how do I know which columns start with _M_ and then remaining portion matches with actual column
2. When matches, rename them with original column.
Sample datasets provided earlier
Your second step won't work without dropping the original variable first.
Before starting to solve such task, i recommend writing the code for one m-Variable, skipping any existence checks. The most obvious, but also not working solution is:
proc datasets nolist;
modify your_dataset;
drop old_var;
rename _m_old_var = old_var;
quit;
Unfortunately proc datasets doesn't support the drop statement. Sad.
Dropping variable is possible using proc sql alter table, but the alter statement can't rename variables. Sad²
So you need both: proc sql to drop the unwanted variables and proc datasets to rename the m-variables. If the dataset is really small, you could use a data null step with call execute generating a proc sql to creating a new dataset.
Will return later, as soon as i had the time to create something working.
Unfortunately you didn't provide data in usable form, so i used a modified version sashelp.class. The subsequent steps extract the information of interest.
data work.class;
set sashelp.class;
length _m_Age _m_Weight 8;
_m_Age = Age + 10;
_m_Weight = round(Weight * 1.5, 0.01);
run;
data work.Variables;
set sashelp.vcolumn(where=(libname = 'WORK' and MemName = 'CLASS' and not ('_m_' =: Name)));
bob = _n_;
keep Name bob;
run;
data work.M_Variables;
set sashelp.vcolumn(where=(libname = 'WORK' and MemName = 'CLASS' and '_m_' =: Name));
length mVar $ 32;
mVar = Name;
Name = substr(Name, 4);
keep Name mVar;
run;
The first solution will not change your dataset, but will have a higher runtime, if the dataset is large.
data _null_;
set work.Variables end= jobDone;
if _n_ = 1 then do;
if 0 then set work.M_Variables;
declare hash h(dataset: 'work.M_Variables');
h.defineKey('Name');
h.defineData('mVar');
h.defineDone();
call execute('proc sql noprint; create table work.fixed as select');
end;
mVar = ' ';
if _n_ > 1 then do;
call execute(',');
end;
if h.find() = 0 then do;
call execute(catx(' ', mVar, 'as', Name));
end;
else do;
call execute(Name);
end;
if jobDone then do;
call execute(' from work.class; quit;');
end;
run;
And here's another solution modifying your dataset, because there is no undo, you should be 100% sure, that you understand what happens.
data _null_;
set work.Variables end= jobDone;
if _n_ = 1 then do;
if 0 then set work.M_Variables;
declare hash h(dataset: 'work.M_Variables');
h.defineKey('Name');
h.defineData('mVar');
h.defineDone();
end;
length
dropList $ 6600 /* 200 Variables, max length 32 chars, +1 to hold the separator */
renameList $ 14000 /* 66 chars for each pair oldname/newname incl. separators */
;
retain dropList renameList;
if h.find() = 0 then do;
dropList = catx(',', dropList, Name);
renameList = catx(' ', renameList, cats(mVar, '=', Name));
end;
if jobDone and not missing(dropList) then do;
call execute(catx(' ', 'proc sql noprint; alter table work.Class drop', dropList, '; quit;'));
call execute(catx(' ', 'proc datasets nolist library=work; modify Class; rename', renameList, ';quit;'));
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.