I have monhtly datasets, say data1601 to data1612. Variables in these datasets have a same suffix of the month. For example, variables in data1601 will have a suffix of 1601 and data1612 will have a suffix of 1612, such as hight1601, weight1601, income1601...
I want to drop 1601,1602 and 1612 so that I can put 12 month data in to one dataset and have these names:
hight, weight, income... How can I do that?
Thanks!
OK, it is my fault.
change in the loop:
do i=1 to nvars;
varname = varname(dsid, i);
ix = index(varname, trim(symget('suffix'))) -1;
if ix = length(trim(varname)) - 4 then
ren = combl(ren || trim(varname) || ' = ' || substr(varname,1,ix) );
end;
dsid = close(dsid);
end;
Do the monthly data sets have a variable inside that would identify the appropriate month?
How many variables are involved? A solution that may work for two or three could be too cumbersome for 50 or 60.
And if you have any say in the manner it is a better idea in the long run to have variables with the same characteristics and meaning to have the same name. Then add additional fields to hold the descriptive data such as month, sales, location and such.
You can create a macro program to create a RENAME statment supplying list of prefixes only:
%macro ren(prefixes=, mon=);
%global ren;
%let nvars = %sysfunc( countw("&prefixes"));
%let ren = ;
%do i=1 %to &nvars;
%let pfx = %scan(&prefixses, &i);
%let ren = %trim(&ren) &pfx.&mon = &pfx;
%end;
%mend ren;
%ren(prefixes=hight weight income ..., mon=1601);
data want / view=want;
set have(RENAME=(&ren));
...your code ...
run;
Do you mean that you have 200 different prefixes ?
In that case there is another way to create the REN macro variable:
%let suffix = 1601;
data _NULL_;
length ren $5000 varname $32 ;
dsid = open("have"); /* replace "have" to your input dataset name in a format of lib.dsname */
if dsid then do;
nvars = attrn(dsid , 'nvars');
do i=1 to nvars;
varname = varname(dsid, i);
if index(varname, trim(symget('suffix'))) = length(trim(varname)) - 4 +1 then /* 4 = length of "1601" */
ren = combl(ren || trim(varname) || "&suffix" || ' = ' || trim(varname) );
end;
dsid = close(dsid);
end;
call symput('REN' , ren);
run;
data want / view=want;
set have(rename=(&ren));
... your code ...
run;
what is COMBL? seems I don't know this function.
sorry. it should be COMPBL (compress with blank);
Thank you so much! I think the logic is good but it is little off the target. This is what I have but I think we can fix it:
%put &ren;
crea16011601 = crea1601 ncwb16011601 = ncwb1601 bbtt16011601 = bbtt1601 aclt16011601 = aclt1601 nchb16011601 = nchb1601 mobo16011601 = mobo1601
mcro16011601 = mcro1601 ncro16011601 = ncro1601 aclh16011601 = aclh1601 ncbu16011601 = ncbu1601 mbdq16011601 = mbdq1601
it works if I modify it to
ren = compbl(ren || trim(varname) || ' = ' || substr(varname,1,length(varname)-4));
crea1601 = crea ncwb1601 = ncwb bbtt1601 = bbtt aclt1601 = aclt nchb1601 = nchb mobo1601 = mobo mcro1601 = mcro ncro1601 = ncro aclh1601 = aclh ncbu1601 = ncbu mbdq1601 = mbdq bart1601 = bart ruti1601 = ruti ...........
Thanks you soooooooooo much!
OK, it is my fault.
change in the loop:
do i=1 to nvars;
varname = varname(dsid, i);
ix = index(varname, trim(symget('suffix'))) -1;
if ix = length(trim(varname)) - 4 then
ren = combl(ren || trim(varname) || ' = ' || substr(varname,1,ix) );
end;
dsid = close(dsid);
end;
@jackblack wrote:
seems I have to give 200 variable names in the list when I call ren? Is any way that we can create the list automatically?
This is a point where I ask how are you getting this data? If you are reading it from an external data source then you should be able to read it into sensible variables without the month information attached to the variable name and have a variable that contains that information. If someone else created the SAS data sets you may want to backtrack and see how those names come about. If there is not a real good reason for that naming, which from this topic sounds not to be the case, then you may be able to head it off earlier in the process.
There may also be a question of 200 variables in general.
That sounds like an very bad and time consuming process. Why is the process set up in this way. What does your data transfer agreement say? How are you using the data going forward? I would adivise that the data be manipulated at import time into a usable format, i.e. havinig month information in a column, and a column for each of the paramters, and then all data set together:
MONTH HEIGHT WEIGHT INCOME
1016 ...
This will aid you in any further processing. To further this, is this clinical data, which it appears so from the headings. If so, why are you not using one of the CDIS models for data - SEND, SDTM, ADaM etc. This is industry standard nowadays.
Mainly I absolutly agree with the comments given by @ballardw and @RW9.
The only situation, I can think of, to leave the variable names as are, is
in case of dealing with merged datasets of different months and have a seperate variable per month.
Even though, having the MONTH as a column in the datasets, instead of - as a part of the variable name -
enables all analysis and reports with more simple code.
data data1601;
height1601=1;
weight1601=1;
output;
run;
data data1612;
height1612=2;
wight1612=2;
output;
run;
data _null_;
set sashelp.vcolumn(where=(libname='WORK')) end=last;
by memname;
if _n_=1 then call execute('proc datasets library=work nolist nodetails;');
if first.memname then call execute('modify '||memname||';rename ');
call execute(name||'='||compress(name,'0123456789'));
if last.memname then call execute(';');
if last then call execute('quit;');
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.