BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jackblack
Calcite | Level 5

I have monhtly datasets, say data1601 to data1612.  Variables in these datasets have a same suffix  of the month.  For example, variables in data1601 will have a suffix of 1601 and data1612 will have a suffix of 1612, such as hight1601, weight1601, income1601...

I want to drop 1601,1602 and 1612 so that I can put 12 month data in to one dataset and have these names:

hight, weight, income...  How can I do that?

 

Thanks!

 

jackblack1688@gmail.com

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

OK, it is my fault.

 

change in the loop:

 

do i=1 to nvars;

               varname = varname(dsid, i);

               ix = index(varname, trim(symget('suffix'))) -1;

               if  ix = length(trim(varname)) - 4  then      

                   ren = combl(ren || trim(varname) || ' = ' || substr(varname,1,ix) );

          end;

          dsid = close(dsid);

      end;

  

View solution in original post

17 REPLIES 17
ballardw
Super User

Do the monthly data sets have a variable inside that would identify the appropriate month?

 

How many variables are involved? A solution that may work for two or three could be too cumbersome for 50 or 60.

 

And if you have any say in the manner it is a better idea in the long run to have variables with the same characteristics and meaning to have the same name. Then add additional fields to hold the descriptive data such as month, sales, location and such.

jackblack
Calcite | Level 5
it has 200 variables.
Do the monthly data sets have a variable inside that would identify the appropriate month? YES
Shmuel
Garnet | Level 18

You can create a macro program to create a RENAME statment supplying list of prefixes only:

 

%macro ren(prefixes=,  mon=);

     %global ren;

     %let nvars = %sysfunc( countw("&prefixes"));

     %let ren = ;

      %do i=1 %to &nvars;

              %let pfx = %scan(&prefixses, &i);

              %let ren = %trim(&ren) &pfx.&mon = &pfx; 

      %end;

%mend ren;

%ren(prefixes=hight weight income ...,  mon=1601);

 

data want / view=want;

  set have(RENAME=(&ren));

   ...your code ...

run;

jackblack
Calcite | Level 5
seems I have to give 200 variable names in the list when I call ren? Is any way that we can create the list automatically?
Shmuel
Garnet | Level 18

Do you mean that you have 200 different prefixes ?

 

In that case there is another way to create the REN macro variable:

 

%let suffix = 1601;

data _NULL_;

       length ren $5000  varname $32 ;

       dsid = open("have");   /* replace "have" to your input dataset name in a format of lib.dsname */

       if dsid then do;

          nvars = attrn(dsid , 'nvars');

          do i=1 to nvars;

               varname = varname(dsid, i);

               if  index(varname, trim(symget('suffix'))) = length(trim(varname)) - 4 +1 then       /* 4 = length of "1601" */

                   ren = combl(ren || trim(varname) || "&suffix" || ' = ' || trim(varname) );

          end;

          dsid = close(dsid);

      end;

      call symput('REN' , ren);

run;

 

data want / view=want;

  set have(rename=(&ren));

       ... your code ...

run;

               

 

 

          

jackblack
Calcite | Level 5

what is COMBL?  seems I don't know this function.

Shmuel
Garnet | Level 18

sorry. it should be COMPBL (compress with blank); 

jackblack
Calcite | Level 5

Thank you so much!  I think the logic is good but it is little off the target.  This is what I have but I think we can fix it:

 

 %put &ren;

crea16011601 = crea1601 ncwb16011601 = ncwb1601 bbtt16011601 = bbtt1601 aclt16011601 = aclt1601 nchb16011601 = nchb1601 mobo16011601 = mobo1601

mcro16011601 = mcro1601 ncro16011601 = ncro1601 aclh16011601 = aclh1601 ncbu16011601 = ncbu1601 mbdq16011601 = mbdq1601

 

jackblack
Calcite | Level 5

it works if I modify it to

 

ren = compbl(ren || trim(varname) || ' = ' || substr(varname,1,length(varname)-4));

 

crea1601 = crea ncwb1601 = ncwb bbtt1601 = bbtt aclt1601 = aclt nchb1601 = nchb mobo1601 = mobo mcro1601 = mcro ncro1601 = ncro aclh1601 = aclh ncbu1601 = ncbu mbdq1601 = mbdq bart1601 = bart ruti1601 = ruti ...........

 

Thanks you soooooooooo much!

 

Shmuel
Garnet | Level 18

OK, it is my fault.

 

change in the loop:

 

do i=1 to nvars;

               varname = varname(dsid, i);

               ix = index(varname, trim(symget('suffix'))) -1;

               if  ix = length(trim(varname)) - 4  then      

                   ren = combl(ren || trim(varname) || ' = ' || substr(varname,1,ix) );

          end;

          dsid = close(dsid);

      end;

  

ballardw
Super User

@jackblack wrote:
seems I have to give 200 variable names in the list when I call ren? Is any way that we can create the list automatically?

This is a point where I ask how are you getting this data? If you are reading it from an external data source then you should be able to read it into sensible variables without the month information attached to the variable name and have a variable that contains that information. If someone else created the SAS data sets you may want to backtrack and see how those names come about. If there is not a real good reason for that naming, which from this topic sounds not to be the case, then you may be able to head it off earlier in the process.

 

There may also be a question of 200 variables in general.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

That sounds like an very bad and time consuming process.  Why is the process set up in this way.  What does your data transfer agreement say?  How are you using the data going forward?  I would adivise that the data be manipulated at import time into a usable format, i.e. havinig month information in a column, and a column for each of the paramters, and then all data set together:

MONTH      HEIGHT    WEIGHT     INCOME

1016           ...

 

This will aid you in any further processing.  To further this, is this clinical data, which it appears so from the headings.  If so, why are you not using one of the CDIS models for data - SEND, SDTM, ADaM etc.  This is industry standard nowadays.

Shmuel
Garnet | Level 18

Mainly I absolutly agree with the comments given by @ballardw and @RW9.

 

The only situation, I can think of, to leave the variable names as are, is

in case of dealing with merged datasets of different months and have a seperate variable per month.

 

Even though, having the MONTH as a column in the datasets, instead of - as a part of the variable name -

enables all analysis and reports with more simple code.

 

Ksharp
Super User
data data1601;
 height1601=1;
 weight1601=1;
 output;
run;
data data1612;
 height1612=2;
 wight1612=2;
 output;
run;
data _null_;
 set sashelp.vcolumn(where=(libname='WORK')) end=last;
 by memname;
 if _n_=1 then call execute('proc datasets library=work nolist nodetails;');
 if first.memname then call execute('modify '||memname||';rename ');
 call execute(name||'='||compress(name,'0123456789'));
 if last.memname then call execute(';');
 if last then call execute('quit;');
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 17 replies
  • 5826 views
  • 2 likes
  • 5 in conversation