DATA Step, Macro, Functions and more

how to remove all variable names from datasets

Accepted Solution Solved
Reply
Contributor
Posts: 37
Accepted Solution

how to remove all variable names from datasets

Hi,

I need to PROC APPEND a few hundred datasets, but the variable names in all datasets are different.  is there a way to remove all variable names so they all become VAR1, VAR2, VAR3...VAR50?

dataset 1 looks like this:

name     age     ID

------------------------

Cindy     15     101

Kenny    10     102

...

dataset 2 looks like this:

firstname     yrs     Identification

------------------------------------

Liz               13     103

Harry           17     104

...

there is a problem with PROC APPEND when variables are named differently.  is there a way to remove all variable names from all datasets, so all variables  become Var1, Var2...only?

Thanks!


Accepted Solutions
Solution
‎04-13-2012 01:30 PM
Valued Guide
Posts: 2,175

Re: how to remove all variable names from datasets

Use SQL. UNION ALL without  corresponding

THen column order is used rather than name

View solution in original post


All Replies
Contributor
Posts: 37

Re: how to remove all variable labels from datasets

so the final appended dataset can become?

VAR1      VAR2      VAR3

-----------------------------------

Cindy     15     101

Kenny    10     102

...

Liz         13     103

Harry     17     104

...

i can label all variables after appending all datasets.  any suggestions?

thanks.

Frequent Contributor
Posts: 138

Re: how to remove all variable labels from datasets

Hi, You can append the datasets and then can apply label.I have tried and i don't find any problem.

Can you elaborate what is the issue if we append with different labels.

Contributor
Posts: 37

Re: how to remove all variable labels from datasets

sorry...i meant variable names instead.  i try to append all datasets that have different variable names.  not labeling issue.

Super User
Posts: 9,687

Re: how to remove all variable labels from datasets

data class1;
  set sashelp.class;
  label age='AGE GROUP';
run;

data class2;
  set sashelp.class;
  label sex='Gender';
run;


data _null_;
 set sashelp.vmember(keep=memname) end=last;
 if _n_ eq 1 then call execute('proc datasets library=work nolist;');
 call execute('modify '||trim(memname)||'; attrib _all_ label=" ";');
 if last then call execute('quit;');
run;





Ksharp

Super Contributor
Posts: 349

Re: how to remove all variable labels from datasets

Hi Cyndia,

Try this code...Hope it helps.

data one;

input produce $   weight        bags;

cards;

carrots     10              500

potatos     5               200

garlic        2               100

;

run;

data two;

input fruit  $   wt  containers;

cards;

apples     15              100

pears        9                20

;

run;

%macro test;

proc sql;

      create table libtables as select memname from sashelp.vtable

      where libname eq %upcase("work");

      quit;

        proc sql;

select count(*) into :i from libtables;

select memname into :j1- :j%left(&i)  from libtables;

quit;

%do ii=1 %to %left(&i);

%global renlstⅈ

proc sql noprint;

   select compress(name || '= v_'||compress(left(put(varnum,5.))))  into: renlst&ii separated by ' '

       from dictionary.columns

       where libname='WORK' and memname="&&j&ii.";

         

quit;

%end;

%mend test;

%test;

%put &renlst1 &renlst2 ;

data want;

set one(rename=(&renlst1)) two(rename=(&renlst2));

run;

Thanks,

Shiva

Super Contributor
Posts: 1,636

Re: how to remove all variable labels from datasets

you don't have to have same labels to run proc append as long as variable names are the same.

Super Contributor
Posts: 264

Re: how to remove all variable labels from datasets

Hi Cyndia,

it seems, that you are mixing up the terms "label" and "name".

Assuming, that the datasets are stored in the library "bob":

proc sql noprint;

       create table work.UglyColumns as

              select MemName as DatasetName, Name as Variable, VarNum as Position

              from sashelp.vcolumn

              where lowcase(LibName) = 'bob'

              order by DatasetName, VarNum

       ;

quit;

data _null_;

       set work.UglyColumns end=done;

       by DatasetName;

       if _n_ = 1 then do;

              call execute('proc datasets library=bob nodetails nolist;');

       end;

       if first.DatasetName then do;

              call execute('modify ' || DatasetName || ';');

              call execute('rename ');

       end;

       call execute(catt(Variable, ' = Var', Position));

       if last.DatasetName then do;

              call execute(';'); /* rename-statement */

       end;

       if done then do;

              call execute('run;');

       end;

run;

Contributor
Posts: 37

Re: how to remove all variable labels from datasets

Sorry, I meant variable names are all different.  

Solution
‎04-13-2012 01:30 PM
Valued Guide
Posts: 2,175

Re: how to remove all variable names from datasets

Use SQL. UNION ALL without  corresponding

THen column order is used rather than name

Respected Advisor
Posts: 3,899

Re: how to remove all variable names from datasets

As Peter C suggests a SQL UNION ALL should do the trick IF your variables with different names but the same meaning are in ALL data sets in the same order.

As you're having hundreds of data sets consider generating the SQL code like done in code below.

/* create test data */
data ds1 ds10 ds5;
   input name $ age ID $;
   datalines;
Cindy     15     101
Kenny    10     102
;
run;
data ds2 ds7 ds4;
   input firstname $ yrs Identification $;
   datalines;
Liz               13     103
Harry           17     104
;
run;

/* DataSets: View containing all data sets for concatenation */
proc sql noprint;
   create view DataSets as
      select cats(libname,'.',memname) length=65 format=$65. as table
      from dictionary.tables
      where libname='WORK' and memname like 'DS%'
    ;
quit;

/* Variable mapping for output data set (="want") */
data mapping;
   attrib
      Name  length=$20 label='Name'
      Age   length=8   label='Age'
      ID    length=$10 label='ID'
   ;
   call missing(of _all_);
   stop;
run;

/* create concatenation code and write it to temporary file */
filename mycode temp;
data _null_;
/*   file print;*/
   file mycode;
   set DataSets end=last;
   if _n_=1 then
   do;
      put
         @1  'proc sql;'
         /@3 'create table want as'
         /@6 'select * from mapping'
         ;
   end;
   put
      @6   'union all'
      /@6  'select * from ' table
      ;
   if last then
   do;
      put
         @3 ';'
        /@1 'quit;'
        ;  
   end;
run;

/* include temporary file so that code gets executed */
%include mycode;

Contributor
Posts: 37

Re: how to remove all variable names from datasets

Yes, UNION ALL is a good method to concatenate multiple datasets that have the same attributes and the same order.

Smiley Happy

Contributor
Posts: 37

Re: how to remove all variable names from datasets

the datasets are already in a library.  I tried to make the code to UNION ALL the datasets in my library, but encountered errors.

libname temp 'c:\test';

run;

/* create a list of the datasets */

filename indata pipe 'dir C:\all_datasets /b';

data work.list;

length fname $60;

infile indata truncover;

input fname $60.;

call symput ('num_files',_n_);

run;

/*using a DATA _NULL_ step, we can now read the

data set names in our database and save them in macro

variables */

data _null_;

set work.list;

call

symput('DataSet'||left(_n_),memname);

run;

/* DataSets: View containing all data sets for concatenation */

proc sql noprint;

   create view DataSets as

      select cats(libname,'.',memname) length=65 format=$65. as table

      from dictionary.tables

      where libname='temp' and memname  like _&       /* my dataset names are like this _384_abc_*   not sure of this line   */

    ;

quit;

/* Variable mapping for output data set (="want") */

data mapping;

   attrib

ID_number    length=$15    label='ID Number'

First        length=$25    label='First Name'

...

   ;

  call missing(of _all_);

   stop;

run;

/* create concatenation code and write it to temporary file */

filename alldata temp;

data _null_;

/*   file print;*/

   file alldata;

   set work.list end=last;

   if _n_=1 then

   do;

      put

         @1  'proc sql;'

         /@3 'create table want as'

         /@6 'select * from mapping'

         ;

   end;

   put

      @6   'union all'

      /@6  'select * from ' table      /*  got message "variable table is uninitialized, why? */

      ;

   if last then

   do;

      put

         @3 ';'

        /@1 'quit;'

        ; 

   end;

run;

/* include temporary file so that code gets executed */

%include alldata;

Respected Advisor
Posts: 3,899

Re: how to remove all variable names from datasets

"where libname='temp' and memname  like _&       /* my dataset names are like this _384_abc_*   not sure of this line   */"

Very often values in dictionary tables are in uppercase. This is also true for "libname" and "memname".

If I understand your dataset naming pattern correct then all table names start with "_384_abc_" followed by 0 or more other characters (eg: _384_abc_somthing).

In SQL the character '%' is a wildcard for n characters, '_' is a wildcard for exactly one character. But as you have actually underscores in your name as a character, we can't use '_' as this would stand for any character not only an underscore.

One can mask these special characters (escape '<escape character>).

http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473693.htm

That's how I believe your line of code should look like in order for things to work:

where libname ='TEMP' and memname like '\_384\_ABC\_%'  escape '\'

Valued Guide
Posts: 2,175

Re: how to remove all variable names from datasets

Patrick is right these data sets will have uppercase names. The physical names are normally lower case.  So why would you want read the physical directory when the sashelp.vmember view would provide an already prepared list ?

I don't know if anyone has checked but I think there is a limit to the number of tables read in an SQL statement! SELECT use to be limited to 16 and that was doubled when SAS8 came in.  I expect it may have more than doubled since then. However, it might not perform for "100s" of tables.

Good luck.

Peter

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 14 replies
  • 2769 views
  • 4 likes
  • 8 in conversation