- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I need to PROC APPEND a few hundred datasets, but the variable names in all datasets are different. is there a way to remove all variable names so they all become VAR1, VAR2, VAR3...VAR50?
dataset 1 looks like this:
name age ID
------------------------
Cindy 15 101
Kenny 10 102
...
dataset 2 looks like this:
firstname yrs Identification
------------------------------------
Liz 13 103
Harry 17 104
...
there is a problem with PROC APPEND when variables are named differently. is there a way to remove all variable names from all datasets, so all variables become Var1, Var2...only?
Thanks!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Use SQL. UNION ALL without corresponding
THen column order is used rather than name
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
so the final appended dataset can become?
VAR1 VAR2 VAR3
-----------------------------------
Cindy 15 101
Kenny 10 102
...
Liz 13 103
Harry 17 104
...
i can label all variables after appending all datasets. any suggestions?
thanks.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi, You can append the datasets and then can apply label.I have tried and i don't find any problem.
Can you elaborate what is the issue if we append with different labels.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
sorry...i meant variable names instead. i try to append all datasets that have different variable names. not labeling issue.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data class1; set sashelp.class; label age='AGE GROUP'; run; data class2; set sashelp.class; label sex='Gender'; run; data _null_; set sashelp.vmember(keep=memname) end=last; if _n_ eq 1 then call execute('proc datasets library=work nolist;'); call execute('modify '||trim(memname)||'; attrib _all_ label=" ";'); if last then call execute('quit;'); run;
Ksharp
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi Cyndia,
Try this code...Hope it helps.
data one;
input produce $ weight bags;
cards;
carrots 10 500
potatos 5 200
garlic 2 100
;
run;
data two;
input fruit $ wt containers;
cards;
apples 15 100
pears 9 20
;
run;
%macro test;
proc sql;
create table libtables as select memname from sashelp.vtable
where libname eq %upcase("work");
quit;
proc sql;
select count(*) into :i from libtables;
select memname into :j1- :j%left(&i) from libtables;
quit;
%do ii=1 %to %left(&i);
%global renlstⅈ
proc sql noprint;
select compress(name || '= v_'||compress(left(put(varnum,5.)))) into: renlst&ii separated by ' '
from dictionary.columns
where libname='WORK' and memname="&&j&ii.";
quit;
%end;
%mend test;
%test;
%put &renlst1 &renlst2 ;
data want;
set one(rename=(&renlst1)) two(rename=(&renlst2));
run;
Thanks,
Shiva
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
you don't have to have same labels to run proc append as long as variable names are the same.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi Cyndia,
it seems, that you are mixing up the terms "label" and "name".
Assuming, that the datasets are stored in the library "bob":
proc sql noprint;
create table work.UglyColumns as
select MemName as DatasetName, Name as Variable, VarNum as Position
from sashelp.vcolumn
where lowcase(LibName) = 'bob'
order by DatasetName, VarNum
;
quit;
data _null_;
set work.UglyColumns end=done;
by DatasetName;
if _n_ = 1 then do;
call execute('proc datasets library=bob nodetails nolist;');
end;
if first.DatasetName then do;
call execute('modify ' || DatasetName || ';');
call execute('rename ');
end;
call execute(catt(Variable, ' = Var', Position));
if last.DatasetName then do;
call execute(';'); /* rename-statement */
end;
if done then do;
call execute('run;');
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Sorry, I meant variable names are all different. andreas_lds is right about the mix-up.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Use SQL. UNION ALL without corresponding
THen column order is used rather than name
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
As Peter C suggests a SQL UNION ALL should do the trick IF your variables with different names but the same meaning are in ALL data sets in the same order.
As you're having hundreds of data sets consider generating the SQL code like done in code below.
/* create test data */
data ds1 ds10 ds5;
input name $ age ID $;
datalines;
Cindy 15 101
Kenny 10 102
;
run;
data ds2 ds7 ds4;
input firstname $ yrs Identification $;
datalines;
Liz 13 103
Harry 17 104
;
run;
/* DataSets: View containing all data sets for concatenation */
proc sql noprint;
create view DataSets as
select cats(libname,'.',memname) length=65 format=$65. as table
from dictionary.tables
where libname='WORK' and memname like 'DS%'
;
quit;
/* Variable mapping for output data set (="want") */
data mapping;
attrib
Name length=$20 label='Name'
Age length=8 label='Age'
ID length=$10 label='ID'
;
call missing(of _all_);
stop;
run;
/* create concatenation code and write it to temporary file */
filename mycode temp;
data _null_;
/* file print;*/
file mycode;
set DataSets end=last;
if _n_=1 then
do;
put
@1 'proc sql;'
/@3 'create table want as'
/@6 'select * from mapping'
;
end;
put
@6 'union all'
/@6 'select * from ' table
;
if last then
do;
put
@3 ';'
/@1 'quit;'
;
end;
run;
/* include temporary file so that code gets executed */
%include mycode;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Yes, UNION ALL is a good method to concatenate multiple datasets that have the same attributes and the same order.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
the datasets are already in a library. I tried to make the code to UNION ALL the datasets in my library, but encountered errors.
libname temp 'c:\test';
run;
/* create a list of the datasets */
filename indata pipe 'dir C:\all_datasets /b';
data work.list;
length fname $60;
infile indata truncover;
input fname $60.;
call symput ('num_files',_n_);
run;
/*using a DATA _NULL_ step, we can now read the
data set names in our database and save them in macro
variables */
data _null_;
set work.list;
call
symput('DataSet'||left(_n_),memname);
run;
/* DataSets: View containing all data sets for concatenation */
proc sql noprint;
create view DataSets as
select cats(libname,'.',memname) length=65 format=$65. as table
from dictionary.tables
where libname='temp' and memname like _& /* my dataset names are like this _384_abc_* not sure of this line */
;
quit;
/* Variable mapping for output data set (="want") */
data mapping;
attrib
ID_number length=$15 label='ID Number'
First length=$25 label='First Name'
...
;
call missing(of _all_);
stop;
run;
/* create concatenation code and write it to temporary file */
filename alldata temp;
data _null_;
/* file print;*/
file alldata;
set work.list end=last;
if _n_=1 then
do;
put
@1 'proc sql;'
/@3 'create table want as'
/@6 'select * from mapping'
;
end;
put
@6 'union all'
/@6 'select * from ' table /* got message "variable table is uninitialized, why? */
;
if last then
do;
put
@3 ';'
/@1 'quit;'
;
end;
run;
/* include temporary file so that code gets executed */
%include alldata;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
"where libname='temp' and memname like _& /* my dataset names are like this _384_abc_* not sure of this line */"
Very often values in dictionary tables are in uppercase. This is also true for "libname" and "memname".
If I understand your dataset naming pattern correct then all table names start with "_384_abc_" followed by 0 or more other characters (eg: _384_abc_somthing).
In SQL the character '%' is a wildcard for n characters, '_' is a wildcard for exactly one character. But as you have actually underscores in your name as a character, we can't use '_' as this would stand for any character not only an underscore.
One can mask these special characters (escape '<escape character>).
http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473693.htm
That's how I believe your line of code should look like in order for things to work:
where libname ='TEMP' and memname like '\_384\_ABC\_%' escape '\'
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Patrick is right these data sets will have uppercase names. The physical names are normally lower case. So why would you want read the physical directory when the sashelp.vmember view would provide an already prepared list ?
I don't know if anyone has checked but I think there is a limit to the number of tables read in an SQL statement! SELECT use to be limited to 16 and that was doubled when SAS8 came in. I expect it may have more than doubled since then. However, it might not perform for "100s" of tables.
Good luck.
Peter