I want to append all the datasets I have in my WORK library together. They all have the same column names, but they start on row 6. I have too many datasets to delete first 5 rows with a macro. Is there a good shortcut I can use for this, similar to the code I use to append them to 1 dataset:
PROC SQL;
SELECT distinct MEMNAME INTO : MEMNAMES SEPARATED BY ' ' from dictionary.tables where libname='WORK';
quit;
DATA full;
SET &MEMNAMES.;
RUN;
include the translate function to convert the spaces to underscores. e.g.:
data test1;
input (col1-col3) (&$);
cards;
kjkj jkjk ljknl
kjcio doik roukg
kjor ytiukk lkjf
ruie eooik piijmf
item 1 item 2 item 3
1 2 3
4 5 6
;
data test2;
input (col1-col3) (&$);
cards;
kjkj jkjk ljknl
kjcio doik roukg
kjor ytiukk lkjf
ruie eooik piijmf
item 1 item 2 item 3
7 8 9
2 2 2
;
data _null_;
array namelist $32. col1-col3;
set test1(firstobs=5 obs=5);
length names $32000;
do over namelist;
names=catx(' ',names,vname(namelist),'=',
translate(strip(vvalue(namelist)),'_',' '));
end;
call symput('renames',names);
run;
%put &renames.;
proc sql;
create table dslist as
select memname from dictionary.tables
where libname='WORK'
;
quit;
data _null_;
call execute ('data want; set ') ;
do until (done);
set dslist end=done;
call execute (memname);
call execute("(firstobs=6 rename=(&renames.))");
end;
call execute ('; run;') ;
stop;
run;
Art, CEO, AnalystFinder.com
Sure. Let's start with something similar to what you used before:
proc sql;
create table dslist as
select memname from dictionary.tables
where libname='WORK';
quit;
Instead of creating a macro variable, this creates a data set holding the names of your data sets. Then construct and execute the rest of the program using:
data _null_;
call execute ('data want; set ') ;
do until (done);
set dslist end=done;
call execute (memname);
call execute('(firstobs=6)');
end;
call execute ('; run;') ;
stop;
run;
Thank you - almost got it. It is selecting the first row as the variable names. Can I edit this to select row 5 as variable names (firstobs is still at 6)?
If all of your files have the same variables, you could just modify @Astounding's code as follows:
data test1;
input (col1-col3) ($);
cards;
kjkj jkjk ljknl
kjcio doik roukg
kjor ytiukk lkjf
ruie eooik piijmf
item1 item2 item3
1 2 3
4 5 6
;
data test2;
input (col1-col3) ($);
cards;
kjkj jkjk ljknl
kjcio doik roukg
kjor ytiukk lkjf
ruie eooik piijmf
item1 item2 item3
7 8 9
2 2 2
;
data _null_;
array namelist $32. col1-col3;
set test1(firstobs=5 obs=5);
length names $32000;
do over namelist;
names=catx(' ',names,vname(namelist),'=',vvalue(namelist));
end;
call symput('renames',names);
run;
proc sql;
create table dslist as
select memname from dictionary.tables
where libname='WORK'
;
quit;
data _null_;
call execute ('data want; set ') ;
do until (done);
set dslist end=done;
call execute (memname);
call execute("(firstobs=6 rename=(&renames.))");
end;
call execute ('; run;') ;
stop;
run;
Of course, that would have to be expanded if the problem is more complex.
Art, CEO, AnalystFinder.com
Yes! Instead of:
array namelist $32. col1-col3;
col1-col3 should be replaced with your actual variable names
Art, CEO, AnalystFinder.com
Ah, it is because the actual variable names in line 5 have spaces...
include the translate function to convert the spaces to underscores. e.g.:
data test1;
input (col1-col3) (&$);
cards;
kjkj jkjk ljknl
kjcio doik roukg
kjor ytiukk lkjf
ruie eooik piijmf
item 1 item 2 item 3
1 2 3
4 5 6
;
data test2;
input (col1-col3) (&$);
cards;
kjkj jkjk ljknl
kjcio doik roukg
kjor ytiukk lkjf
ruie eooik piijmf
item 1 item 2 item 3
7 8 9
2 2 2
;
data _null_;
array namelist $32. col1-col3;
set test1(firstobs=5 obs=5);
length names $32000;
do over namelist;
names=catx(' ',names,vname(namelist),'=',
translate(strip(vvalue(namelist)),'_',' '));
end;
call symput('renames',names);
run;
%put &renames.;
proc sql;
create table dslist as
select memname from dictionary.tables
where libname='WORK'
;
quit;
data _null_;
call execute ('data want; set ') ;
do until (done);
set dslist end=done;
call execute (memname);
call execute("(firstobs=6 rename=(&renames.))");
end;
call execute ('; run;') ;
stop;
run;
Art, CEO, AnalystFinder.com
I can't see what you are looking at. You will have to show me the first 6 observations from one of the data sets for me to participate in this. (I wouldn't need all the variables, but 3 or 4 would be necessary.)
In general, it would not be a bad idea to do this as two separate steps ... combine all the data values as one step, and go back and rename variables as a second step. As Art indicated, it can be automated but I can't see what you are looking at.
This is a question the justifies use of global options firstobs=, as in:
PROC SQL;
SELECT distinct MEMNAME INTO : MEMNAMES SEPARATED BY ' ' from dictionary.tables where libname='WORK';
quit;
options firstobs=6;
DATA full;
SET &MEMNAMES.;
RUN;
options firstobs=1;
The firstobs=6 option will be applied to every data set subsequently being read in. After the successful append, reset option to its default (firstobs=1).
firstobs=6 is working but it is still labeling the columns based on some text in A1 and just default excel column letters B..C..D.... for the rest. firstobs=1 is not changing the column labels to variable names in row 5.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.