I want to append all the datasets I have in my WORK library together. They all have the same column names, but they start on row 6. I have too many datasets to delete first 5 rows with a macro. Is there a good shortcut I can use for this, similar to the code I use to append them to 1 dataset:
PROC SQL;
SELECT distinct MEMNAME INTO : MEMNAMES SEPARATED BY ' ' from dictionary.tables where libname='WORK';
quit;
DATA full;
SET &MEMNAMES.;
RUN;
include the translate function to convert the spaces to underscores. e.g.:
data test1; input (col1-col3) (&$); cards; kjkj jkjk ljknl kjcio doik roukg kjor ytiukk lkjf ruie eooik piijmf item 1 item 2 item 3 1 2 3 4 5 6 ; data test2; input (col1-col3) (&$); cards; kjkj jkjk ljknl kjcio doik roukg kjor ytiukk lkjf ruie eooik piijmf item 1 item 2 item 3 7 8 9 2 2 2 ; data _null_; array namelist $32. col1-col3; set test1(firstobs=5 obs=5); length names $32000; do over namelist; names=catx(' ',names,vname(namelist),'=', translate(strip(vvalue(namelist)),'_',' ')); end; call symput('renames',names); run; %put &renames.; proc sql; create table dslist as select memname from dictionary.tables where libname='WORK' ; quit; data _null_; call execute ('data want; set ') ; do until (done); set dslist end=done; call execute (memname); call execute("(firstobs=6 rename=(&renames.))"); end; call execute ('; run;') ; stop; run;
Art, CEO, AnalystFinder.com
Sure. Let's start with something similar to what you used before:
proc sql;
create table dslist as
select memname from dictionary.tables
where libname='WORK';
quit;
Instead of creating a macro variable, this creates a data set holding the names of your data sets. Then construct and execute the rest of the program using:
data _null_;
call execute ('data want; set ') ;
do until (done);
set dslist end=done;
call execute (memname);
call execute('(firstobs=6)');
end;
call execute ('; run;') ;
stop;
run;
Thank you - almost got it. It is selecting the first row as the variable names. Can I edit this to select row 5 as variable names (firstobs is still at 6)?
If all of your files have the same variables, you could just modify @Astounding's code as follows:
data test1; input (col1-col3) ($); cards; kjkj jkjk ljknl kjcio doik roukg kjor ytiukk lkjf ruie eooik piijmf item1 item2 item3 1 2 3 4 5 6 ; data test2; input (col1-col3) ($); cards; kjkj jkjk ljknl kjcio doik roukg kjor ytiukk lkjf ruie eooik piijmf item1 item2 item3 7 8 9 2 2 2 ; data _null_; array namelist $32. col1-col3; set test1(firstobs=5 obs=5); length names $32000; do over namelist; names=catx(' ',names,vname(namelist),'=',vvalue(namelist)); end; call symput('renames',names); run; proc sql; create table dslist as select memname from dictionary.tables where libname='WORK' ; quit; data _null_; call execute ('data want; set ') ; do until (done); set dslist end=done; call execute (memname); call execute("(firstobs=6 rename=(&renames.))"); end; call execute ('; run;') ; stop; run;
Of course, that would have to be expanded if the problem is more complex.
Art, CEO, AnalystFinder.com
Yes! Instead of:
array namelist $32. col1-col3;
col1-col3 should be replaced with your actual variable names
Art, CEO, AnalystFinder.com
Ah, it is because the actual variable names in line 5 have spaces...
include the translate function to convert the spaces to underscores. e.g.:
data test1; input (col1-col3) (&$); cards; kjkj jkjk ljknl kjcio doik roukg kjor ytiukk lkjf ruie eooik piijmf item 1 item 2 item 3 1 2 3 4 5 6 ; data test2; input (col1-col3) (&$); cards; kjkj jkjk ljknl kjcio doik roukg kjor ytiukk lkjf ruie eooik piijmf item 1 item 2 item 3 7 8 9 2 2 2 ; data _null_; array namelist $32. col1-col3; set test1(firstobs=5 obs=5); length names $32000; do over namelist; names=catx(' ',names,vname(namelist),'=', translate(strip(vvalue(namelist)),'_',' ')); end; call symput('renames',names); run; %put &renames.; proc sql; create table dslist as select memname from dictionary.tables where libname='WORK' ; quit; data _null_; call execute ('data want; set ') ; do until (done); set dslist end=done; call execute (memname); call execute("(firstobs=6 rename=(&renames.))"); end; call execute ('; run;') ; stop; run;
Art, CEO, AnalystFinder.com
I can't see what you are looking at. You will have to show me the first 6 observations from one of the data sets for me to participate in this. (I wouldn't need all the variables, but 3 or 4 would be necessary.)
In general, it would not be a bad idea to do this as two separate steps ... combine all the data values as one step, and go back and rename variables as a second step. As Art indicated, it can be automated but I can't see what you are looking at.
This is a question the justifies use of global options firstobs=, as in:
PROC SQL;
SELECT distinct MEMNAME INTO : MEMNAMES SEPARATED BY ' ' from dictionary.tables where libname='WORK';
quit;
options firstobs=6;
DATA full;
SET &MEMNAMES.;
RUN;
options firstobs=1;
The firstobs=6 option will be applied to every data set subsequently being read in. After the successful append, reset option to its default (firstobs=1).
firstobs=6 is working but it is still labeling the columns based on some text in A1 and just default excel column letters B..C..D.... for the rest. firstobs=1 is not changing the column labels to variable names in row 5.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.