I want to append all the datasets I have in my WORK library together. They all have the same column names, but they start on row 6. I have too many datasets to delete first 5 rows with a macro. Is there a good shortcut I can use for this, similar to the code I use to append them to 1 dataset:
PROC SQL;
SELECT distinct MEMNAME INTO : MEMNAMES SEPARATED BY ' ' from dictionary.tables where libname='WORK';
quit;
DATA full;
SET &MEMNAMES.;
RUN;
include the translate function to convert the spaces to underscores. e.g.:
data test1; input (col1-col3) (&$); cards; kjkj jkjk ljknl kjcio doik roukg kjor ytiukk lkjf ruie eooik piijmf item 1 item 2 item 3 1 2 3 4 5 6 ; data test2; input (col1-col3) (&$); cards; kjkj jkjk ljknl kjcio doik roukg kjor ytiukk lkjf ruie eooik piijmf item 1 item 2 item 3 7 8 9 2 2 2 ; data _null_; array namelist $32. col1-col3; set test1(firstobs=5 obs=5); length names $32000; do over namelist; names=catx(' ',names,vname(namelist),'=', translate(strip(vvalue(namelist)),'_',' ')); end; call symput('renames',names); run; %put &renames.; proc sql; create table dslist as select memname from dictionary.tables where libname='WORK' ; quit; data _null_; call execute ('data want; set ') ; do until (done); set dslist end=done; call execute (memname); call execute("(firstobs=6 rename=(&renames.))"); end; call execute ('; run;') ; stop; run;
Art, CEO, AnalystFinder.com
Sure. Let's start with something similar to what you used before:
proc sql;
create table dslist as
select memname from dictionary.tables
where libname='WORK';
quit;
Instead of creating a macro variable, this creates a data set holding the names of your data sets. Then construct and execute the rest of the program using:
data _null_;
call execute ('data want; set ') ;
do until (done);
set dslist end=done;
call execute (memname);
call execute('(firstobs=6)');
end;
call execute ('; run;') ;
stop;
run;
Thank you - almost got it. It is selecting the first row as the variable names. Can I edit this to select row 5 as variable names (firstobs is still at 6)?
If all of your files have the same variables, you could just modify @Astounding's code as follows:
data test1; input (col1-col3) ($); cards; kjkj jkjk ljknl kjcio doik roukg kjor ytiukk lkjf ruie eooik piijmf item1 item2 item3 1 2 3 4 5 6 ; data test2; input (col1-col3) ($); cards; kjkj jkjk ljknl kjcio doik roukg kjor ytiukk lkjf ruie eooik piijmf item1 item2 item3 7 8 9 2 2 2 ; data _null_; array namelist $32. col1-col3; set test1(firstobs=5 obs=5); length names $32000; do over namelist; names=catx(' ',names,vname(namelist),'=',vvalue(namelist)); end; call symput('renames',names); run; proc sql; create table dslist as select memname from dictionary.tables where libname='WORK' ; quit; data _null_; call execute ('data want; set ') ; do until (done); set dslist end=done; call execute (memname); call execute("(firstobs=6 rename=(&renames.))"); end; call execute ('; run;') ; stop; run;
Of course, that would have to be expanded if the problem is more complex.
Art, CEO, AnalystFinder.com
Yes! Instead of:
array namelist $32. col1-col3;
col1-col3 should be replaced with your actual variable names
Art, CEO, AnalystFinder.com
Ah, it is because the actual variable names in line 5 have spaces...
include the translate function to convert the spaces to underscores. e.g.:
data test1; input (col1-col3) (&$); cards; kjkj jkjk ljknl kjcio doik roukg kjor ytiukk lkjf ruie eooik piijmf item 1 item 2 item 3 1 2 3 4 5 6 ; data test2; input (col1-col3) (&$); cards; kjkj jkjk ljknl kjcio doik roukg kjor ytiukk lkjf ruie eooik piijmf item 1 item 2 item 3 7 8 9 2 2 2 ; data _null_; array namelist $32. col1-col3; set test1(firstobs=5 obs=5); length names $32000; do over namelist; names=catx(' ',names,vname(namelist),'=', translate(strip(vvalue(namelist)),'_',' ')); end; call symput('renames',names); run; %put &renames.; proc sql; create table dslist as select memname from dictionary.tables where libname='WORK' ; quit; data _null_; call execute ('data want; set ') ; do until (done); set dslist end=done; call execute (memname); call execute("(firstobs=6 rename=(&renames.))"); end; call execute ('; run;') ; stop; run;
Art, CEO, AnalystFinder.com
I can't see what you are looking at. You will have to show me the first 6 observations from one of the data sets for me to participate in this. (I wouldn't need all the variables, but 3 or 4 would be necessary.)
In general, it would not be a bad idea to do this as two separate steps ... combine all the data values as one step, and go back and rename variables as a second step. As Art indicated, it can be automated but I can't see what you are looking at.
This is a question the justifies use of global options firstobs=, as in:
PROC SQL;
SELECT distinct MEMNAME INTO : MEMNAMES SEPARATED BY ' ' from dictionary.tables where libname='WORK';
quit;
options firstobs=6;
DATA full;
SET &MEMNAMES.;
RUN;
options firstobs=1;
The firstobs=6 option will be applied to every data set subsequently being read in. After the successful append, reset option to its default (firstobs=1).
firstobs=6 is working but it is still labeling the columns based on some text in A1 and just default excel column letters B..C..D.... for the rest. firstobs=1 is not changing the column labels to variable names in row 5.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.