Help using Base SAS procedures

Append all csv file into one dataset

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 145
Accepted Solution

Append all csv file into one dataset

Hi,

I have folder with N number of CSV files with same structural format but with different names for the csv file.

I would like to append all csv files into one final dataset.

The folder is dynamic folder and new csv files arrive on day to day…So would like to put in a macro such that every time program runs it would append all the existing csv files in one dataset.

Thanks


Accepted Solutions
Solution
‎08-28-2014 02:10 PM
Trusted Advisor
Posts: 3,215

Re: Append all csv file into one dataset

Posted in reply to rakeshvvv

I don not like the proc import usage that way. lt is scanning first the data and than creating a datastep/input of data.
The result can be that with some creative input you are getting the sas-datasets in different structures.

In you datasetnaming you have session_1 batch_1 etc and they are separated by a _   Is the _ a variable separation or a level-naming separation.
As assuming it is replaced by real values I hope they will avoid the _ and the _ is being reliable as level indication.

The following code, is tested with two csv files. 

filename impcsv ' /folders/myshortcuts/ikke';

data orderlst (keep= x y z q order part eval) ;
  length filename filenmcr $250 ;    retain _nfl -1 eovfl  ;
  length z $3 q $1 order part eval $16   ; retain order part eval ; /* the targeted lays out */
  infile impcsv(12345*.csv) filename=filenmcr eov=eovfl eof=allfil end=endfl missover dsd dlm=";"  ;
  input @ ;
   if ( eovfl or _nfl <0 ) then link header;
   _nfl=_nfl+1;
   input x y z q ; 
   output ;
   if ( end ) then link tailer ;
  
return;

header:
    filename=filenmcr;
    put "-- " filename " --"; _nfl=1 ; eovfl=0;
    input  ; /* advance one line, ignoring the header */
    part=scan(filename,-6,"_");
    order=scan(filename,-4,"_"); 
    eval=scan(filename,-3,"_"); 
return;

tailer: 
    put "total records" part _nfl / ;
return; 
allfil: 
    put "processing done"l / ;
return; 
run;

proc print;
run;

NOTE: The infile library IMPCSV is:

NOTE: The infile IMPCSV(12345*.csv) is:

-- /folders/myshortcuts/ikke/12345_a_b_c_part-2_batch-1_session-1_reader-1_03aug12_dummy.csv  --

NOTE: The infile IMPCSV(12345*.csv) is:

-- /folders/myshortcuts/ikke/12345_a_b_c_part-2_batch-1_session-2_reader-2_03aug12_dummy.csv  --

processing done


       z    q                                        x   y

1UNENsession-1part-2reader-11560010101
2ENHNsession-1part-2reader-12220010102
3UNENsession-2part-2reader-21560010101
4ENHNsession-2part-2reader-22220010102
---->-- ja karman --<-----

View solution in original post


All Replies
Super User
Posts: 5,441

Re: Append all csv file into one dataset

Posted in reply to rakeshvvv

Ok, but what is your specific question?

You seems already be on the right path using macros. Give it a shot and get back when have some specific questions.

Data never sleeps
Frequent Contributor
Posts: 145

Re: Append all csv file into one dataset

I have CSV file with following name convention 12345_a_b_c_part_2_batch_1_session_1_reader_1_03aug12_dummy.csv

x;y;z;q----first row in csv---should be made column name

1;560010101;UNE;N----records or values for the above x,y,z,q variables

2;220010102;ENH;N----- records or values for the above x,y,z,q variables

I have to create dataset from CSV FILE…..part  of the name of the csv file should go variables like

batch_1= name of the dataset

ord=1(coming from session in the name of csv)

part=2(coming from part_2 in the name of csv)

eval=1((coming from reader_1 in the name of csv)

the dataset batch_ 1 should be like this

x        y                z           q     ord      part    eval

1   560010101       une       n       1         2          1

Frequent Contributor
Posts: 145

Re: Append all csv file into one dataset

Posted in reply to rakeshvvv

I have written parial code and would like to know

I would like to know how to run this macro for all the csv file files located in the folder. All csv file are of same structural format.

libname x 'library location;

option spool;

%let x=/by/data/ad/14607_randlist_2nd_read_part_2_batch_1_session_1_reader_1_03aug12_dummy.csv;

(location of folder)

%macro test(a=);

PROC IMPORT OUT=x.&a

DATAFILE="&x"

    DBMS=CSV REPLACE;

    delimiter = ";";

     GETNAMES=YES;

RUN;

data x.&a;

set x.&a;

a="&x";

part=substr(a,89,1);

order=substr(a,107,1);

eval=substr(a,116,1);

drop a;

run;

%mend;

%test(a=x);

Super User
Super User
Posts: 7,997

Re: Append all csv file into one dataset

Posted in reply to rakeshvvv

Does it have headings?  If not its straightforward:

x 'copy "s:\temp\rob\*.csv" "s:\temp\rob\total.csv"';

Then you program can import just one file (note I would put the above in a batch file and have a delete the total.csv before re-qriting it otherwise it will also get included).

So use the dos copy to create a complete file.

If it does have headings then its the same thing, you just need to either do a bit of scripting in the batch file, or post-process in your dataset.

Frequent Contributor
Posts: 145

Re: Append all csv file into one dataset

Hi RW9!

Can you give syntax how it work in sas....I coudnt do it in mine......

x 'copy "s:\temp\rob\*.csv" "s:\temp\rob\total.csv"';

Super User
Super User
Posts: 7,997

Re: Append all csv file into one dataset

Posted in reply to rakeshvvv

What couldn't you do?  The X syntax is base SAS.  It sends the command within quotes to the system for processing.  If you don't have admin rights to do that or X is locked out then just create a batch file outside of sas.

E.g.

In Windows explorer where your csv files are, right click create new text document.  Type in:

copy "s:\temp\rob\*.csv" "s:\temp\rob\total.csv"

Changing the paths obviously.

Now right click and rename it Job.bat.  Yes, extension of .bat.

When you double click on it a black screen will pop up - that is DOS - it will process each file and then close or wait to be closed.

Trusted Advisor
Posts: 3,215

Re: Append all csv file into one dataset

Posted in reply to rakeshvvv

No need for os commands sas is supporting wildcards  

filename impdir '/folders/myfolders' ;    /* define you input location */ *

data _null ;
  length filename filenmcr $250 ;
  retain _nfl 1 eovfl 1 ;
  infile impdir(*.csv) filename=filenmcr eov=eovfl eof=tailer end=endfl  ;   /* read all csv files */
  input   ............                                                      /*  your input statments for the csv file(s)  */
   ;  
   _nfl=_nfl+1;
   if ( eovfl ) then link header; eovfl=0;
return;

header:  /* just some processing to show wich csv files have been processed */
    filename=filenmcr;
    put "-- " filename " --"; _nfl=1 ;
return;
tailer: 
    put "total records" _nfl / ;
return; 
run;

---->-- ja karman --<-----
Solution
‎08-28-2014 02:10 PM
Trusted Advisor
Posts: 3,215

Re: Append all csv file into one dataset

Posted in reply to rakeshvvv

I don not like the proc import usage that way. lt is scanning first the data and than creating a datastep/input of data.
The result can be that with some creative input you are getting the sas-datasets in different structures.

In you datasetnaming you have session_1 batch_1 etc and they are separated by a _   Is the _ a variable separation or a level-naming separation.
As assuming it is replaced by real values I hope they will avoid the _ and the _ is being reliable as level indication.

The following code, is tested with two csv files. 

filename impcsv ' /folders/myshortcuts/ikke';

data orderlst (keep= x y z q order part eval) ;
  length filename filenmcr $250 ;    retain _nfl -1 eovfl  ;
  length z $3 q $1 order part eval $16   ; retain order part eval ; /* the targeted lays out */
  infile impcsv(12345*.csv) filename=filenmcr eov=eovfl eof=allfil end=endfl missover dsd dlm=";"  ;
  input @ ;
   if ( eovfl or _nfl <0 ) then link header;
   _nfl=_nfl+1;
   input x y z q ; 
   output ;
   if ( end ) then link tailer ;
  
return;

header:
    filename=filenmcr;
    put "-- " filename " --"; _nfl=1 ; eovfl=0;
    input  ; /* advance one line, ignoring the header */
    part=scan(filename,-6,"_");
    order=scan(filename,-4,"_"); 
    eval=scan(filename,-3,"_"); 
return;

tailer: 
    put "total records" part _nfl / ;
return; 
allfil: 
    put "processing done"l / ;
return; 
run;

proc print;
run;

NOTE: The infile library IMPCSV is:

NOTE: The infile IMPCSV(12345*.csv) is:

-- /folders/myshortcuts/ikke/12345_a_b_c_part-2_batch-1_session-1_reader-1_03aug12_dummy.csv  --

NOTE: The infile IMPCSV(12345*.csv) is:

-- /folders/myshortcuts/ikke/12345_a_b_c_part-2_batch-1_session-2_reader-2_03aug12_dummy.csv  --

processing done


       z    q                                        x   y

1UNENsession-1part-2reader-11560010101
2ENHNsession-1part-2reader-12220010102
3UNENsession-2part-2reader-21560010101
4ENHNsession-2part-2reader-22220010102
---->-- ja karman --<-----
Frequent Contributor
Posts: 145

Re: Append all csv file into one dataset

Perfect Sir...that worked perfect for me.

Thanks

Raghav

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 3293 views
  • 2 likes
  • 4 in conversation