BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
rakeshvvv
Quartz | Level 8

Hi,

I have folder with N number of CSV files with same structural format but with different names for the csv file.

I would like to append all csv files into one final dataset.

The folder is dynamic folder and new csv files arrive on day to day…So would like to put in a macro such that every time program runs it would append all the existing csv files in one dataset.

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
jakarman
Barite | Level 11

I don not like the proc import usage that way. lt is scanning first the data and than creating a datastep/input of data.
The result can be that with some creative input you are getting the sas-datasets in different structures.

In you datasetnaming you have session_1 batch_1 etc and they are separated by a _   Is the _ a variable separation or a level-naming separation.
As assuming it is replaced by real values I hope they will avoid the _ and the _ is being reliable as level indication.

The following code, is tested with two csv files. 

filename impcsv ' /folders/myshortcuts/ikke';

data orderlst (keep= x y z q order part eval) ;
  length filename filenmcr $250 ;    retain _nfl -1 eovfl  ;
  length z $3 q $1 order part eval $16   ; retain order part eval ; /* the targeted lays out */
  infile impcsv(12345*.csv) filename=filenmcr eov=eovfl eof=allfil end=endfl missover dsd dlm=";"  ;
  input @ ;
   if ( eovfl or _nfl <0 ) then link header;
   _nfl=_nfl+1;
   input x y z q ; 
   output ;
   if ( end ) then link tailer ;
  
return;

header:
    filename=filenmcr;
    put "-- " filename " --"; _nfl=1 ; eovfl=0;
    input  ; /* advance one line, ignoring the header */
    part=scan(filename,-6,"_");
    order=scan(filename,-4,"_"); 
    eval=scan(filename,-3,"_"); 
return;

tailer: 
    put "total records" part _nfl / ;
return; 
allfil: 
    put "processing done"l / ;
return; 
run;

proc print;
run;

NOTE: The infile library IMPCSV is:

NOTE: The infile IMPCSV(12345*.csv) is:

-- /folders/myshortcuts/ikke/12345_a_b_c_part-2_batch-1_session-1_reader-1_03aug12_dummy.csv  --

NOTE: The infile IMPCSV(12345*.csv) is:

-- /folders/myshortcuts/ikke/12345_a_b_c_part-2_batch-1_session-2_reader-2_03aug12_dummy.csv  --

processing done


       z    q                                        x   y

1UNENsession-1part-2reader-11560010101
2ENHNsession-1part-2reader-12220010102
3UNENsession-2part-2reader-21560010101
4ENHNsession-2part-2reader-22220010102
---->-- ja karman --<-----

View solution in original post

9 REPLIES 9
LinusH
Tourmaline | Level 20

Ok, but what is your specific question?

You seems already be on the right path using macros. Give it a shot and get back when have some specific questions.

Data never sleeps
rakeshvvv
Quartz | Level 8

I have CSV file with following name convention 12345_a_b_c_part_2_batch_1_session_1_reader_1_03aug12_dummy.csv

x;y;z;q----first row in csv---should be made column name

1;560010101;UNE;N----records or values for the above x,y,z,q variables

2;220010102;ENH;N----- records or values for the above x,y,z,q variables

I have to create dataset from CSV FILE…..part  of the name of the csv file should go variables like

batch_1= name of the dataset

ord=1(coming from session in the name of csv)

part=2(coming from part_2 in the name of csv)

eval=1((coming from reader_1 in the name of csv)

the dataset batch_ 1 should be like this

x        y                z           q     ord      part    eval

1   560010101       une       n       1         2          1

rakeshvvv
Quartz | Level 8

I have written parial code and would like to know

I would like to know how to run this macro for all the csv file files located in the folder. All csv file are of same structural format.

libname x 'library location;

option spool;

%let x=/by/data/ad/14607_randlist_2nd_read_part_2_batch_1_session_1_reader_1_03aug12_dummy.csv;

(location of folder)

%macro test(a=);

PROC IMPORT OUT=x.&a

DATAFILE="&x"

    DBMS=CSV REPLACE;

    delimiter = ";";

     GETNAMES=YES;

RUN;

data x.&a;

set x.&a;

a="&x";

part=substr(a,89,1);

order=substr(a,107,1);

eval=substr(a,116,1);

drop a;

run;

%mend;

%test(a=x);

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Does it have headings?  If not its straightforward:

x 'copy "s:\temp\rob\*.csv" "s:\temp\rob\total.csv"';

Then you program can import just one file (note I would put the above in a batch file and have a delete the total.csv before re-qriting it otherwise it will also get included).

So use the dos copy to create a complete file.

If it does have headings then its the same thing, you just need to either do a bit of scripting in the batch file, or post-process in your dataset.

rakeshvvv
Quartz | Level 8

Hi RW9!

Can you give syntax how it work in sas....I coudnt do it in mine......

x 'copy "s:\temp\rob\*.csv" "s:\temp\rob\total.csv"';

RW9
Diamond | Level 26 RW9
Diamond | Level 26

What couldn't you do?  The X syntax is base SAS.  It sends the command within quotes to the system for processing.  If you don't have admin rights to do that or X is locked out then just create a batch file outside of sas.

E.g.

In Windows explorer where your csv files are, right click create new text document.  Type in:

copy "s:\temp\rob\*.csv" "s:\temp\rob\total.csv"

Changing the paths obviously.

Now right click and rename it Job.bat.  Yes, extension of .bat.

When you double click on it a black screen will pop up - that is DOS - it will process each file and then close or wait to be closed.

jakarman
Barite | Level 11

No need for os commands sas is supporting wildcards  

filename impdir '/folders/myfolders' ;    /* define you input location */ *

data _null ;
  length filename filenmcr $250 ;
  retain _nfl 1 eovfl 1 ;
  infile impdir(*.csv) filename=filenmcr eov=eovfl eof=tailer end=endfl  ;   /* read all csv files */
  input   ............                                                      /*  your input statments for the csv file(s)  */
   ;  
   _nfl=_nfl+1;
   if ( eovfl ) then link header; eovfl=0;
return;

header:  /* just some processing to show wich csv files have been processed */
    filename=filenmcr;
    put "-- " filename " --"; _nfl=1 ;
return;
tailer: 
    put "total records" _nfl / ;
return; 
run;

---->-- ja karman --<-----
jakarman
Barite | Level 11

I don not like the proc import usage that way. lt is scanning first the data and than creating a datastep/input of data.
The result can be that with some creative input you are getting the sas-datasets in different structures.

In you datasetnaming you have session_1 batch_1 etc and they are separated by a _   Is the _ a variable separation or a level-naming separation.
As assuming it is replaced by real values I hope they will avoid the _ and the _ is being reliable as level indication.

The following code, is tested with two csv files. 

filename impcsv ' /folders/myshortcuts/ikke';

data orderlst (keep= x y z q order part eval) ;
  length filename filenmcr $250 ;    retain _nfl -1 eovfl  ;
  length z $3 q $1 order part eval $16   ; retain order part eval ; /* the targeted lays out */
  infile impcsv(12345*.csv) filename=filenmcr eov=eovfl eof=allfil end=endfl missover dsd dlm=";"  ;
  input @ ;
   if ( eovfl or _nfl <0 ) then link header;
   _nfl=_nfl+1;
   input x y z q ; 
   output ;
   if ( end ) then link tailer ;
  
return;

header:
    filename=filenmcr;
    put "-- " filename " --"; _nfl=1 ; eovfl=0;
    input  ; /* advance one line, ignoring the header */
    part=scan(filename,-6,"_");
    order=scan(filename,-4,"_"); 
    eval=scan(filename,-3,"_"); 
return;

tailer: 
    put "total records" part _nfl / ;
return; 
allfil: 
    put "processing done"l / ;
return; 
run;

proc print;
run;

NOTE: The infile library IMPCSV is:

NOTE: The infile IMPCSV(12345*.csv) is:

-- /folders/myshortcuts/ikke/12345_a_b_c_part-2_batch-1_session-1_reader-1_03aug12_dummy.csv  --

NOTE: The infile IMPCSV(12345*.csv) is:

-- /folders/myshortcuts/ikke/12345_a_b_c_part-2_batch-1_session-2_reader-2_03aug12_dummy.csv  --

processing done


       z    q                                        x   y

1UNENsession-1part-2reader-11560010101
2ENHNsession-1part-2reader-12220010102
3UNENsession-2part-2reader-21560010101
4ENHNsession-2part-2reader-22220010102
---->-- ja karman --<-----
rakeshvvv
Quartz | Level 8

Perfect Sir...that worked perfect for me.

Thanks

Raghav

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 6433 views
  • 2 likes
  • 4 in conversation