BookmarkSubscribeRSS Feed

Hi, I need to import multiple CSV''s. I have 22 folders named 201409 201412 201502 201503 etc, and within each folder there are 2 x csv's named like this (but year and month will differ)

Pool_Code_C_201409

Pool_Code_S_201409

 

Both csv's have the same variables - Mort, acc, sys, leg, pool

 

Can you please advise how I would go about importing these in with the condition - where pool ne blank

 

I want to create an individual dataset for each month passed in and NOT combine them togther.

 

Thanks

8 REPLIES 8
Shmuel
Garnet | Level 18

You need few steps:

1) create a code to run on just one specific file and check it to run as expected

2) enclose that program into a macro program geting the input file name as 1st argument

    and libref+dataset name of output as 2nd argument -
    adapting the program code to use the arguments as macro variables, instead the fixed names.

3) create a datastep to create 22*2 macro variables as combination of:

         file_name = cat(path,folder,filename,suffix);

    you can do it in a loop using arrays of: folders and of  file prefixes

4) create a macro program to loop across the 44 filenames in a loop

     calling previous macro, each time with appropriate arguments

     to create desired output.

 

After having your code, in case of having still difficulties come back to the forum.

    

I haven't tried to do this before so I have pieced this togther from Google.

 

I tried this but doesn't work -

 

%Macro ReadCSV (infile , outfile );

PROC IMPORT OUT= &ofile

DATAFILE= &infile

DBMS=CSV REPLACE;

GETNAMES=YES;

RUN;

%Mend ReadCSV;

%ReadCSV("O:Rep/Mort/EnData/201609/Pool_Code_C_201609.csv", work.ofile1);

%ReadCSV("O:Rep/Mort/EnData/201609/Pool_Code_S_201609.csv", work.ofile);

%ReadCSV("O:Rep/Mort/EnData/201608/Pool_Code_C_201608.csv", work.ofile1);

%ReadCSV("O:Rep/Mort/EnData/201608/Pool_Code_S_201608.csv", work.ofile);

 

I get this error messgage -

 

MPRINT(READCSV): PROC IMPORT OUT= &ofile DATAFILE= "O:Rep/Mort/EnData/201608/Pool_Code_S_201608.csv" DBMS=CSV REPLACE;

NOTE: The SAS System stopped processing this step because of errors.

MPRINT(READCSV): GETNAMES=YES;

MPRINT(READCSV): DATAROW=2;

MPRINT(READCSV): RUN;

MLOGIC(READCSV): Ending execution.

ERROR 22-322: Expecting a name.

ERROR 200-322: The symbol is not recognized and will be ignored.

28

29 GOPTIONS NOACCESSIBLE;

30 %LET _CLIENTTASKLABEL=;

31 %LET _CLIENTPROJECTPATH=;

32 %LET _CLIENTPROJECTNAME=;

33 %LET _SASPROGRAMFILE=;

34

35 ;*';*";*/;quit;run;

36 ODS _ALL_ CLOSE;

37

38

39 QUIT; RUN;

40

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

In your code, you use a macro variable &ofile. - this is never described, do you mean outfile?  Also couple of tips, don't write code all in upper case, it is offensive to the eye.  Finish your macro variables with a full stop, they highlight nicer in the editor.

As for your problem, simplest ideda would be to do an OS dir command into a dataset, then call execute() to either call your import code or exeucte the code:

data _null_;
  set dirlist;
  call execute(cats('%Imp (infile=',file,',outfile=',outfile,');'));
run;

Leaving now so can't elaborate more.

Shmuel
Garnet | Level 18

in your macro program change a line,

  from: PROC IMPORT OUT= &ofile

   to     PROC IMPORT OUT= &outfile

 

so you have steps 1 and 2 from my previous post.

 

Sometimes it is easier to duplicate lines and make light changes instead

developing a mcro with a loop, and it is OK from my point of view.

ballardw
Super User

Please note @Shmuel's instruction about the DATASTEP.

If you use multiple Proc import statements you will almost certainly get text variables of different lengths and depending upon your actual data could have some variables change from numeric to text between files. For instance a Zip code that is all 5 digit values in one file may be imported as numeric but the next file with some Zip+4 codes, with a - in the middle, will be read as a 10 character text field.

So when you go to process the data or combine files you end up with incompatible data. Date fields may also be problematic depending on what program build your CSV files. I have some I deal with that create quoted date fields like "10/16/2016" that proc import treats as character, informat $13.

 

Also you look at the data step generated by proc import to see if the data types match your expectations and if the text variables are long enough to hold the longest expected value.

Then in the data step you have something like:

 

data setname;

    infile "filepath\name.csv" <infile options>.

<rest of data step>.

 

Once you have a datastep that works you use that in your macro and have the SETNAME and file information as the macro variables.

Astounding
PROC Star

Brandon,

 

While the comments you have received are on the money, I'm not sure you know what to do at this point.  Here are some "next steps".

 

First, fix the macro.  Wherever it refers to &ofile, change that to &outfile. 

 

Second, test it.  Make sure you know:

 

  • What is its output?  Does it contain what you need, or are further changes required?
  • If the output file already exists when the macro runs, what will happen to it?  (error message? replace? append?)

That won't form a full solution, but it will at least let you participate in some of the more complex steps going forward.

Ksharp
Super User
Firstly, use OS command to get these file list , and then call execute() or data step to import them.
Suppose those 22 folders are all under TEMP folder.

filename xx pipe 'dir c:\temp\*.csv  /s /b';
data _null_;
 infile xx;
input;
put _infile_;
call execute();
run;

PeterKellberg
Obsidian | Level 7

This could do the trick - basefolder is the root folder for the folders containing the csv files.

%macro importcsv(startyear=,endyear=,basefolder=);
   %local year month;
   %do year=&startyear %to &endyear;
      %do month=1 %to 12;
         %if &month<10 %then /* fix 01 02 etc */
            %let month=0&month;
            proc import out=sas_&year&month.(where=(not missing(pool))) 
               datafile="&basefolder.Pool_Code_C_&year&month..csv"
               dbms=csv replace;
               getnames=yes;
            run;
            proc import out=sas_&year&month.(where=(not missing(pool))) 
               datafile="&basefolder.Pool_Code_S_&year&month..csv"
               dbms=csv replace;
               getnames=yes;
            run;
      %end;
   %end;
%mend;
options mprint;
%importcsv(startyear=2014,endyear=2016,basefolder=h:\csvfiles\);

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 2444 views
  • 0 likes
  • 7 in conversation