BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

Dear All,

 

I need to import many text files into sas, i found a sas code at support SAS: http://support.sas.com/kb/41/880.html  and modified it. But i still have some problems getting my desired result.  

 

(I changed the dirctory in the code into my dir)

 

1.

filename DIRLIST pipe 'dir "C:\Users\*.txt" ';

 

data dirlist ;

infile dirlist lrecl=200 truncover;

input line $200.;

if input(substr(line,1,10), ?? mmddyy10.) = . then delete;

length file_name $ 100;

file_name="c:\Users\"||scan(line,-1," ");

keep file_name;

 

data _null_;

set dirlist end=end;

count+1;

call symput('read'||left(count),file_name);

call symput('dset'||left(count),substr(file_name,118,7));

if end then call symput('max',count);

run;

 

%put &read;

log: WARNING: Apparent symbolic reference READ not resolved.

%put &dset;

log: WARNING: Apparent symbolic reference DSET not resolved.

%put &max;

 

I got the correct dirlist data file, but the two macro variables are always empty.

 

 

 

2. I manully typed in the drictory of last data file and its data name into macro variables read and dset and tested for the following macro:

 

%macro readin;

%local i;

%do i=1 %to 1;

 

proc import datafile="&&read&i"

out=%scan(&dset,&i) dbms=dlm replace;

delimiter='|';

getnames=yes;

run;

 

%end;

%mend readin;

 

%readin;

 

 

I think there is something wrong with the satement: datafile="&&read&i" , cause, SAS read the first data file in that folder, while i assigned info of  the last dat file to the macro variabels.

 

Thank you for your help!!

 

Best wishes.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

1)  

In your code you wrote:

 

call symput('read'||left(count),file_name);

call symput('dset'||left(count),substr(file_name,118,7));

 

it means that your macro variables will be:

    &read1 , &read2 , ...

    &dset1 , &dset2 , ...

 

Yout wrote:

   %put &read;

   %put &dset

but there are no macro variables &read &dset  but &read1 &dset1 etc.

 

2)

there is no error in:  proc import datafile="&&read&i"

 

I do see error in:   out=%scan(&dset,&i) dbms=dlm replace;

I suppose you meant to write:  out=%scan(&&dset.&i) dbms=dlm replace; 

 

 

 

 

View solution in original post

11 REPLIES 11
Reeza
Super User

If your datasets have the same format you can use the method outline here:

 

https://communities.sas.com/t5/SAS-Communities-Library/How-do-I-write-a-macro-to-import-multiple-tex...

Shmuel
Garnet | Level 18

1)  

In your code you wrote:

 

call symput('read'||left(count),file_name);

call symput('dset'||left(count),substr(file_name,118,7));

 

it means that your macro variables will be:

    &read1 , &read2 , ...

    &dset1 , &dset2 , ...

 

Yout wrote:

   %put &read;

   %put &dset

but there are no macro variables &read &dset  but &read1 &dset1 etc.

 

2)

there is no error in:  proc import datafile="&&read&i"

 

I do see error in:   out=%scan(&dset,&i) dbms=dlm replace;

I suppose you meant to write:  out=%scan(&&dset.&i) dbms=dlm replace; 

 

 

 

 

Xiaoningdemao
Quartz | Level 8
Dear Shmuel,

Thank you! That answers my question.

Best wishes.
Xiaoningdemao
Quartz | Level 8
you are right, it should be out=&&dset&i
set_all__
Fluorite | Level 6

The initial part of the code creates multiple macro variables: read1,read2.. and dset1,dset2.... and max.

 

So for the %put statement, you should rather list all read<n> and dset<n> macro variables:

%put &read1;
%put &dset1;
%put &max;

Or you can use this to list all user defined macro variables: 

%put _user_;

 

For the macro readin, it is meant to read multiple files. But it seems like you have limited it to read only one file, so it is reading only one file. You can change the macro readin as below:

 

 

%do i=1 to &max.;

 

instead of 

%do i=1 to 1;

 

 

This should make the code read all the .txt files in the folder.

Xiaoningdemao
Quartz | Level 8
Dear set_all_,

Thanks for your quick response! It helps me understand the macros I created.

Best wishes.
ballardw
Super User

You should include a large GUESSINGROWS if you must use proc import. If a column has mixed values that might be considered numeric such as Zip code or account numbers that may contain other characters such as dashes or letters then the type of value, character or numeric may be guessed at incorrectly. This may become much more of a problem if you expect Zip to be character but is numeric in some and character in others.

 

Additionally if any of the files are supposed to be of the same layout and content you may end up with different lengths for character varaibles and the potential mismatches of variable types means that when you go to combine any of them for other purposes you will get errors or possibly truncated data.

Xiaoningdemao
Quartz | Level 8
Dear ballardw,
Thank you very much for the dead up!! I modified my code according to you suggestion.

Best wishes.
Reeza
Super User
%macro readin;
%local i;
%do i=1 %to &max;
 
proc import datafile="&&read&i"
out=&&dset&i dbms=dlm replace;
delimiter='|';
getnames=yes;
guessingrows=20000;
run;
 
%end;
%mend readin;
 
%readin;

In general it's better to write your own data step to read the data. Issues with methodology include:

 

Will you continue to require macro processing to process each file? Or do you plan to combine this in the future?

Will you ever need to know the source file for the records after combining, that isn't being stored anywhere right now.

If you use proc import you can get different types for variables (numeric vs character) which means you can't append data until its fixed. 

If someone else with a different version of SAS runs this, they may not get the same results as Proc Import changes a bit over time. 

 

My nickel of opinion.

 

 

 

Xiaoningdemao
Quartz | Level 8
Dear Rezza,

Thank you for remind me the potential problems! YEs you are right. But the data I imported here are from one huge file, just stored in sub files, so I don't think I will have problems analyzing them in future.

Thanks again!

Best wishes.
ballardw
Super User

Since you apparently stuck all of this output in your WORK directory I provide any quick code but you really should examine at least a subset of the created data sets for mismatched lengths and variable types.

 

If you had put them all in one location it would be easy to check using SASHelp.vcolumns and limited a summary to that library. Since there are very likely other sets in WORK it wouldn't be easy to exclude them.

 

I'll put a small stack of quarters up that at least one variable has different lengths in different data sets.

I would not be surprised to find a different variable name in one and finding several with different types would not surprise me.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 5067 views
  • 5 likes
  • 5 in conversation