BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Basically I am trying to create a dataset that contains a little over 300 text files. The file names range from hclr1.txt to hclr314.txt. They all have the same variables and are aligned neatly in columns. Is there a way to read in all these files using a do loop and macro to read all files rather than making an infile statement from each and merging later?? Any help would be appreciated. Thanks
17 REPLIES 17
deleted_user
Not applicable
Some of these txt files contain the variable day with monday wednesday friday and others contain monday tuesday wednesday thursday and friday. I am not sure how this will affect the data when it is merged??
LinusH
Tourmaline | Level 20
Yes you may. Just use the loop macro variable in the filename. After reading the file , just PROC APPEND it to a common table.

I'm not sure why you are concerned about the day variable. It will just a ordinary column in the result table...?

/Linus
Data never sleeps
deleted_user
Not applicable
I am going to end up sorting by day of th week. I tried merging the data (I just tried 3 files at the moment) and when I merged the files with Monday-Friday with the Monday Wednesday Friday it seems to have messed up.

An example of my data is:

Monday 1 14 3 79 14.07
Wednesday .....
Friday .....

merged with:

Monday .....
Tuesday .....
Wednesday ....
Thursday ....
Friday .....

The problem is that when I merge a dataset that does not contain a Tuesday, all the data from wednesday moves to tuesday and friday is left blank.

As for the macro, I have little knowledge or expierience working with them so really have no clue on the specific language needed to run properly.

Once again thanks for any help.
Cynthia_sas
SAS Super FREQ
Hi:
Just to clarify...there is a difference between MERGING datasets and CONCATENATING or APPENDING datasets.

Here's an example....you have a dataset for every month: WORK.JAN, WORK.FEB, WORK.MAR, etc and the data looks like this:
[pre]
file WORK.JAN
day sale month
01 101 01
02 102 01
03 103 01

file WORK.FEB
day sale month
01 201 02
02 202 02
03 203 02

file WORK.MAR
day sale month
01 301 03
02 302 03
03 303 03
[/pre]

and you want to "stack" or "append" all the files together so that you have an entire year. That syntax would be:
[pre]
data allyear;
set work.jan work.feb work.mar work.apr work.may work.jun
work.jul work.aug work.sep work.oct work.nov work.dec;
run;
[/pre]

(or you could use PROC APPEND or PROC SQL,) but for now, let's stick with a data step program. The output in WORK.ALLYEAR would now look like this (only a few obs shown for each month):
[pre]
file WORK.ALLYEAR
day sale month
01 101 01
02 102 01
03 103 01
01 201 02
02 202 02
03 203 02
01 301 03
02 302 03
03 303 03
[/pre]

Note how the data are "stacked" one month on top of the next month and so on. This would be concatenating or appending.


Or, you could join the data "horizontally". Consider this data for every month (only a few obs are shown) the day variable identifies the day of the month the sale variable for each month, for ease of the example, is named jansale, febsale, marsale, etc:

[pre]
file WORK.JAN
day jansale
01 101
02 102
03 103

file WORK.FEB
day febsale
01 201
02 202
03 203

file WORK.MAR
day marsale
01 301
02 302
03 303
[/pre]

If the data were joined horizontally, you would get this:
[pre]

file WORK.ALLYEAR
day jansale febsale marsale
01 101 201 301
02 102 202 302
03 103 203 303
.... more observations ....
25 125 225 325
26 126 226 326
27 127 227 327
28 128 228 328
29 129 . 329
30 130 . 330
31 131 . 331

[/pre]

Note how the end of the month shows missing values for days 29, 30 and 31 for Feb (because Feb did not have any sales on those days).

The code that created the above output is shown below:
[pre]

data allyear;
merge jan(in=injan)
feb(in=infeb)
mar(in=inmar);
by day;
run;

proc print data=allyear;
title 'after merge';
run;
[/pre]

If you are appending, then not having Tuesday in one file should not make a difference. If you are merging, using a data step, then you should merge with a BY statement and if you are joining with PROC SQL, then you have to be careful with the type of join you code so you get the right combination of rows and columns in the output file.

cynthia
deleted_user
Not applicable
Thank you, that was very helpful.

I am still confused on how to import the 300 txt files though. I currently have a macro in which I have n=0 and then n=&n + 1 where i have my file names hclr&n.txt, I don't know if this is possible and it doesn't seem to be working. I just need it to read in my files and each has the same variable name and each file is just one number higher than the last.
data_null__
Jade | Level 19
I suggest you look at the documentation for the INFILE statement. You will want to pay particular attention to the FILEVAR option. You will need read about the INPUT statement. See the following thread for and example I posted using INFILE and FILEVAR etc.

[pre]
http://support.sas.com/forums/thread.jspa?messageID=18256䝐
[/pre]
LinusH
Tourmaline | Level 20
It seems that you need to acquire some basic skills in macro programming. I suggest that you take a class in this subject. It's available as a e-course:

http://www.sas.com/apps/elearning/elearning_courses.jsp?cat=SAS+Advanced+Programming

or as a traditional instructor lead one at your nearest SAS office.

Regards,
Linus
Data never sleeps
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
If the files are all structured reasonably consistent, at least where you can code one (or more as needed) INPUT statements, why not use the FILENAME and generate the list of files (if each is not too large to process as a single input stream) so that the files are treated as a logical concatenation with a DATA step. The list of files can easily be generated by a MACRO %DO %END loop, incrementing you file-suffix count. Here is the FILENAME syntax, abbreviated:

FILENAME INDATA ("file1.txt" "file2.txt" "file3.txt");

Scott Barry
SBBWorks, Inc.
deleted_user
Not applicable
OK, well this is my code I have so far:

%MACRO workdata;
%DO a=1 %TO 314;
DATA hclr&a;
INFILE 'C:users\work\documents\work\hclr&a..txt';
INFORMAT ..... ;
INPUT ....;
RUN;

PROC SORT DATA=hclr&a;
BY .....;
RUN;

%END;
%MEND workdata;

%workdata;

RUN;

The error is that it says the files can't be found but I thought the hclr&a..txt code should have continously replace &a with the numbers 1 through 314?? This code I thought would work to create 314 data sets that I could later merge. Anyone see any problems with this macro? I am using the right infile statement because it does work when I replace &a with a number. Thanks for any help.
Cynthia_sas
SAS Super FREQ
Hi:
Try these 2 title statements and see which one works -- &SYSDATE is an automatic macro variable:
[pre]
title1 'Run on &sysdate';
title2 "Run on &sysdate";
[/pre]

cynthia
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Suggest you add the following statement to your program in order to generate as much diagnostic information as possible in your SAS log output - the second statement (or both of them, as preferred) can be commented out when you have a working macro:

OPTIONS SOURCE SOURCE2;
OPTIONS MACROGEN SYMBOLGEN MLOGIC;

The additional SAS compilation information generated will help you self-diagnose the SAS problem, when using macro language.

Scott Barry
SBBWorks, Inc.
deleted_user
Not applicable
It gave me more information but it still says physical file does not exist. For some reason it is not replacing the &a..txt with the new number of the do loop.
deleted_user
Not applicable
I figured it out. It needed " " instead of ' ' around the infile statement. Thanks for all the Help!!
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
You should be able to follow the SAS code compilation and generate SAS statements from your macro. If unable to identify the problem cause, you need to post a reply with your SAS log information (COPY / PASTE) into your reply. Key information is "where your macro variable is assigned" and "where does SAS resolve the code using the macro variable". SAS will reveal the macro variable assignments -- obviously the lines must use an ampersand where referenced.

Scott Barry
SBBWorks, Inc.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 17 replies
  • 3423 views
  • 0 likes
  • 6 in conversation