BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
AZFXL
Calcite | Level 5

I have multiple textfiles with sequentially incremental names, say test_1, test_2, ..., test_1000, each of them have exactly the same format (i.e. number of columns and delimited by comma). My intention is to append all these textfiles into one SAS database following the sequence of their filename. However, with the coding that I am currently using (as attached below), despite it will append all the textfiles, but the appending sequence is not as per my expectation (i.e, it will append test_1,test_10,...,test_19,test_100,...,test_199,test_1000,test_2,test_20,...,test_29,test_200,...,test_299,test_3,test_30,...,test_39,test_300,...,test_399,... instead of test_1, test_2, test_3, ..., test_1000). 

 

As such, I am trying to explore the feasibility of using SAS macro to get the intended result. Appreciate if I could get any advice from the community to start off with the SAS macro, thank you very much in advance. 

 

data libname.test;
	infile "&dlpath.\test_*.txt"
		delimiter=','
		missover
		firstobs=1
		DSD
		lrecl = 32767;
	format column1 $12.;
	format column2 yymmdds10.;
	format column3 best12.;
	input
		column1 $
		column2 :yymmdd8.
		column3;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

So you are saying you want sort by the numeric suffix on the filename instead of sorting on the full filename?

Do the files already have a variable that contains that number from the filename?  Or do you need to create it?

 

There should not be any need for macro coding to solve this, since there is no need to generate code.

 

You can use the FILENAME= option to get the name of the file currently being read.  You could then extract the number from that.

data libname.test;
  length fname $256;
  infile "&dlpath.\test_*.txt" dsd truncover filename=fname;
  length filenum row 8 column1 $12 column2 8 column3 8;
  format column2 yymmdds10.;
  informat column2 yymmdd.;
  input column1 -- column3;
  if fname ne lag(fname) then row=0;
  row+1;
  filenum=input(scan(fname,-2,'._'),32.);
run;

proc sort data=libname.test ;
  by filenum row;
run;

 

Or you could generate a list of file names and use the FILEVAR= option of the INFILE statement instead.  Make sure to use the END= option also.  So if you had a dataset named FILELIST with a variable named FILENAME that had the files in the order you want to read them then just use a data step like this to read all of the files in that order.

data libname.test;
  set filelist;
  infile TXT filevar=filename dsd truncover end=eof;
  do while not(eof);
      ..... Use the same code to define variables and INPUT one line ....
     output;
  end;
run;

 

View solution in original post

3 REPLIES 3
PaigeMiller
Diamond | Level 26

Can't you just sort the information by file name after SAS does all of this appending? You can use the FILENAME= option of the INFILE statement to create a variable in the SAS data set with the name of the original file.

--
Paige Miller
Tom
Super User Tom
Super User

So you are saying you want sort by the numeric suffix on the filename instead of sorting on the full filename?

Do the files already have a variable that contains that number from the filename?  Or do you need to create it?

 

There should not be any need for macro coding to solve this, since there is no need to generate code.

 

You can use the FILENAME= option to get the name of the file currently being read.  You could then extract the number from that.

data libname.test;
  length fname $256;
  infile "&dlpath.\test_*.txt" dsd truncover filename=fname;
  length filenum row 8 column1 $12 column2 8 column3 8;
  format column2 yymmdds10.;
  informat column2 yymmdd.;
  input column1 -- column3;
  if fname ne lag(fname) then row=0;
  row+1;
  filenum=input(scan(fname,-2,'._'),32.);
run;

proc sort data=libname.test ;
  by filenum row;
run;

 

Or you could generate a list of file names and use the FILEVAR= option of the INFILE statement instead.  Make sure to use the END= option also.  So if you had a dataset named FILELIST with a variable named FILENAME that had the files in the order you want to read them then just use a data step like this to read all of the files in that order.

data libname.test;
  set filelist;
  infile TXT filevar=filename dsd truncover end=eof;
  do while not(eof);
      ..... Use the same code to define variables and INPUT one line ....
     output;
  end;
run;

 

Reeza
Super User

What about sorting the file after the fact instead?

The filename variable is character so that's a character ordering, test_1, test_100. 

You can capture the filename in a variable and then extract the number. Then sort by that number.

 

 

data libname.test;
	infile "&dlpath.\test_*.txt"
		delimiter=','
		missover
		firstobs=1
		DSD
		lrecl = 32767 filename=filevar;
	format column1 $12.;
	format column2 yymmdds10.;
	format column3 best12.;
       file_name=filevar;
      *this may need to be modified to capture the number based on the full path;
       file_num=input(compress(file_name, , 'kd'), 8.);
	input
		column1 $
		column2 :yymmdd8.
		column3;
run;

proc sort data=libname.test;
by file_num column1 column2 column3;
run;

@AZFXL wrote:

I have multiple textfiles with sequentially incremental names, say test_1, test_2, ..., test_1000, each of them have exactly the same format (i.e. number of columns and delimited by comma). My intention is to append all these textfiles into one SAS database following the sequence of their filename. However, with the coding that I am currently using (as attached below), despite it will append all the textfiles, but the appending sequence is not as per my expectation (i.e, it will append test_1,test_10,...,test_19,test_100,...,test_199,test_1000,test_2,test_20,...,test_29,test_200,...,test_299,test_3,test_30,...,test_39,test_300,...,test_399,... instead of test_1, test_2, test_3, ..., test_1000). 

 

As such, I am trying to explore the feasibility of using SAS macro to get the intended result. Appreciate if I could get any advice from the community to start off with the SAS macro, thank you very much in advance. 

 

data libname.test;
	infile "&dlpath.\test_*.txt"
		delimiter=','
		missover
		firstobs=1
		DSD
		lrecl = 32767;
	format column1 $12.;
	format column2 yymmdds10.;
	format column3 best12.;
	input
		column1 $
		column2 :yymmdd8.
		column3;
run;

 


 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 774 views
  • 2 likes
  • 4 in conversation