BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mmhxc5
Quartz | Level 8

Hi,

I would like to post my question again, because the last time I did not get a satisfactry answer.

 

I would like to add variable YEAR from 1992 to 2017 based on the filename for each file available in comma delimited .txt type.  The file names are  ID1992, ID1993, ..., ID2017,and export it back as its original file, that is .txt.

 

Each file has 137 variables and more than 50,000 rows of data - character and numeric. I don't want to interfere within the files, but just want to add another variable YEAR for each row of each file. 

 

I appreciate your time and help.

 

Thanks,

1 ACCEPTED SOLUTION

Accepted Solutions
VDD
Ammonite | Level 13 VDD
Ammonite | Level 13
%macro runner(yy=);
	proc import datafile="E:\Idaho\ID&yy..txt"
		dbms=dlm out=work.test;
		delimiter =",";
		getnames=yes;
		guessingrows=100000;
	run;
	data Work.ID&yy.ay;
		set test;
		Inspection_year=&yy.;
	run;
	proc export data=Work.ID&yy.ay replace
		outfile = "E:\Ad_Year\ID&yy.ay"
/* or do you want the output file to be Text
use this
Outfile = "E:\Ad_Year\ID&yy.ay.txt"
*/ dbms=dlm; delimiter=','; run;
proc datasets lib=work;
delete test;
quit; %mend runner; %runner(yy=1992); %runner(yy=1993); . . %runner(yy=2017);

Thank you @Reeza for the eagle eye added a delete. 

View solution in original post

21 REPLIES 21
Reeza
Super User
Show the code that you can use to successfully read a single file. Then that code can be modified to accomplish this. After reviewing your previous question, I believe this would be enough information to solve your problem.
mmhxc5
Quartz | Level 8

@Reezathank you!

Here is my code that is doing the job one by one. I want this code to be automated to all at once.

proc import datafile='E:\Idaho\ID2017.txt'
dbms=dlm out=work.test;
delimiter =",";
getnames=yes;
guessingrows=100000;
run;
data Work.ID2017ay;
set test;
Inspection_year=2017;
run;
proc export data=Work.ID2017ay replace
outfile = 'E:\Ad_Year\ID2017ay'
dbms=dlm;
delimiter=',';
run;
ballardw
Super User

@mmhxc5 wrote:

@Reezathank you!

Here is my code that is doing the job one by one. I want this code to be automated to all at once.


Did you try my example?

Reeza
Super User

If you wanted to automate that, you could turn it into a macro and then call it for each year. 

 

Here's a tutorial for that. 

https://github.com/statgeek/SAS-Tutorials/blob/master/Turning%20a%20program%20into%20a%20macro.md

 

@ballardw  solution will work as well, but I'm not a huge fan of running code I don't understand. 

ballardw
Super User

@Reeza wrote:

If you wanted to automate that, you could turn it into a macro and then call it for each year. 

 

Here's a tutorial for that. 

https://github.com/statgeek/SAS-Tutorials/blob/master/Turning%20a%20program%20into%20a%20macro.md

 

@ballardw  solution will work as well, but I'm not a huge fan of running code I don't understand. 


@Reeza

 

My example is basically a buffer manipulation. The Input places the entire line of data into the automatic variable _infile_ then depending on whether we need a header row or not appends a comma, since the OP says the file is CSV, and either the text YEAR for a header or the value of a year.

 

The FILE statement documentation even has hints on this:

Updating an External File in Place

You can use the FILE statement with the INFILE and PUT statements to update an external file 
in place, updating either an entire record or only selected fields within a record.
Follow these guidelines: Always place the INFILE statement first. Specify the same fileref or physical filename in the INFILE and FILE statements. Use options that are common to both the INFILE and FILE statements
in the INFILE statement. (Any such options that are used in the FILE statement are ignored.) Use the SHAREBUFFERS option in the INFILE statement to allow the INFILE and FILE statements
to use the same buffer, which saves CPU time and enables you to update individual fields
instead of entire records.

I could have used the _file_ variable as well instead of the variable.

 

Reeza
Super User

@ballardw I understand your code, I wasn't sure the OP would, so s/he shouldn't run it :). 

 

Rationale for this type of 'rule' is that if you don't understand it enough you cannot make changes or understand what's happening when something goes wrong. Edge cases can also slip through if you're not checking your cases correctly. 

 

Your solution is the one I would recommend as being the most correct. However, the macro approach will work as well. 

ballardw
Super User

@Reeza wrote:

@ballardw I understand your code, I wasn't sure the OP would, so s/he shouldn't run it :). 

 

Rationale for this type of 'rule' is that if you don't understand it enough you cannot make changes or understand what's happening when something goes wrong. Edge cases can also slip through if you're not checking your cases correctly. 

 

Your solution is the one I would recommend as being the most correct. However, the macro approach will work as well. 


Understood. And actually the write out of _infile_ is partially to avoid any of the issues that arise when "best" formats may get applied from proc import and such things as account numbers.

Reeza
Super User

Yeah, PROC IMPORT definitely has the possibility of not reading in the data correctly and then consequently writing it out incorrectly. 

 


@ballardw wrote:

@Reeza wrote:

@ballardw I understand your code, I wasn't sure the OP would, so s/he shouldn't run it :). 

 

Rationale for this type of 'rule' is that if you don't understand it enough you cannot make changes or understand what's happening when something goes wrong. Edge cases can also slip through if you're not checking your cases correctly. 

 

Your solution is the one I would recommend as being the most correct. However, the macro approach will work as well. 


Understood. And actually the write out of _infile_ is partially to avoid any of the issues that arise when "best" formats may get applied from proc import and such things as account numbers.


 

ballardw
Super User

I strongly suggest backing up you text files before you start such things. Any error on your part could destroy your original file.

 

Will you need to insert a column header?

data _null_;
   infile "<path>\id1922.txt" lrecl=10000;/* needs to be at least as long as longest input line*/
   file "<path>\newid1922.txt";
   input;
   if _n_=1 then outline= catx(',',_infile_,'Year');
   else outline=catx(',',_infile_,'1992');
   put outline;
run;

Replace path with your location. If you write to a different folder then you could use the same file name.

This provides a header row. If you don't need one remove the IF - else only have the last outline.

 

If this is sufficient then steps can be made to automate this.

VDD
Ammonite | Level 13 VDD
Ammonite | Level 13
%macro runner(yy=);
	proc import datafile="E:\Idaho\ID&yy..txt"
		dbms=dlm out=work.test;
		delimiter =",";
		getnames=yes;
		guessingrows=100000;
	run;
	data Work.ID&yy.ay;
		set test;
		Inspection_year=&yy.;
	run;
	proc export data=Work.ID&yy.ay replace
		outfile = "E:\Ad_Year\ID&yy.ay"
/* or do you want the output file to be Text
use this
Outfile = "E:\Ad_Year\ID&yy.ay.txt"
*/ dbms=dlm; delimiter=','; run;
proc datasets lib=work;
delete test;
quit; %mend runner; %runner(yy=1992); %runner(yy=1993); . . %runner(yy=2017);

Thank you @Reeza for the eagle eye added a delete. 

Reeza
Super User
@VDD, should probably drop the test data set at the end of the loop. Otherwise if the import files, it'll use the previous years data.
mmhxc5
Quartz | Level 8

@VDD, thank you for your nice code. I run the code, it works well, but it adds only the Inspection_Year for 2017. Other years are blank. Could you please look at the code and see if there is any error? I really appreciate it. Thank you!

Reeza
Super User
Did you add in the macro call for the other years?
mmhxc5
Quartz | Level 8

@Reeza, Yes, I added the macros for the other years. Here is the complete code I run, it works neatly, but only adds the Inspection_Year for 2017 and the other years are left blank.

 

%macro runner(yy=);
	proc import datafile="E:\Desktop\Idaho\ID&yy..txt"
		dbms=dlm out=work.test;
		delimiter =",";
		getnames=yes;
		guessingrows=1000000;
	run;
	data Work.ID&yy.ay;
		set test;
		Inspection_year=&yy.;
	run;
	proc export data=Work.ID&yy.ay replace
		outfile = "E:\Desktop\Idaho\Ad_ear\ID&yy.txt"
		dbms=dlm;
		delimiter=',';
	run;proc datasets lib=work;delete test;quit;
%mend runner;
%runner(yy=1992);
%runner(yy=1993);
%runner(yy=1994);
%runner(yy=1995);
%runner(yy=1996);
%runner(yy=1997);
%runner(yy=1998);
%runner(yy=1999);
%runner(yy=2000);
%runner(yy=2001);
%runner(yy=2002);
%runner(yy=2003);
%runner(yy=2004);
%runner(yy=2005);
%runner(yy=2006);
%runner(yy=2007);
%runner(yy=2008);
%runner(yy=2009);
%runner(yy=2010);
%runner(yy=2011);
%runner(yy=2012);
%runner(yy=2013);
%runner(yy=2014);
%runner(yy=2015);
%runner(yy=2016);
%runner(yy=2017);

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 21 replies
  • 4256 views
  • 7 likes
  • 6 in conversation