Hi,
I would like to post my question again, because the last time I did not get a satisfactry answer.
I would like to add variable YEAR from 1992 to 2017 based on the filename for each file available in comma delimited .txt type. The file names are ID1992, ID1993, ..., ID2017,and export it back as its original file, that is .txt.
Each file has 137 variables and more than 50,000 rows of data - character and numeric. I don't want to interfere within the files, but just want to add another variable YEAR for each row of each file.
I appreciate your time and help.
Thanks,
%macro runner(yy=);
proc import datafile="E:\Idaho\ID&yy..txt"
dbms=dlm out=work.test;
delimiter =",";
getnames=yes;
guessingrows=100000;
run;
data Work.ID&yy.ay;
set test;
Inspection_year=&yy.;
run;
proc export data=Work.ID&yy.ay replace
outfile = "E:\Ad_Year\ID&yy.ay"
/* or do you want the output file to be Text
use this
Outfile = "E:\Ad_Year\ID&yy.ay.txt"
*/
dbms=dlm;
delimiter=',';
run;
proc datasets lib=work;
delete test;
quit;
%mend runner;
%runner(yy=1992);
%runner(yy=1993);
.
.
%runner(yy=2017);
Thank you @Reeza for the eagle eye added a delete.
@Reezathank you!
Here is my code that is doing the job one by one. I want this code to be automated to all at once.
proc import datafile='E:\Idaho\ID2017.txt'
dbms=dlm out=work.test;
delimiter =",";
getnames=yes;
guessingrows=100000;
run;
data Work.ID2017ay;
set test;
Inspection_year=2017;
run;
proc export data=Work.ID2017ay replace
outfile = 'E:\Ad_Year\ID2017ay'
dbms=dlm;
delimiter=',';
run;
If you wanted to automate that, you could turn it into a macro and then call it for each year.
Here's a tutorial for that.
https://github.com/statgeek/SAS-Tutorials/blob/master/Turning%20a%20program%20into%20a%20macro.md
@ballardw solution will work as well, but I'm not a huge fan of running code I don't understand.
@Reeza wrote:
If you wanted to automate that, you could turn it into a macro and then call it for each year.
Here's a tutorial for that.
https://github.com/statgeek/SAS-Tutorials/blob/master/Turning%20a%20program%20into%20a%20macro.md
@ballardw solution will work as well, but I'm not a huge fan of running code I don't understand.
My example is basically a buffer manipulation. The Input places the entire line of data into the automatic variable _infile_ then depending on whether we need a header row or not appends a comma, since the OP says the file is CSV, and either the text YEAR for a header or the value of a year.
The FILE statement documentation even has hints on this:
Updating an External File in Place You can use the FILE statement with the INFILE and PUT statements to update an external file
in place, updating either an entire record or only selected fields within a record.
Follow these guidelines: Always place the INFILE statement first. Specify the same fileref or physical filename in the INFILE and FILE statements. Use options that are common to both the INFILE and FILE statements
in the INFILE statement. (Any such options that are used in the FILE statement are ignored.) Use the SHAREBUFFERS option in the INFILE statement to allow the INFILE and FILE statements
to use the same buffer, which saves CPU time and enables you to update individual fields
instead of entire records.
I could have used the _file_ variable as well instead of the variable.
@ballardw I understand your code, I wasn't sure the OP would, so s/he shouldn't run it :).
Rationale for this type of 'rule' is that if you don't understand it enough you cannot make changes or understand what's happening when something goes wrong. Edge cases can also slip through if you're not checking your cases correctly.
Your solution is the one I would recommend as being the most correct. However, the macro approach will work as well.
@Reeza wrote:
@ballardw I understand your code, I wasn't sure the OP would, so s/he shouldn't run it :).
Rationale for this type of 'rule' is that if you don't understand it enough you cannot make changes or understand what's happening when something goes wrong. Edge cases can also slip through if you're not checking your cases correctly.
Your solution is the one I would recommend as being the most correct. However, the macro approach will work as well.
Understood. And actually the write out of _infile_ is partially to avoid any of the issues that arise when "best" formats may get applied from proc import and such things as account numbers.
Yeah, PROC IMPORT definitely has the possibility of not reading in the data correctly and then consequently writing it out incorrectly.
@ballardw wrote:
@Reeza wrote:
@ballardw I understand your code, I wasn't sure the OP would, so s/he shouldn't run it :).
Rationale for this type of 'rule' is that if you don't understand it enough you cannot make changes or understand what's happening when something goes wrong. Edge cases can also slip through if you're not checking your cases correctly.
Your solution is the one I would recommend as being the most correct. However, the macro approach will work as well.
Understood. And actually the write out of _infile_ is partially to avoid any of the issues that arise when "best" formats may get applied from proc import and such things as account numbers.
I strongly suggest backing up you text files before you start such things. Any error on your part could destroy your original file.
Will you need to insert a column header?
data _null_; infile "<path>\id1922.txt" lrecl=10000;/* needs to be at least as long as longest input line*/ file "<path>\newid1922.txt"; input; if _n_=1 then outline= catx(',',_infile_,'Year'); else outline=catx(',',_infile_,'1992'); put outline; run;
Replace path with your location. If you write to a different folder then you could use the same file name.
This provides a header row. If you don't need one remove the IF - else only have the last outline.
If this is sufficient then steps can be made to automate this.
%macro runner(yy=);
proc import datafile="E:\Idaho\ID&yy..txt"
dbms=dlm out=work.test;
delimiter =",";
getnames=yes;
guessingrows=100000;
run;
data Work.ID&yy.ay;
set test;
Inspection_year=&yy.;
run;
proc export data=Work.ID&yy.ay replace
outfile = "E:\Ad_Year\ID&yy.ay"
/* or do you want the output file to be Text
use this
Outfile = "E:\Ad_Year\ID&yy.ay.txt"
*/
dbms=dlm;
delimiter=',';
run;
proc datasets lib=work;
delete test;
quit;
%mend runner;
%runner(yy=1992);
%runner(yy=1993);
.
.
%runner(yy=2017);
Thank you @Reeza for the eagle eye added a delete.
@VDD, thank you for your nice code. I run the code, it works well, but it adds only the Inspection_Year for 2017. Other years are blank. Could you please look at the code and see if there is any error? I really appreciate it. Thank you!
@Reeza, Yes, I added the macros for the other years. Here is the complete code I run, it works neatly, but only adds the Inspection_Year for 2017 and the other years are left blank.
%macro runner(yy=);
proc import datafile="E:\Desktop\Idaho\ID&yy..txt"
dbms=dlm out=work.test;
delimiter =",";
getnames=yes;
guessingrows=1000000;
run;
data Work.ID&yy.ay;
set test;
Inspection_year=&yy.;
run;
proc export data=Work.ID&yy.ay replace
outfile = "E:\Desktop\Idaho\Ad_ear\ID&yy.txt"
dbms=dlm;
delimiter=',';
run;proc datasets lib=work;delete test;quit;
%mend runner;
%runner(yy=1992);
%runner(yy=1993);
%runner(yy=1994);
%runner(yy=1995);
%runner(yy=1996);
%runner(yy=1997);
%runner(yy=1998);
%runner(yy=1999);
%runner(yy=2000);
%runner(yy=2001);
%runner(yy=2002);
%runner(yy=2003);
%runner(yy=2004);
%runner(yy=2005);
%runner(yy=2006);
%runner(yy=2007);
%runner(yy=2008);
%runner(yy=2009);
%runner(yy=2010);
%runner(yy=2011);
%runner(yy=2012);
%runner(yy=2013);
%runner(yy=2014);
%runner(yy=2015);
%runner(yy=2016);
%runner(yy=2017);
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.