I want to combine 12 datasets--only those with June and July for past 6 years

Reply
Frequent Contributor
Posts: 75

I want to combine 12 datasets--only those with June and July for past 6 years

I need June and July data (which are in separate files) for this year and the 5 preceding years. 
 
Right now it's being done with a macro but it's tedious since I have to update each line of the %mend date. 
 
Also, the current month's data is in a separate folder from the others.
 
Then, if possible, what I want to do is take the average change in 'var1' (number of attendees, let's say) between June and July over the previous 5 years.
 
Any help would be greatly appreciated. I am not committed to the macro method if there's a better way. 
 
 
 
 
libname output   "\\folder";
 
%macro date(y,m);
 
 
data FILE&y.&m;
        infile "\\file&m.&y..prn" lrecl=430 missover;
        input
                @ ;
                %include "\\statement.txt"              ;
 run;
   
%mend date;
%date(11,06); %date(11,07);
%date(12,06); %date(12,07);
%date(13,06); %date(13,07);
%date(14,06); %date(14,07);
%date(15,06); %date(15,07);
%date(16,06); /*%date(15,10) ---Can't include this one because it's in a different folder*/
;
 
 
 
Respected Advisor
Posts: 4,993

Re: I want to combine 12 datasets--only those with June and July for past 6 years

It looks like you might be adding to your use of macros, rather than reducing it.  Consider this macro:

 

%macro dates (month=, start_yr=, end_yr=);

   %local yr;

   %do yr=&start_yr %to &end_yr;

      %date (&yr, &month)

   %end;

%mend;

 

Now you would have to call this macro just once to loop through all the Junes:

 

%dates (month=06, start_yr=11, end_yr=16)

 

Is that easier?  Perhaps.

SAS Employee
Posts: 24

Re: I want to combine 12 datasets--only those with June and July for past 6 years

Hello.

 

To read in your data, I suggest you use a macro %do loop in place of calling the macro over and over. I worked up an example of the logic you can use:

 

options mprint symbolgen;
%macro readjunjul;
data _null_;
yr=year(today())-2000; /* to get a 2 digit year*/
call symput('thisyear', left(put(yr,8.)));
call symput('lastyear',left(put(yr-5,8.)));
call symput ("month1",'06');
call symput ("month2",'07');
stop;
run;

%do i=&lastyear %to &thisyear;
%do j=1 %to 2;
data _null_;
put "&i";
put "&&month&j.."//;
run;
%end;
%end;
%mend;
%readjunjul;

 

Replace the second data null step with your data step that reads in the text file. 

You can use similar logic to create one data set that contains all the data. Then to compute your average change you can follow the logic from this example in the PROC MEANS documentation:

 

https://support.sas.com/documentation/cdl/en/proc/68954/HTML/default/viewer.htm#p074ebrvnxe5yan1ii4r...

 

Michelle

 

 

 

Frequent Contributor
Posts: 75

Re: I want to combine 12 datasets--only those with June and July for past 6 years

Thanks so much. I've been playing with this to get it to work. I am still running into two problems:

 

1. the current month's dataset is kept in a different folder from the previous months'. So I tried to add a condition to the data step, but it is not working. Just tries to take everything from the pre-release folder (see text in red):

 

options mprint symbolgen;
%macro readjunjul;
data _null_;
 
/* UPDATE TO PREVIOUS MONTH --> */ call symput ("month1",'06');
/* UPDATE TO CURRENT MONTH -->  */ call symput ("month2",'07');
yr=year(today())-2000; /* to get a 2 digit year*/
call symput('thisyear', left(put(yr,8.)));
call symput('lastyear',left(put(yr-5,8.)));
 
stop;
run;
 
%do i=&lastyear %to &thisyear;
%do j=1 %to 2;
 
data TAB&&month&j..&i.;
 
if &i.=16 and &j.=2 then infile "\\lhs4\prerelease\ANALYSIS\DATA\TABS\20&i.\tab&&month&j..&i..prn" lrecl=430 missover;
else infile "\\fld8filer\Tabs\tab&&month&j..&i..prn" lrecl=430 missover;
 
input
                @6              PROV                    2.                      /* PROV */
                @29             SYEAR                   4.                      /* Year */
                @33             SMNTH                   2.                      /* Month */
                @29             SDATE                   6.                      /* Survey year and month  - yyyymm */
 
                @122    SEX                             $CHAR1.         /* Sex of respondent. */
                @140    NAICS                   2.                      /* Industry 2-digit */
                @140    NAICS4                  4.                      /* Industry 4-digit */
                @144    SOC4                    $CHAR4.         /* occupation 4-digit */
;
                if missing(AGE_TABS) then AGEGROUP_main = .;
                        else if AGE_TABS        LT 15 then AGEGROUP_main = 1; /* LT 15 */
                        else if 15 LE AGE_TABS LT 25 then AGEGROUP_main = 2;    /*15-24 */
                        else if 25 LE AGE_TABS LT 55 then AGEGROUP_main = 3;    /* 25-54 */
                        else AGEGROUP_main = 4; /* 55+ */
 
put "&i";
put "&&month&j.."//;
run;
 
%end;
%end;
%mend;
%readjunjul;
 
 
FOLLOW-UP QUESTION #2: I am trying to get the average change between Jun and Jul for the past 5 years (not the average level). So in excel it would be Average(jul11-jun11,jul12-jun12,jul13-jun13,jul14-jun14,jul15-jun15) and then subtract the current month's level change from that average (jul16-jun16)-Average(jul11-jun11,jul12-jun12,jul13-jun13,jul14-jun14,jul15-jun15).
 
Ideally i'd end up with 2 columns in a final database with these two numbers for each occupation. Thanks again!
Frequent Contributor
Posts: 75

Re: I want to combine 12 datasets--only those with June and July for past 6 years

Also, how can I combine these 12 datasets into 1 final dataset (ideally with the columns for the change between months).

Grand Advisor
Posts: 17,396

Re: I want to combine 12 datasets--only those with June and July for past 6 years

Your expanding your question when you don't have your first one complete....one step at a time. 

 

Here's a method that will work, you should fix the filename list to be reflective of your file names. But it works without macros and is easier to understand and read. It's untested so there's probably errors, but it should be relatively easy to debug.

 

The data step creates a file list with the list of file names. 

The second data step reads all the files into one dataset, with a variable that identifies the source. 

data file_list;
	do i=1 to 16;

		do j=1 to 2;

			if i=16 and j=2 then
				file_name=catt("\\lhs4\prerelease\ANALYSIS\DATA\TABS\20", put(i, z2.), 
					"\tabmonth", j, ".prn");
			else
				filename=catt("\\fld8filer\Tabs\tabmonth", j, put(i, z2.), ".prn");
			output;
		end;
	end;
run;

filename placehold;

data want;
	set file_list;
	infile placehold filevar=file_name filename=filename eov=eov lrecl=430 
		missover;
	input @;

	if _n_ eq 1 or eov then
		do;
			txt_file_name=scan(filename, -1, "\");
			eov=0;
		end;
	else
		input
                @6 PROV 2.                      /* PROV */
                @29 SYEAR 4.                      /* Year */
                @33 SMNTH 2.                      /* Month */
                @29 SDATE 
			6.                      /* Survey year and month  - yyyymm */
 
                @122 
			SEX                             $CHAR1.         /* Sex of respondent. */
                @140 NAICS 
			2.                      /* Industry 2-digit */
                @140 NAICS4 
			4.                      /* Industry 4-digit */
                @144 
			SOC4                    $CHAR4.         /* occupation 4-digit */;

	if missing(AGE_TABS) then
		AGEGROUP_main=.;
	else if AGE_TABS LT 15 then
		AGEGROUP_main=1;

	/* LT 15 */
	else if 15 LE AGE_TABS LT 25 then
		AGEGROUP_main=2;

	/*15-24 */
	else if 25 LE AGE_TABS LT 55 then
		AGEGROUP_main=3;

	/* 25-54 */
	else
		AGEGROUP_main=4;

	/* 55+ */
run;

 

 

Grand Advisor
Posts: 17,396

Re: I want to combine 12 datasets--only those with June and July for past 6 years

[ Edited ]

Did you know that you can include multiple files in a filename statement and read them all in at once? 

Especially if they have the same format. 

 

Filename myfiles ('file1' 'file2' 'file3');

 

And then in the infile statement include the myfiles in place of the file path. 

 

Infile myfiles ...;

 

 

Or you could dynamically build your filename and use the FILEVAR option to read them all at once and not have a macro at all. 

 

Example 10 in the documentation

http://support.sas.com/documentation/cdl/en/lestmtsref/68024/HTML/default/viewer.htm#n1rill4udj0tfun...

 

If you use the dynamic option you can programmatically change it for the last record so although the path is different the same process will work. Use an If/Then. 

Grand Advisor
Posts: 17,396

Re: I want to combine 12 datasets--only those with June and July for past 6 years

I think that may be File not filename in the first option. 

Ask a Question
Discussion stats
  • 7 replies
  • 370 views
  • 0 likes
  • 4 in conversation