I have another question for anyone willing to help.
I have 4 files which I input on a regular basis. One is a record of phone calls coming in from source A, one from source B, then one is old phone calls from source A and one from source B. They have the exact same column headers and data in each sheet. I don't want to just combine these because the old calls builds upon itself week over week, while the records are only of the past week.
Currently I just did the basic import, and I have it copied 4 times over. (DATA WORK.Lag_1; etc..)
If I changed it to DATA WORK.&filename could I get this to rotate through &filename1, &filename2, &filename3, &filename4?
Then just change the INFILE to &data1, &data2... to rotate through respectively?
I did a google search... couldn't find much that I understood... still learning. I appreciate everyone's help here!
SAS has an AGE statement used with PROC DATASETS. You would input a WORK.filename0 (or something like that for consistency), then when finished processing, then SORT or whatever to copy out to your permanent SAS data library keeping the same member-name, and then use AGE to allow SAS to cycle up each of the file-names.
Suggested Google advanced search argument, this topic / post:
you know the data structure (proc import has to make judgements)
so, just write the simplest of routines to load your data
(you might find the code generated by proc import is helpful, but I find it verbose to the point of becoming unhelpful - imho)
Adapt the infile statement with "*" and/or "?" to pick up the 4 input files. [pre]infile '/paths/path/*calls.txt' lrecl=10000 dsd ;[/pre]That is designed to pick up files A_calls.txt, B_calls.txt C_calls.txt etc in folder /paths/path
There is another option for the INFILE statement that can indicate from which file you are reading as you go through all 4 inputs. There are examples in these forums.
With this approach you would collect one file (a SAS data set) each week holding all new call data. analysing more than the latest week
The SAS system data sets have a feature referred to as "generation data groups". Setting the option MAXGEN=53, you could store and refer to any information for the last 12 months.
By default the latest generation (i.e. latest week of data) is referred to by the data name like: [pre] proc print data = calls.weeks ; run;[/pre]The individual generations are referred like[pre] proc print data = calls.weeks(gennum=-2) ; run;[/pre]which refers to the data for the week before last.
To put the last 4 weeks call data into one table [pre]data ;
by account calltime ;
run;[/pre](The BY statement interleaves the weekly data - it is like the final stage of SORT - here it delivers the data for the last 4 weeks for each account in account and calltime order, rather than the whole of each week before starting the next week. It does require that the weekly data are stored in that sorted order.
When analysing larger volumes of data create a VIEW rather than a DATA set. A VIEW is a process.
To suit your situation when you want to analyse more than the latest week, the process would interleave all the call history data you want.
First collect the information of the number of generations currently stored
proc sql noprint ;
select min(gen,maxgen-1) into :earliest separated by ' ' from dictionary.tables
where libname= 'CALLS' and memname ='WEEKS'
/* substitute your library and data set name here*/
That provides the earliest generation group reference ( once the generation history is full the earliest reference is one less than genmax )
Then you need to generate the list of generation data group references (like calls.weeks(gennum=-12) which is for the first of the last 13 weeks)
I have a pet macro for this kind of generation thing [pre]%macro gen(n, pattern=###, from=1) ;
%do i= &from %to &n ;
%sysfunc( tranwrd( %superq(pattern), ###, &i ))
%mend ;[/pre] which I use here like[pre]data year /view=year ;
set %gen( &earliest, from=0, pattern= calls.weeks( gennum=-###) ) ;
by ACCOUNT_NO CALL_TIME ;
(assumes your weekly data would be stored, sorted by account_no and call_time.)
That data view "work.YEAR" is a process that returns all your call history data stored in these generation data sets
Here is a macro I used to read the same 8 variables (SchNum Last First Middle DOB Sex Grade Race) from 4 different csv files which have the same column headers.
%macro crs (sch=);
informat DOB yymmdd.;
format Schnum $3.Last $40.First $20. Middle $20. Sex $40. Grade $2. Race $40. ;
infile "G:\Departments\Research\MAP\0607\CRF_Winter2007\&sch..csv" dsd missover lrecl=32767 firstobs=2;
input SchNum Last First Middle DOB Sex Grade Race;