Russell
Calcite | Level 5
Hey everyone,
I have another question for anyone willing to help.

I have 4 files which I import on a regular basis. One is a record of phone calls coming in from source A, one from source B, then one is old phone calls from source A and one from source B. They have exactly the same column headers and data layout in each sheet. I don't want to simply combine these, because the old-calls files build on themselves week over week, while the new records cover only the past week.

Currently I just did the basic import, and I have it copied 4 times over (DATA WORK.Lag_1; etc.).

If I changed it to DATA WORK.&filename could I get this to rotate through &filename1, &filename2, &filename3, &filename4?

Then just change the INFILE to &data1, &data2... to rotate through respectively?

I did a google search... couldn't find much that I understood... still learning. I appreciate everyone's help here!
6 REPLIES
sbb
Lapis Lazuli | Level 10
SAS has an AGE statement used with PROC DATASETS. You would read into a WORK.filename0 (or something like that, for consistency), then when finished processing, SORT or otherwise copy out to your permanent SAS data library keeping the same member name, and then use AGE to let SAS cycle each of the member names up.
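For example, here is a minimal sketch of the AGE statement (the library name CALLS and the member names WEEK0-WEEK4 are hypothetical). AGE renames the first member to the second, cascades each remaining name to the next, and deletes the last one:

[pre]proc datasets library=calls nolist;
   /* week0 becomes week1, week1 becomes week2, ...,
      and the oldest member (week4) is deleted */
   age week0 week1 week2 week3 week4;
quit;[/pre]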

Scott Barry
SBBWorks, Inc.

Suggested Google advanced search argument, this topic / post:

proc datasets age site:sas.com
Ksharp
Super User
The answer is yes.
Here is some code I wrote a while ago:


[pre]
%let subdir=D:\sasdata\peter\patrick\data\;
filename dir pipe "dir &subdir.*.xls /B";
data new;
infile dir truncover end=last;
input filename $100.;
filename=cats("&subdir",filename);
call symputx(cats('path',_n_),filename);
call symputx(cats('dsn',_n_),scan(scan(filename,6,'\'),1,'.'));
if last then call symputx('nobs',_n_);
run;
%put _user_;
%macro import;
%do i=1 %to &nobs;
proc import datafile="&&path&i" out=&&dsn&i dbms=excel replace;
getnames=yes;
mixed=yes;
run;
%end;
%mend import;

%import


[/pre]




Ksharp
Russell
Calcite | Level 5
Thanks Scott and KSharp... KSharp... quick question for you.

I've seen something like this in my searches... but where do I define the column names, lengths, formats, etc? Is it just after this block of code?

I used the Import Data function and just copied the code... so it starts with:

DATA WORK.rawdata;
LENGTH
(columns defined)

FORMAT
(columns defined)

INFORMAT
(columns defined)

INFILE ""
LRECL=350
FIRSTOBS = 2
ENCODING="WLATIN1"
TERMSTR=CRLF
DLM=','
MISSOVER
DSD ;
INPUT
(columns defined)
RUN;

Would I replace the whole INFILE statement with your code or what?

I realize these are really basic questions and I thank you all for taking your time to help me... this is my first job out of college and I've been here 2 weeks, trying to teach myself.

And I figure the wildcard is what selects every file in that directory... but where is it getting the names under which the data will be stored once imported?

Message was edited by: Russell
Ksharp
Super User
OK.
Here is another piece of code I wrote a while ago.

[pre]
%let subdir=D:\sasdata\peter\patrick\;
filename dir pipe "dir &subdir.*.csv /B";
data new;
infile dir truncover end=last;
input filename $100.;
filename=cats("&subdir",filename);
call symputx(cats('path',_n_),filename);
call symputx(cats('dsn',_n_),scan(scan(filename,5,'\'),1,'.'));
if last then call symputx('nobs',_n_);
run;
%put _user_;

%macro import;
%do i=1 %to &nobs;

data WORK.&&dsn&i;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile "&&path&i" delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
informat VAR1 best32. ;
informat VAR2 $30. ;
informat VAR3 comma30. ;
informat VAR4 comma30.;
informat VAR5 comma30.;
informat VAR6 comma30.;
informat VAR7 comma30.;
informat VAR8 comma30.;
informat VAR9 best12.;
format VAR1 best12. ;
format VAR2 $30. ;
format VAR3 best32.2 ;
format VAR4 best32.2;
format VAR5 best32.2;
format VAR6 best32.2;
format VAR7 best32.2;
format VAR8 best32.2;
format VAR9 best12. ;
input
VAR1
VAR2
VAR3
VAR4
VAR5
VAR6
VAR7
VAR8
VAR9
;


if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
run;
%end;
%mend import;

%import

[/pre]



About the wildcard: if you want every file whose name starts with 'makedata', then use the command dir c:\makedata* ; it will return the names makedata1, makedata2, and so on.

Ksharp
Peter_C
Rhodochrosite | Level 12
You know the data structure (PROC IMPORT has to make judgements about it), so just write the simplest of routines to load your data.
(You might find the code generated by PROC IMPORT helpful, but I find it verbose to the point of becoming unhelpful - imho.)

Adapt the INFILE statement with "*" and/or "?" to pick up the 4 input files. [pre]infile '/paths/path/*calls.txt' lrecl=10000 dsd ;[/pre]That is designed to pick up files A_calls.txt, B_calls.txt, C_calls.txt etc. in folder /paths/path.

There is another option for the INFILE statement that can indicate which file you are reading from as you go through all 4 inputs. There are examples in these forums.
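One such option is FILENAME= (often paired with EOV=, which flags the first record of each file after the first). A minimal sketch, assuming the wildcard infile above; the variable names and the input informats here are made up for illustration:

[pre]data calls;
   length src $200 source $200;
   /* firstobs=2 skips the header row of the first file only;
      eov= flags the first record of files 2, 3, 4, ... */
   infile '/paths/path/*calls.txt' dsd lrecl=10000
          filename=src eov=newfile firstobs=2;
   input account :$20. calltime :datetime20.;
   source = src;          /* keep the name of the file this record came from */
   if newfile then do;    /* first record of a later file is its header row */
      newfile = 0;        /* reset the flag ourselves; SAS does not */
      delete;             /* drop the header record */
   end;
run;[/pre]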

With this approach you would collect one file (a SAS data set) each week holding all new call data.
analysing more than the latest week
SAS data sets have a feature referred to as "generation data groups". By setting the data set option GENMAX=53 when the data set is created, you could store and refer to the information for the last 12 months (53 weekly generations).
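A minimal sketch of how GENMAX= might be set when the data set is first created (the library CALLS, member WEEKS, and source data set WORK.THISWEEK are only examples):

[pre]/* request that up to 53 historical generations be kept */
data calls.weeks(genmax=53);
   set work.thisweek;   /* this week's new call records */
run;[/pre]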
By default the latest generation (i.e. latest week of data) is referred to by the data name like: [pre] proc print data = calls.weeks ; run;[/pre]The individual generations are referred like[pre] proc print data = calls.weeks(gennum=-2) ; run;[/pre]which refers to the data for the week before last.

To put the last 4 weeks call data into one table [pre]data ;
set calls.weeks(gennum=0)
calls.weeks(gennum=-1)
calls.weeks(gennum=-2)
calls.weeks(gennum=-3) ;
by account calltime ;
run;[/pre](The BY statement interleaves the weekly data - it is like the final stage of SORT - here it delivers the data for the last 4 weeks for each account in account and calltime order, rather than the whole of each week before starting the next week. It does require that the weekly data are stored in that sorted order.)

When analysing larger volumes of data create a VIEW rather than a DATA set. A VIEW is a process.
To suit your situation when you want to analyse more than the latest week, the process would interleave all the call history data you want.
First collect the number of generations currently stored: [pre]proc sql noprint ;
select min(gen,maxgen-1) into :earliest separated by ' ' from dictionary.tables
where libname= 'CALLS' and memname ='WEEKS'
/* substitute your library and data set name here */
;
quit;[/pre]
That provides the earliest generation-group reference (once the generation history is full, the earliest reference is one less than GENMAX).
Then you need to generate the list of generation data group references (like calls.weeks(gennum=-12), which refers to the first of the last 13 weeks).
I have a pet macro for this kind of generation thing [pre]%macro gen(n, pattern=###, from=1) ;
%do i= &from %to &n ;
%sysfunc( tranwrd( %superq(pattern), ###, &i ))
%end ;
%mend ;[/pre] which I use here like[pre]data year /view=year ;
set %gen( &earliest, from=0, pattern= calls.weeks( gennum=-###) ) ;
by ACCOUNT_NO CALL_TIME ;
run;[/pre]
(assumes your weekly data would be stored, sorted by account_no and call_time.)

That data view "work.YEAR" is a process that returns all the call history data stored in these generation data sets.
GreggB
Pyrite | Level 9
Here is a macro I used to read the same 8 variables (SchNum Last First Middle DOB Sex Grade Race) from 4 different csv files which have the same column headers.
*********************************************************************************************
%macro crs (sch=);
data &sch;
informat DOB yymmdd.;
format Schnum $3. Last $40. First $20. Middle $20. Sex $40. Grade $2. Race $40. ;
infile "G:\Departments\Research\MAP\0607\CRF_Winter2007\&sch..csv" dsd missover lrecl=32767 firstobs=2;
input SchNum Last First Middle DOB Sex Grade Race;
run;
%mend;
%crs(sch=oms);%crs(sch=sms);%crs(sch=tsms);%crs(sch=wmms);

