Hello everyone!
I’ve just been hired by a newspaper to do statistical analyses of their sales. I have SAS experience from my studies, but mostly on the statistical side. Alas, my SAS “data” skills are not yet good enough to build a solution that will last.
Here’s my problem:
I get a batch of data in flat files once a week, and I’m interested in how a number of variables change between these weekly data dumps.
The data are organized by client, each identified by a so-called “BP” number (e.g. 2000008939). There are around 50000 unique BP numbers in my data. Every BP number has a number of variables attached, such as shipping address, campaign code, etc.
Now imagine that from one week to the next, some clients leave the newspaper and new ones join. That means the set of BP numbers changes from week to week, and the BP number is what I want to match by. So far, I’ve managed to do it with simple dummy variables (say wea1 = 1 for an active subscription in week one, wea2 = 1 for an active subscription in week two, and so forth).
For now I have only 5 weeks of data, and 3 variables of interest, and thus it’s manageable with a dummy for each week, and a dummy for each variable.
BUT…
Imagine that in a few months I have many more weeks of data, each with around 7-8 variables of interest. That will lead to a dummy mess I can’t begin to imagine.
My problem is that I end up with lots of code that looks like this:
data wea100716;
set wea100716;
if abb = 'CAMPAIGN' then camp1 = 1;
if abb = 'NORMAL' then normal1 = 1;
if abb = 'FREE' then free1 = 1;
if abb = 'TRIAL' then trial1 = 1;
if abb = 'UNKNOWN' then unknown1 = 1;
if abb = 'BUNDLING' then bund1 = 1;
if abb = 'MEMBER G' then mem1 = 1;
run;quit;
data wea100723;
set wea100723;
if abb = 'CAMPAIGN' then camp2 = 1;
if abb = 'NORMAL' then normal2 = 1;
if abb = 'FREE' then free2 = 1;
if abb = 'TRIAL' then trial2 = 1;
if abb = 'UNKNOWN' then unknown2 = 1;
if abb = 'BUNDLING' then bund2 = 1;
if abb = 'MEMBER G' then mem2 = 1;
run;quit;
data wea100730;
set wea100730;
if abb = 'CAMPAIGN' then camp3 = 1;
if abb = 'NORMAL' then normal3 = 1;
if abb = 'FREE' then free3 = 1;
if abb = 'TRIAL' then trial3 = 1;
if abb = 'UNKNOWN' then unknown3 = 1;
if abb = 'BUNDLING' then bund3 = 1;
if abb = 'MEMBER G' then mem3 = 1;
run;quit;
data wea100806;
set wea100806;
if abb = 'CAMPAIGN' then camp4 = 1;
if abb = 'NORMAL' then normal4 = 1;
if abb = 'FREE' then free4 = 1;
if abb = 'TRIAL' then trial4 = 1;
if abb = 'UNKNOWN' then unknown4 = 1;
if abb = 'BUNDLING' then bund4 = 1;
if abb = 'MEMBER G' then mem4 = 1;
run;quit;
data wea100813;
set wea100813;
if abb = 'CAMPAIGN' then camp5 = 1;
if abb = 'NORMAL' then normal5 = 1;
if abb = 'FREE' then free5 = 1;
if abb = 'TRIAL' then trial5 = 1;
if abb = 'UNKNOWN' then unknown5 = 1;
if abb = 'BUNDLING' then bund5 = 1;
if abb = 'MEMBER G' then mem5 = 1;
run;quit;
This is (I think) a very stupid way of tracking the development of the variable “abb”, which shows what kind of subscription each client has. If, let’s say, free5 = 1, then I know that BP number x had that kind of subscription in week 5.
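One possible alternative to a dummy per week and category, sketched under assumptions: stack the weekly data sets into one long table with a single week variable, so that abb stays one column. The data set name wea_all and the variable week are made-up names, and this has not been checked against the real files.

```sas
/* Sketch only: wea_all and week are hypothetical names. */
data wea_all;
  set wea100716 (in=in1)
      wea100723 (in=in2)
      wea100730 (in=in3)
      wea100806 (in=in4)
      wea100813 (in=in5);
  /* in1-in5 are 1 for rows read from the matching data set, else 0 */
  week = 1*in1 + 2*in2 + 3*in3 + 4*in4 + 5*in5;
run;

/* Subscription type by week in one table, without per-week dummies */
proc freq data=wea_all;
  tables week*abb / norow nocol nopercent;
run;
```

With this layout a new weekly file would only add one more line to the SET statement and one more term to the week expression, rather than a new block of dummy variables.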
One of my goals is to calculate how many unique BP numbers go from ‘CAMPAIGN’ to ‘NORMAL’ between weeks. So I need to match my dummy for “client being there in week one” with “client being there in week two”, and “client having abb = CAMPAIGN in week one” with “client having abb = NORMAL in week two”.
God. Just the sheer thought of it makes me want to quit and be a street musician.
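To make that goal concrete, here is a rough sketch of the kind of result I am after, written as a PROC SQL join between two weekly data sets. It assumes each weekly data set has one row per client and that the BP number is stored in a variable called bp (that name is an assumption, as are the output names trans_1_2, abb_week1, abb_week2 and n_clients).

```sas
/* Sketch only: bp is an assumed variable name for the BP number. */
proc sql;
  create table trans_1_2 as
  select w1.abb as abb_week1,
         w2.abb as abb_week2,
         count(distinct w1.bp) as n_clients
  from wea100716 as w1
       inner join wea100723 as w2
       on w1.bp = w2.bp          /* keeps only clients present in both weeks */
  group by w1.abb, w2.abb;
quit;
```

The row with abb_week1 = 'CAMPAIGN' and abb_week2 = 'NORMAL' would then hold the number of clients who moved from a campaign to a normal subscription between the two weeks. Clients who left or joined drop out of the inner join and would need a separate count.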
Sooooo….
What I would like to do is write some sort of “macro” or “loop” that can recognize the different variables and then run a string of code, perhaps like the one above, for n=1 to n=10. Here is what I imagine it would look like, where n=1 for week one and i is some number I prespecify. Catch my drift?
data wea100813;
set wea100813;
if abb = 'CAMPAIGN' then camp(n+i) = 1;
if abb = 'NORMAL' then normal(n+i) = 1;
if abb = 'FREE' then free(n+i) = 1;
if abb = 'TRIAL' then trial(n+i) = 1;
if abb = 'UNKNOWN' then unknown(n+i) = 1;
if abb = 'BUNDLING' then bund(n+i) = 1;
if abb = 'MEMBER G' then mem(n+i) = 1;
run;quit;
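A minimal sketch of how that idea might be written with the SAS macro language, assuming the week number is simply passed in as a parameter. The macro name add_dummies and its parameters ds and week are made up for illustration.

```sas
/* Sketch only: add_dummies, ds and week are hypothetical names. */
%macro add_dummies(ds=, week=);
  data &ds;
    set &ds;
    /* &week is pasted into each variable name, e.g. camp1, camp2, ... */
    if abb = 'CAMPAIGN' then camp&week    = 1;
    if abb = 'NORMAL'   then normal&week  = 1;
    if abb = 'FREE'     then free&week    = 1;
    if abb = 'TRIAL'    then trial&week   = 1;
    if abb = 'UNKNOWN'  then unknown&week = 1;
    if abb = 'BUNDLING' then bund&week    = 1;
    if abb = 'MEMBER G' then mem&week     = 1;
  run;
%mend add_dummies;

/* One call per weekly data set */
%add_dummies(ds=wea100716, week=1)
%add_dummies(ds=wea100723, week=2)
%add_dummies(ds=wea100730, week=3)
%add_dummies(ds=wea100806, week=4)
%add_dummies(ds=wea100813, week=5)
```

This removes the copy-paste (and the typos that come with it), but it still produces one dummy per week and category, so it does not solve the growth problem by itself.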
I just do not want to end up with 36 week dummies, looking at 10 different variables for EACH week.
Well. I hope you get it.
Here is a link to my SAS code:
http://dl.dropbox.com/u/1321324/Work/WEA.sas
I really hope you guys can give me some pointers to some smart SAS functions that would relieve me of my dummy-variable pain.
Thanks in advance!
/Toby