Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

Restructuring a table

Accepted Solution Solved
Reply
Contributor
Posts: 25
Accepted Solution

Restructuring a table

Hello,

I recently started on a new project, and the people who have been working on this so far have the main data table in quite a mess.  Basically, we have about 3M unique customers with a contract ID (hereinafter referred to as acctnum).  Then we have usage data, about 30 variables, for every month of the past 12 months.  Every variable, every month, in a different variable.  Yikes.

If I'm not doing the best way of describing this, the table looks somewhat like:

Acctnum..........Rate_Plan_201201..........Revenue_201201..........Metric1_201202..........Metric2_201201..........Rate_Plan_201202..........Revenue_201202..........Metric1_201202..........Metric2_201202......

And so on and so forth.  Almost 600 variables.  Headache.

I want to transform this into a vertical table with a Month Period Key identifier, as every relational database I've ever touched is designed.

First off, could someone confirm that this is, indeed, the correct way to set up these tables?  (There are quite a few like this, but I just want to start with this one.)

Second off, I have a code similar to the following:

DATA New (Keep=Acctnum Rate_Plan Revenue Metric1 Metric2......);

     SET Old;

     IF Rate_Plan_201201 ^= '' THEN DO;

          MPK          = 133; /*Number of months since January, 2000*/

          Rate_Plan = Rate_Plan_201201;

          Revenue    = Revenue_201201;

          Metric1      = Metric1_201201;

          Metric2      = Metric2_201201;

     END;

     IF Rate_Plan_201202 ^= '' THEN DO;

          MPK          = 134; /*Number of months since January, 2000*/

          Rate_Plan = Rate_Plan_201202;

          Revenue    = Revenue_201202;

          Metric1      = Metric1_201202;

          Metric2      = Metric2_201202;

     END;

RUN;

What's happening is when I move to the next hoping to create a new MPK it is overwriting, only allowing the latest data to be written.  Can someone help me figure out how to do this??  Thanks!!!

Thanks so much everyone!

--Russell        


Accepted Solutions
Solution
‎09-10-2012 06:27 PM
Respected Advisor
Posts: 4,646

Re: Restructuring a table

This is a somewhat elaborate transposition problem. The key is to ttranspose numeric and character variables separately and to get the MPK variable from the input variable names. Here is how it can be done :

data test;
input Acctnum Rate_Plan_201201 $ Revenue_201201 Metric1_201201 Metric2_201201
              Rate_Plan_201202 $ Revenue_201202 Metric1_201202 Metric2_201202;
datalines;
1 a 1 2 3 c 5 6 7
2 b 2 3 4 c 6 5 4
3 b 4 5 6 d 3 6 4
;

proc transpose data=test out=testc;
by acctnum notsorted;
var _character_;
run;

proc transpose data=test out=testn;
by acctnum notsorted;
var _numeric_;
run;

data testcc;
set testc;
mpk = intck("MONTH",'01JAN2000'd,input(scan(_name_,-1,"_"),yymmn6.));
var = substr(_name_,1,length(trim(_name_))-7);
drop _name_;
run;

data testnn;
set testn(where=(upcase(_name_) ne "ACCTNUM"));
mpk = intck("MONTH",'01JAN2000'd,input(scan(_name_,-1,"_"),yymmn6.));
var = substr(_name_,1,length(trim(_name_))-7);
drop _name_;
run;

proc sort data=testcc; by acctnum mpk var; run;
proc sort data=testnn; by acctnum mpk var; run;

proc transpose data=testcc out=testccc(drop=_name_);
by acctnum mpk;
var col1;
id var;
run;

proc transpose data=testnn out=testnnn(drop=_name_);
by acctnum mpk;
var col1;
id var;
run;

data want;
merge testccc testnnn;
by acctnum mpk;

if not missing(Rate_Plan);
run;

PG

PG

View solution in original post


All Replies
Solution
‎09-10-2012 06:27 PM
Respected Advisor
Posts: 4,646

Re: Restructuring a table

This is a somewhat elaborate transposition problem. The key is to ttranspose numeric and character variables separately and to get the MPK variable from the input variable names. Here is how it can be done :

data test;
input Acctnum Rate_Plan_201201 $ Revenue_201201 Metric1_201201 Metric2_201201
              Rate_Plan_201202 $ Revenue_201202 Metric1_201202 Metric2_201202;
datalines;
1 a 1 2 3 c 5 6 7
2 b 2 3 4 c 6 5 4
3 b 4 5 6 d 3 6 4
;

proc transpose data=test out=testc;
by acctnum notsorted;
var _character_;
run;

proc transpose data=test out=testn;
by acctnum notsorted;
var _numeric_;
run;

data testcc;
set testc;
mpk = intck("MONTH",'01JAN2000'd,input(scan(_name_,-1,"_"),yymmn6.));
var = substr(_name_,1,length(trim(_name_))-7);
drop _name_;
run;

data testnn;
set testn(where=(upcase(_name_) ne "ACCTNUM"));
mpk = intck("MONTH",'01JAN2000'd,input(scan(_name_,-1,"_"),yymmn6.));
var = substr(_name_,1,length(trim(_name_))-7);
drop _name_;
run;

proc sort data=testcc; by acctnum mpk var; run;
proc sort data=testnn; by acctnum mpk var; run;

proc transpose data=testcc out=testccc(drop=_name_);
by acctnum mpk;
var col1;
id var;
run;

proc transpose data=testnn out=testnnn(drop=_name_);
by acctnum mpk;
var col1;
id var;
run;

data want;
merge testccc testnnn;
by acctnum mpk;

if not missing(Rate_Plan);
run;

PG

PG
Contributor
Posts: 25

Re: Restructuring a table

Thank you!

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 252 views
  • 0 likes
  • 2 in conversation