BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
That____Redhead
Fluorite | Level 6

Hello,

I recently started on a new project, and the people who have been working on this so far have the main data table in quite a mess.  Basically, we have about 3M unique customers with a contract ID (hereinafter referred to as acctnum).  Then we have usage data, about 30 variables, for every month of the past 12 months.  Every variable, every month, in a different variable.  Yikes.

If I'm not doing the best way of describing this, the table looks somewhat like:

Acctnum..........Rate_Plan_201201..........Revenue_201201..........Metric1_201202..........Metric2_201201..........Rate_Plan_201202..........Revenue_201202..........Metric1_201202..........Metric2_201202......

And so on and so forth.  Almost 600 variables.  Headache.

I want to transform this into a vertical table with a Month Period Key identifier, as every relational database I've ever touched is designed.

First off, could someone confirm that this is, indeed, the correct way to set up these tables?  (There are quite a few like this, but I just want to start with this one.)

Second off, I have a code similar to the following:

DATA New (Keep=Acctnum Rate_Plan Revenue Metric1 Metric2......);

     SET Old;

     IF Rate_Plan_201201 ^= '' THEN DO;

          MPK          = 133; /*Number of months since January, 2000*/

          Rate_Plan = Rate_Plan_201201;

          Revenue    = Revenue_201201;

          Metric1      = Metric1_201201;

          Metric2      = Metric2_201201;

     END;

     IF Rate_Plan_201202 ^= '' THEN DO;

          MPK          = 134; /*Number of months since January, 2000*/

          Rate_Plan = Rate_Plan_201202;

          Revenue    = Revenue_201202;

          Metric1      = Metric1_201202;

          Metric2      = Metric2_201202;

     END;

RUN;

What's happening is when I move to the next hoping to create a new MPK it is overwriting, only allowing the latest data to be written.  Can someone help me figure out how to do this??  Thanks!!!

Thanks so much everyone!

--Russell        

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

This is a somewhat elaborate transposition problem. The key is to ttranspose numeric and character variables separately and to get the MPK variable from the input variable names. Here is how it can be done :

data test;
input Acctnum Rate_Plan_201201 $ Revenue_201201 Metric1_201201 Metric2_201201
              Rate_Plan_201202 $ Revenue_201202 Metric1_201202 Metric2_201202;
datalines;
1 a 1 2 3 c 5 6 7
2 b 2 3 4 c 6 5 4
3 b 4 5 6 d 3 6 4
;

proc transpose data=test out=testc;
by acctnum notsorted;
var _character_;
run;

proc transpose data=test out=testn;
by acctnum notsorted;
var _numeric_;
run;

data testcc;
set testc;
mpk = intck("MONTH",'01JAN2000'd,input(scan(_name_,-1,"_"),yymmn6.));
var = substr(_name_,1,length(trim(_name_))-7);
drop _name_;
run;

data testnn;
set testn(where=(upcase(_name_) ne "ACCTNUM"));
mpk = intck("MONTH",'01JAN2000'd,input(scan(_name_,-1,"_"),yymmn6.));
var = substr(_name_,1,length(trim(_name_))-7);
drop _name_;
run;

proc sort data=testcc; by acctnum mpk var; run;
proc sort data=testnn; by acctnum mpk var; run;

proc transpose data=testcc out=testccc(drop=_name_);
by acctnum mpk;
var col1;
id var;
run;

proc transpose data=testnn out=testnnn(drop=_name_);
by acctnum mpk;
var col1;
id var;
run;

data want;
merge testccc testnnn;
by acctnum mpk;

if not missing(Rate_Plan);
run;

PG

PG

View solution in original post

2 REPLIES 2
PGStats
Opal | Level 21

This is a somewhat elaborate transposition problem. The key is to ttranspose numeric and character variables separately and to get the MPK variable from the input variable names. Here is how it can be done :

data test;
input Acctnum Rate_Plan_201201 $ Revenue_201201 Metric1_201201 Metric2_201201
              Rate_Plan_201202 $ Revenue_201202 Metric1_201202 Metric2_201202;
datalines;
1 a 1 2 3 c 5 6 7
2 b 2 3 4 c 6 5 4
3 b 4 5 6 d 3 6 4
;

proc transpose data=test out=testc;
by acctnum notsorted;
var _character_;
run;

proc transpose data=test out=testn;
by acctnum notsorted;
var _numeric_;
run;

data testcc;
set testc;
mpk = intck("MONTH",'01JAN2000'd,input(scan(_name_,-1,"_"),yymmn6.));
var = substr(_name_,1,length(trim(_name_))-7);
drop _name_;
run;

data testnn;
set testn(where=(upcase(_name_) ne "ACCTNUM"));
mpk = intck("MONTH",'01JAN2000'd,input(scan(_name_,-1,"_"),yymmn6.));
var = substr(_name_,1,length(trim(_name_))-7);
drop _name_;
run;

proc sort data=testcc; by acctnum mpk var; run;
proc sort data=testnn; by acctnum mpk var; run;

proc transpose data=testcc out=testccc(drop=_name_);
by acctnum mpk;
var col1;
id var;
run;

proc transpose data=testnn out=testnnn(drop=_name_);
by acctnum mpk;
var col1;
id var;
run;

data want;
merge testccc testnnn;
by acctnum mpk;

if not missing(Rate_Plan);
run;

PG

PG
That____Redhead
Fluorite | Level 6

Thank you!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to choose a machine learning algorithm

Use this tutorial as a handy guide to weigh the pros and cons of these commonly used machine learning algorithms.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 883 views
  • 0 likes
  • 2 in conversation