BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
runrunrunbunny
Calcite | Level 5

HI

Does anyone know how to populate data to a newly added column?

Thanks in advance!

 

Bunny

 

Screen Shot 2019-12-05 at 1.22.33 AM.png

1 ACCEPTED SOLUTION

Accepted Solutions
tomrvincent
Rhodochrosite | Level 12
I created a calendar dataset with one record per day (because a table like that is useful in so many ways) and then I can join to it using start_date and cancel_date.

View solution in original post

9 REPLIES 9
Jagadishkatam
Amethyst | Level 16

Please provide more details on how you want to derive the new variable we need to use the existing variables to derive the new column.

Thanks,
Jag
runrunrunbunny
Calcite | Level 5

HI Jag, I use Macro to get six datasets and their number of records are 3243, 4904, 4904, 3723, 4904,4904. What I need to do is populate these values to a new variable called "new_column" based on same month and year. After that, I will derive a new variable by using variable CNT_terminate and new_column. But the thing is how to fill those values 3243, 4904, 4904, 3723, 4904,4904 into the variable new_column? Thanks a lot!

 

Screen Shot 2019-12-05 at 10.25.08 AM.pngScreen Shot 2019-12-05 at 10.23.48 AM.png

ballardw
Super User

Please post code by copying from the editor and pasting into a code box opened on the forum with either the {I} or "running man" icon. That way we don't have to retype your code from scratch. (Hint: anything more than a very few lines are likely not to have anyone on this forum with the time or desire to do so.).

 

This should work

%Macro active_account(cancel_date, active_customer);
   data work.___temp;
      set sd.active_1;
      if start_date le &cancel_date.;
   run;

   data &active_customer;
      set work.___temp nobs=nn;
      new_column=nn;
   run;
%mend;

Any process that involves separating data into multiple sets is always suspect for reliability and scalability.

You might be better off describing how this 'output' is used as you do not actually show any code that could possibly create the actual output you show.

runrunrunbunny
Calcite | Level 5

Thank you so much for you reply! 

sd.Tel_data is a whole dataset including both active and cancelled customers.Like this:

 

customer_id      start_date            cancel_date

8661                   29APR2002          18APR2005

8662                   20APR2000                            .

8663                   10May2003          18OCT2006

 

 

 

sd.active only contains active customers' information. Like this:

customer_id      start_date            cancel_date

8662                   20APR2000                            .

 

 

The code blow is to figure out number of active customers each month:

 

data  sd.active;

set sd.Tel_data  ;

if cancel_date EQ missing; 

run;

%macro active_account(input,output);

data &output;

set sd.active;

if start_date LE  &input;

run; 

%mend;

%active_account('30APR2004'D, sd.active_APR2004);

%active_account('30APR2005'D, sd.active_APR2005);

%active_account('30APR2006'D, sd.active_APR2006);

%active_account('30AUG2004'D, sd.active_AUG2004);

%active_account('30AUG2005'D, sd.active_AUG2005);

%active_account('30AUG2006'D, sd.active_AUG2006);

 

The dataset below called sd.Temo which contains information of cancelled customers. This dataset has variables like customer_id, start_date, cancel_date. (sd.Temo is also subsetted from whole dataset sd.Tel_data)

 

customer_id      start_date            cancel_date        CNT_terminate

8661                   29APR2002          18APR2005             154

8663                   10May2003          18OCT2006             142

 

 

What I need to do is to derive a new variable called "CNT_active " which means number of active customers each month in dataset  sd.Temo, and populate the results of active customers each month I got from Macro above to new column CNT_active. Like this:

customer_id      start_date            cancel_date        CNT_terminate     CNT_active

8661                   29APR2002          18APR2005             154                       3733

8663                   10May2003          18OCT2006             142                       4890

 

sd.active and sd.Temo don't have any common key variables such as customer_id because one table is for active customers and the other is for cancelled customers, they are not overlapping. 

Could you please give me some advice? Thanks a lot!!

 

 

ballardw
Super User

Can you describe or show how you intend to use that data? Or what the question to answer with this process might be?

 

It really doesn't make a lot of sense to me why you are attaching a count of all active customers to each customer record of inactive customers?

 

runrunrunbunny
Calcite | Level 5

Hi ballardw, my purpose is calculate churn rate using CNT_terminate divided  by CNT_active if these two variables are in a same dataset. Like this:

customer_id      start_date            cancel_date        CNT_terminate     CNT_active

8661                29APR2002          30APR2004             154                        3243

8663                10May2003          30APR2005              142                       4904

runrunrunbunny
Calcite | Level 5

Hi, here's the description of my question. Thanks in advance everyone!!!

 

 

 

sd.Tel_data is a whole dataset including information of both active and cancelled customers.Like this:

 

customer_id      start_date            cancel_date

8661                   29APR2002          30APR2004

8662                   20APR2000                            .

8663                   10May2003          30APR2005

 

 

 

sd.active is subsetted from sd.Tel_data, which only contains active customers' information. Like this:

customer_id      start_date            cancel_date

8662                   20APR2000                            .

 

 

The code blow is to figure out number of active customers each month:

 

data  sd.active;

set sd.Tel_data  ;

if cancel_date EQ missing; 

run;

%macro active_account(input, output);

data &output;

set sd.active;

if start_date LE  &input;

run; 

%mend;

%active_account('30APR2004'D, sd.active_APR2004);

%active_account('30APR2005'D, sd.active_APR2005);

%active_account('30APR2006'D, sd.active_APR2006);

%active_account('30AUG2004'D, sd.active_AUG2004);

%active_account('30AUG2005'D, sd.active_AUG2005);

%active_account('30AUG2006'D, sd.active_AUG2006);

  

 

 

 

The dataset below called sd.Temo which contains information of cancelled customers. This dataset has variables like customer_id, start_date, cancel_date. (sd.Temo is also subsetted from whole dataset sd.Tel_data)

 

 

customer_id      start_date            cancel_date        CNT_terminate

8661                   29APR2002          30APR2004             154

8663                   10May2003          30APR2005              142

 

 

What I need to do is to derive a new variable called CNT_active which means number of active customers each month in dataset  sd.Temo, and populate the results of number of active customers each month I got from Macro above to new column CNT_active. My purpose is calculate churn rate using CNT_terminate divided  by CNT_active if these two variables are in a same dataset. Like this:

customer_id      start_date            cancel_date        CNT_terminate     CNT_active

8661                29APR2002          30APR2004             154                        3243

8663                10May2003          30APR2005              142                       4904

 

 

sd.active and sd.Temo don't have any common key variables such as customer_id because one table is for active customers and the other is for cancelled customers, they are not overlapping. 

But how to derive the new variable CNT_active  in dataset sd.Temo and populate the results from Macro which are 3243, 4904, 4904,3723, 4904, 4904 into variable CNT_active based on same month and year?

 

Here's the results from Macro above:

Screen Shot 2019-12-05 at 1.54.11 PM.png 

tomrvincent
Rhodochrosite | Level 12
I created a calendar dataset with one record per day (because a table like that is useful in so many ways) and then I can join to it using start_date and cancel_date.
Reeza
Super User
Look at the vtable table in sashelp dictionary. It has the Ns you want, so if you use a naming convention in your data set names you can easily pull that out. However, I would consider rewriting this from scratch to actually calculate what you want, which is easier.

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2617 views
  • 0 likes
  • 5 in conversation