BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
AshokD
Obsidian | Level 7

Hi,

 

I've a sample input data below and the expected output format.

I need to find number of active customers as per the start date of the year and their based on end date, I've to keep the count added for each year. 

Example,

Customer A,H,I - Active from 2004, so we have 3 customers active from 2004 and I'm keeping the count added horizontally for each year and by 2009 one customer got dropped. So for 2010 - customer count was 2.

Similarly I've to populate the same for the next row active from  2005 and followed by  other rows.

Could you please suggest me the best approach to get this done ?

 

Thanks in advance for your help.

 

input data.PNG

 

 

 Output data.PNG

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

As you can see in the script I've shared it's just about creating the data that's best suited for the Proc you want to use it. 

I don't understand how you would get to the count of 2 in the highlighted cell - what it means - and though can't create the prep data step logic for it.

Patrick_0-1714818300113.png

 

See if below code gives you sufficient pointers to prepare the data the way you need it.

data have;
  input customer$ start_date:ddmmyy10. end_date:ddmmyy10.;
  format start_date end_date ddmmyy10.;
  datalines;
A 01/01/2004 31/12/9999
B 01/01/2005 01/01/2009
C 01/01/2005 31/12/9999
D 01/01/2006 01/01/2009
E 01/01/2006 31/12/9999
F 01/01/2007 01/01/2008
G 01/01/2007 31/12/9999
H 01/01/2004 01/01/2009
I 01/01/2004 31/12/9999
J 01/01/2008 01/01/2010
K 01/01/2008 31/12/9999
;

%let stop_year=%sysfunc(year(%sysfunc(today())));
%let stop_year=%sysfunc(year(%sysfunc(inputn(31jan2010,date9.))));
data prep;
  set have;
  start_year=year(start_date);
  do i=0 to intck('year',start_date,end_date);
    active_year=year(intnx('year',start_date,i));
    if active_year>&stop_year then leave;
    output;
  end;
  keep customer start_year active_year;
run;

data layout;
  do start_year=2004 to &stop_year;
    do active_year=2004 to &stop_year;
      output;
    end;
  end;
run;

proc tabulate data=prep classdata=layout;
  class start_year active_year;
  keylabel n=' ';
  table start_year, active_year;
run;

Patrick_1-1714818450318.png

 

 

View solution in original post

4 REPLIES 4
Patrick
Opal | Level 21

Going forward please provide data as text and not as screenshot. What's the most appreciated is if you provide sample data via a tested SAS script that creates the data similar to below data have step.

 

One way to go:

Create a table with one row per customer and year which then is very easy to use with Proc Report or Proc Tabulate.  

data have;
  input customer$ start_date:ddmmyy10. end_date:ddmmyy10.;
  format start_date end_date ddmmyy10.;
  datalines;
A 01/01/2004 31/12/2004
B 01/01/2005 31/12/2009
C 01/01/2004 31/12/9999
;

%let stop_year=%sysfunc(year(%sysfunc(today())));
data prep;
  set have;
  do i=0 to intck('year',start_date,end_date);
    year=year(intnx('year',start_date,i));
    if year>&stop_year then leave;
    year2=year;
    output;
  end;
  keep customer year year2;
run;

proc tabulate data=prep;
  class year year2;
  keylabel n=' ';
  table year, year2;
run;
AshokD
Obsidian | Level 7

@Patrick 

Thank you for sharing the script.

I've tried with the below sample data  but it looks not as expected. For example, customers who has start date with 2004 and they should be tracked within first row but the below output shows diagonally and also it is getting added in the next row along with 2005 and other years.

I've posted the expected output vs actual output. Could you please have a look and share your thoughts.

 

data have;
input customer$ start_date:ddmmyy10. end_date:ddmmyy10.;
format start_date end_date ddmmyy10.;
datalines;
A 01/01/2004 31/12/9999
B 01/01/2005 01/01/2009
C 01/01/2005 31/12/9999
D 01/01/2006 01/01/2009
E 01/01/2006 31/12/9999
F 01/01/2007 01/01/2008
G 01/01/2007 31/12/9999
H 01/01/2004 01/01/2009
I 01/01/2004 31/12/9999
J 01/01/2008 01/01/2010
K 01/01/2008 31/12/9999
;

%let stop_year=%sysfunc(year(%sysfunc(today())));
data prep;
set have;
do i=0 to intck('year',start_date,end_date);
year=year(intnx('year',start_date,i));
if year>&stop_year then leave;
year2=year;
output;
end;
keep customer year year2;
run;

proc tabulate data=prep;
class year year2;
keylabel n=' ';
table year, year2;
run;

Actual Output:-

Output data1.PNG

 

Expected Output:-

Output data.PNG

Patrick
Opal | Level 21

As you can see in the script I've shared it's just about creating the data that's best suited for the Proc you want to use it. 

I don't understand how you would get to the count of 2 in the highlighted cell - what it means - and though can't create the prep data step logic for it.

Patrick_0-1714818300113.png

 

See if below code gives you sufficient pointers to prepare the data the way you need it.

data have;
  input customer$ start_date:ddmmyy10. end_date:ddmmyy10.;
  format start_date end_date ddmmyy10.;
  datalines;
A 01/01/2004 31/12/9999
B 01/01/2005 01/01/2009
C 01/01/2005 31/12/9999
D 01/01/2006 01/01/2009
E 01/01/2006 31/12/9999
F 01/01/2007 01/01/2008
G 01/01/2007 31/12/9999
H 01/01/2004 01/01/2009
I 01/01/2004 31/12/9999
J 01/01/2008 01/01/2010
K 01/01/2008 31/12/9999
;

%let stop_year=%sysfunc(year(%sysfunc(today())));
%let stop_year=%sysfunc(year(%sysfunc(inputn(31jan2010,date9.))));
data prep;
  set have;
  start_year=year(start_date);
  do i=0 to intck('year',start_date,end_date);
    active_year=year(intnx('year',start_date,i));
    if active_year>&stop_year then leave;
    output;
  end;
  keep customer start_year active_year;
run;

data layout;
  do start_year=2004 to &stop_year;
    do active_year=2004 to &stop_year;
      output;
    end;
  end;
run;

proc tabulate data=prep classdata=layout;
  class start_year active_year;
  keylabel n=' ';
  table start_year, active_year;
run;

Patrick_1-1714818450318.png

 

 

AshokD
Obsidian | Level 7

@Patrick 

Apologies, You're right. 

I've updated the above highlighted cell wrongly. Now the data looks good and the logic works.

Thank you so much for your help.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 671 views
  • 0 likes
  • 2 in conversation