Solved: Re: Creating a Sequence Count with By Variables and Increasing Count O...

publicSynechism · Posted 05-21-2019 03:50 PM

I have:

data have;
input cid $ acctno $ opendate mmddyy10. hicredit memberid $ qtr mmddyy10.;
datalines;
123	x	1/1/2018	2000	ABC	3/1/2018
123	x	1/1/2018	2000	ABC	6/1/2018
123	x	1/1/2018	2000	ABC	9/1/2018
123	x	1/1/2018	2000	ABC	12/1/2018
123	y	1/1/2018	2300	XYZ	3/1/2019
123	y	1/1/2018	2300	XYZ	6/1/2019
123	y	1/1/2018	2300	XYZ	9/1/2019
123	y	1/1/2018	2300	XYZ	12/1/2019
123	z	1/1/2017	5000	XYZ	3/1/2018
123	z	1/1/2017	5000	XYZ	6/1/2018
123	z	1/1/2017	5000	XYZ	9/1/2018
123	z	1/1/2017	5000	XYZ	12/1/2018
123	z	1/1/2017	5000	XYZ	3/1/2019
123	z	1/1/2017	5000	XYZ	6/1/2019
123	z	1/1/2017	5000	XYZ	9/1/2019
123	z	1/1/2017	5000	XYZ	12/1/2019
123	w	1/1/2017	10000	XYZ	3/1/2018
123	w	1/1/2017	10000	XYZ	6/1/2018
123	w	1/1/2017	10000	XYZ	9/1/2018
123	w	1/1/2017	10000	XYZ	12/1/2018
123	w	1/1/2017	10000	XYZ	3/1/2019
123	w	1/1/2017	10000	XYZ	6/1/2019
123	w	1/1/2017	10000	XYZ	9/1/2019
123	w	1/1/2017	10000	XYZ	12/1/2019
;
run;

proc sort data=have;
by cid opendate qtr;
run;

I want a Sequence Variable, COUNT, BY CID OPENDATE based on increasing QTR values. This will create groups by QTR within CID OPENDATE groups.

For example, in CID OPENDATE group (123 01/01/2017),

QTR == 03/01/2018 => COUNT == 1

QTR == 06/01/2018 => COUNT == 2

etc.

I don't want:

proc sort data=have out=want;
by cid opendate qtr;
run;

data want2;
format opendate mmddyy10. qtr mmddyy10.;
set want;
count + 1;
by cid opendate qtr;
if first.cid or first.opendate or first.qtr then count = 1;
run;

If this can be accomplished by using a RETAIN statement, could you also explain how the RETAIN statement functions to allow this to work?

Thank you.

EDIT: Fixed missing info in code block

noling · Posted 05-21-2019 04:08 PM

This adds +1 for each new quarter BY cid and opendate. Seem right?

The retain statement holds the specified value between processing different records in the dataset. Order of operation is important here: compare values first, THEN assign the current value to the retain variable to check against it in the next pass.

data have;
input cid $ acctno $ opendate mmddyy10. hicredit memberid $ qtr mmddyy10.;
format opendate qtr mmddyy10.;
datalines;
123 x 1/1/2018 2000 ABC 3/1/2018
123 x 1/1/2018 2000 ABC 6/1/2018
123 x 1/1/2018 2000 ABC 9/1/2018
123 x 1/1/2018 2000 ABC 12/1/2018
123 y 1/1/2018 2300 XYZ 3/1/2019
123 y 1/1/2018 2300 XYZ 6/1/2019
123 y 1/1/2018 2300 XYZ 9/1/2019
123 y 1/1/2018 2300 XYZ 12/1/2019
123 z 1/1/2017 5000 XYZ 3/1/2018
123 z 1/1/2017 5000 XYZ 6/1/2018
123 z 1/1/2017 5000 XYZ 9/1/2018
123 z 1/1/2017 5000 XYZ 12/1/2018
123 z 1/1/2017 5000 XYZ 3/1/2019
123 z 1/1/2017 5000 XYZ 6/1/2019
123 z 1/1/2017 5000 XYZ 9/1/2019
123 z 1/1/2017 5000 XYZ 12/1/2019
123 w 1/1/2017 10000 XYZ 3/1/2018
123 w 1/1/2017 10000 XYZ 6/1/2018
123 w 1/1/2017 10000 XYZ 9/1/2018
123 w 1/1/2017 10000 XYZ 12/1/2018
123 w 1/1/2017 10000 XYZ 3/1/2019
123 w 1/1/2017 10000 XYZ 6/1/2019
123 w 1/1/2017 10000 XYZ 9/1/2019
123 w 1/1/2017 10000 XYZ 12/1/2019
;
run;proc sort data=have;by cid opendate qtr;run;

data want;
	set have;
	by cid opendate;
	retain prev_qtr;
	if first.opendate then do;
		count=0;
		call missing(prev_qtr);
	end;
	if qtr ne prev_qtr then count+1;

	*hold value for next record;
	prev_qtr= qtr;
run;

Register today and join us virtually on June 16!
sasglobalforum.com | #SASGF

View now: on-demand content for SAS users

View solution in original post

noling · Posted 05-21-2019 04:08 PM

This adds +1 for each new quarter BY cid and opendate. Seem right?

The retain statement holds the specified value between processing different records in the dataset. Order of operation is important here: compare values first, THEN assign the current value to the retain variable to check against it in the next pass.

data have;
input cid $ acctno $ opendate mmddyy10. hicredit memberid $ qtr mmddyy10.;
format opendate qtr mmddyy10.;
datalines;
123 x 1/1/2018 2000 ABC 3/1/2018
123 x 1/1/2018 2000 ABC 6/1/2018
123 x 1/1/2018 2000 ABC 9/1/2018
123 x 1/1/2018 2000 ABC 12/1/2018
123 y 1/1/2018 2300 XYZ 3/1/2019
123 y 1/1/2018 2300 XYZ 6/1/2019
123 y 1/1/2018 2300 XYZ 9/1/2019
123 y 1/1/2018 2300 XYZ 12/1/2019
123 z 1/1/2017 5000 XYZ 3/1/2018
123 z 1/1/2017 5000 XYZ 6/1/2018
123 z 1/1/2017 5000 XYZ 9/1/2018
123 z 1/1/2017 5000 XYZ 12/1/2018
123 z 1/1/2017 5000 XYZ 3/1/2019
123 z 1/1/2017 5000 XYZ 6/1/2019
123 z 1/1/2017 5000 XYZ 9/1/2019
123 z 1/1/2017 5000 XYZ 12/1/2019
123 w 1/1/2017 10000 XYZ 3/1/2018
123 w 1/1/2017 10000 XYZ 6/1/2018
123 w 1/1/2017 10000 XYZ 9/1/2018
123 w 1/1/2017 10000 XYZ 12/1/2018
123 w 1/1/2017 10000 XYZ 3/1/2019
123 w 1/1/2017 10000 XYZ 6/1/2019
123 w 1/1/2017 10000 XYZ 9/1/2019
123 w 1/1/2017 10000 XYZ 12/1/2019
;
run;proc sort data=have;by cid opendate qtr;run;

data want;
	set have;
	by cid opendate;
	retain prev_qtr;
	if first.opendate then do;
		count=0;
		call missing(prev_qtr);
	end;
	if qtr ne prev_qtr then count+1;

	*hold value for next record;
	prev_qtr= qtr;
run;

Register today and join us virtually on June 16!
sasglobalforum.com | #SASGF

View now: on-demand content for SAS users

publicSynechism · Posted 05-21-2019 04:21 PM

Thank you @noling.

I used this as a workaround.

data have;
input cid $ acctno $ opendate mmddyy10. hicredit memberid $ qtr mmddyy10.;
datalines;
123	x	1/1/2018	2000	ABC	3/1/2018
123	x	1/1/2018	2000	ABC	6/1/2018
123	x	1/1/2018	2000	ABC	9/1/2018
123	x	1/1/2018	2000	ABC	12/1/2018
123	y	1/1/2018	2300	XYZ	3/1/2019
123	y	1/1/2018	2300	XYZ	6/1/2019
123	y	1/1/2018	2300	XYZ	9/1/2019
123	y	1/1/2018	2300	XYZ	12/1/2019
123	z	1/1/2017	5000	XYZ	3/1/2018
123	z	1/1/2017	5000	XYZ	6/1/2018
123	z	1/1/2017	5000	XYZ	9/1/2018
123	z	1/1/2017	5000	XYZ	12/1/2018
123	z	1/1/2017	5000	XYZ	3/1/2019
123	z	1/1/2017	5000	XYZ	6/1/2019
123	z	1/1/2017	5000	XYZ	9/1/2019
123	z	1/1/2017	5000	XYZ	12/1/2019
123	w	1/1/2017	10000	XYZ	3/1/2018
123	w	1/1/2017	10000	XYZ	6/1/2018
123	w	1/1/2017	10000	XYZ	9/1/2018
123	w	1/1/2017	10000	XYZ	12/1/2018
123	w	1/1/2017	10000	XYZ	3/1/2019
123	w	1/1/2017	10000	XYZ	6/1/2019
123	w	1/1/2017	10000	XYZ	9/1/2019
123	w	1/1/2017	10000	XYZ	12/1/2019
;
run;

proc sql;
create table counts as
select distinct cid, opendate, qtr
	from have
order by cid, opendate, qtr
;
quit;

data counts;
format opendate mmddyy10. qtr mmddyy10.;
set counts;
count+1;
by cid opendate qtr;
if first.opendate then count=1;
run;

proc sql;
create table want as
select a.*, b.count
	from have a left join counts b
	on a.cid=b.cid and a.opendate=b.opendate and a.qtr=b.qtr
order by cid, opendate, qtr
;
quit;

But you're code is what I was looking for.

kulbshar · Posted 05-21-2019 04:17 PM

try this.

data test1(drop=acctno hicredit memberid);
set have;
retain count 0;
by cid opendate qtr;
cid_f=first.cid;
cid_l=last.cid;
opendate_f= first.opendate;
opendate_l=last.opendate;
qtr_f= first.qtr;
qtr_l=last.qtr;
if last.qtr=1 then count=count+1;
run;

publicSynechism · Posted 05-21-2019 04:29 PM

This did not produce the results I was looking for, but Noling's provided a solution. Thanks.

kulbshar · Posted 05-21-2019 04:40 PM

Sorry I didn't read your requirement fully. glad you got the result. Thanks.

Tom · Posted 05-21-2019 04:19 PM

I can't understand what the don't want part is in your question. Are you saying you tried that and it didn't get the answer you want?

From the problem description it sounds like you want to number the quarters within the groups defined by CID and OPENDATE.

So something like:

data want;
  set have;
  by cid opendate qtr;
  if first.opendate then count=0;
  count + first.qtr;
run;

Is it possible you have gaps in your data? So that you have one record from 2018Q1 and the next from 2018Q3 but none from 2018Q2? If so do you want the count to go 1,3 instead of 1,2? If so you might want.

data want;
  set have;
  by cid opendate qtr;
  count + intck('qtr',qtr,lag(qtr));
  if first.opendate then count=1;
run;

publicSynechism · Posted 05-21-2019 04:52 PM

I provided it as clarification if my description wasn't clear. Colleagues were confused by what I was trying to do and I received several suggestions producing the "don't want" results.

Correct, your solution is producing the desired results.

Yes, there are gaps in the actual dataset and the second code block may come in handy.

Thanks for sharing @Tom.

Creating a Sequence Count with By Variables and Increasing Count Only when Value Changes in Target

Re: Creating a Sequence Count with By Variables and Increasing Count Only when Value Changes in Targ

Re: Creating a Sequence Count with By Variables and Increasing Count Only when Value Changes in Targ

Re: Creating a Sequence Count with By Variables and Increasing Count Only when Value Changes in Targ

Re: Creating a Sequence Count with By Variables and Increasing Count Only when Value Changes in Targ

Re: Creating a Sequence Count with By Variables and Increasing Count Only when Value Changes in Targ

Re: Creating a Sequence Count with By Variables and Increasing Count Only when Value Changes in Targ

Re: Creating a Sequence Count with By Variables and Increasing Count Only when Value Changes in Targ

Re: Creating a Sequence Count with By Variables and Increasing Count Only when Value Changes in Targ

SAS Innovate 2025: Call for Content

Classroom Training Available!