BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sameer112217
Quartz | Level 8

Hello

 

Suppose I have three variables... account no.... month and transaction amount

 

how can i identify if the person has not made any transaction for more than three months

 

account no

 

i am aware of the formula

 

max(of jan2016-dec2016) eq 0 then broken.. that is one month

 

something like this ... if it finds 0 then give 1 values and increment it by 1 and if it is more than 3 then broken

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Some more fun-

 

data account;

input account $ transactionamount    month $;

datalines;

FORD   100                           JAN

FORD   100                           FEB

FORD   100                           MAR

FORD   .                               APR

FORD   .                               MAY

FORD   .                              JUN

FORD   100                           JUL

FORD   100                           AUG

FORD   100                           SEP

FORD   100                           OCT

FORD   100                           NOV

FORD   100                           DEC

TOY      100                           JAN

TOY   100                           FEB

TOY   100                           MAR

TOY   100                           APR

TOY   100                           MAY

TOY   100                          JUN

TOY   100                           JUL

TOY   100                           AUG

TOY   100                           SEP

TOY   100                           OCT

TOY   100                           NOV

TOY   100                           DEC

;

 

data want;

 declare hash myhash();

    myhash.definekey('account','transactionamount');

    myhash.definedata('remark');

    myhash.definedone();

do until(last.account);

do _n_=1 by 1 until(mod(_n_,3)=0);

  set account;

by account notsorted;

if _n_=1 then temp=0;

temp+transactionamount;

if _n_=3 and temp=0 then do;

remark='broken';

myhash.add();

end;

else call missing(remark);

end;

end;

do until(last.account);

set account;

by account notsorted;

if myhash.find() ne 0 then call missing(remark);

output;

end;

drop temp;

run;

 

Regards,

Naveen Srinivasan

View solution in original post

14 REPLIES 14
novinosrin
Tourmaline | Level 20

Please post a sample of your input data and a small sample of output data( exactly how you want). Transaction data is a piece of cake. Once I have your req in figures, I will give you the code. It also helps others too. Thanks!

ChrisBrooks
Ammonite | Level 13

I believe you're overthinking the problem - if you sort your data by accountno and month then, in a data step, use the lag function to look back one, two and three periods by accountno then you'll find it much easier. I suspect your solution would run into multiple complexities with a rolling three month span.

 

One thing to be careful of with lag() though is - "don't use it inside conditional statements" or you'll get incorrect results.

 

If you post a datastep to create some sample data and some desired output we can give you a more detailed answer.

sameer112217
Quartz | Level 8

Sample dataset

 

data account;

account  transactionamount    month

FORD   100                           JAN

FORD   100                           FEB

FORD   100                           MAR

FORD   .                               APR

FORD   .                               MAY

FORD   .                              JUN

FORD   100                           JUL

FORD   100                           AUG

FORD   100                           SEP

FORD   100                           OCT

FORD   100                           NOV

FORD   100                           DEC

TOY      100                           JAN

TOY   100                           FEB

TOY   100                           MAR

TOY   100                           APR

TOY   100                           MAY

TOY   100                          JUN

TOY   100                           JUL

TOY   100                           AUG

TOY   100                           SEP

TOY   100                           OCT

TOY   100                           NOV

TOY   100                           DEC

 

I cannot officialy post work dataset but something like this... suppose if a customer has not made transaction for three consecutive months it needs to identified and in new column remark should mention.... broken

 

can someone help me with code. thanks everyone

 

 

 

novinosrin
Tourmaline | Level 20

data account;

input account $ transactionamount    month $;

datalines;

FORD   100                           JAN

FORD   100                           FEB

FORD   100                           MAR

FORD   .                               APR

FORD   .                               MAY

FORD   .                              JUN

FORD   100                           JUL

FORD   100                           AUG

FORD   100                           SEP

FORD   100                           OCT

FORD   100                           NOV

FORD   100                           DEC

TOY      100                           JAN

TOY   100                           FEB

TOY   100                           MAR

TOY   100                           APR

TOY   100                           MAY

TOY   100                          JUN

TOY   100                           JUL

TOY   100                           AUG

TOY   100                           SEP

TOY   100                           OCT

TOY   100                           NOV

TOY   100                           DEC

;

 

 

 

data want;

do until(last.account);

do _n_=1 by 1 until(mod(_n_,3)=0);

if _n_=1 then temp=0;

set account;

by account notsorted;

temp+transactionamount;

if _n_=3 and temp=0 then remark='broken';

else call missing(remark);

output;

end;

end;

drop temp;

run;

 

 

novinosrin
Tourmaline | Level 20

Some more fun-

 

data account;

input account $ transactionamount    month $;

datalines;

FORD   100                           JAN

FORD   100                           FEB

FORD   100                           MAR

FORD   .                               APR

FORD   .                               MAY

FORD   .                              JUN

FORD   100                           JUL

FORD   100                           AUG

FORD   100                           SEP

FORD   100                           OCT

FORD   100                           NOV

FORD   100                           DEC

TOY      100                           JAN

TOY   100                           FEB

TOY   100                           MAR

TOY   100                           APR

TOY   100                           MAY

TOY   100                          JUN

TOY   100                           JUL

TOY   100                           AUG

TOY   100                           SEP

TOY   100                           OCT

TOY   100                           NOV

TOY   100                           DEC

;

 

data want;

 declare hash myhash();

    myhash.definekey('account','transactionamount');

    myhash.definedata('remark');

    myhash.definedone();

do until(last.account);

do _n_=1 by 1 until(mod(_n_,3)=0);

  set account;

by account notsorted;

if _n_=1 then temp=0;

temp+transactionamount;

if _n_=3 and temp=0 then do;

remark='broken';

myhash.add();

end;

else call missing(remark);

end;

end;

do until(last.account);

set account;

by account notsorted;

if myhash.find() ne 0 then call missing(remark);

output;

end;

drop temp;

run;

 

Regards,

Naveen Srinivasan

ballardw
Super User

Is you data absolutely always going to be from within a single calendar year? If not you will need a year component to compare such things and an actual SAS date value would be better.

Kurt_Bremser
Super User

Use lag(), as @ChrisBrooks suggested, and a retained counter to prevent a false alarm in the first 2 obs of an account.

- do a data step with by account;

- at first account, set counter to 1, else increment by 1

- retrieve previous two values by using lag(transactionamount) and lag2(transactionamount)

- if counter > 2 and transactionamount and all the lagged values are missing, set remark to 'broken'

sameer112217
Quartz | Level 8
It would be great you can write the code as i never used lag statement
ChrisBrooks
Ammonite | Level 13

I think this should do the trick - note that if FORD for July is missing then flag will also equal 1 for that month - you'll need to amend the code if that isn't what you want

 

data have;
input account $ transactionamount    month $;
datalines;
FORD   100                           JAN
FORD   100                           FEB
FORD   100                           MAR
FORD   .                               APR
FORD   .                               MAY
FORD   .                              JUN
FORD   100                           JUL
FORD   100                           AUG
FORD   100                           SEP
FORD   100                           OCT
FORD   100                           NOV
FORD   100                           DEC
TOY      100                           JAN
TOY   100                           FEB
TOY   100                           MAR
TOY   100                           APR
TOY   100                           MAY
TOY   100                          JUN
TOY   100                           JUL
TOY   100                           AUG
TOY   100                           SEP
TOY   100                           OCT
TOY   100                           NOV
TOY   100                           DEC
;
run;

proc sql;
	create table have_sorted
	as select *
	from have
	order by account,
	case
		when month="JAN" then 1
		when month="FEB" then 2
		when month="MAR" then 3
		when month="APR" then 4
		when month="MAY" then 5
		when month="JUN" then 6
		when month="JUL" then 7
		when month="AUG" then 8
		when month="SEP" then 9
		when month="OCT" then 10
		when month="NOV" then 11
		when month="DEC" then 12
		else 99
	end
	;
quit;

data want;
	set have_sorted;
	by account;
	if first.account then counter=1;
	else counter=counter+1;
	
	lag1=lag1(transactionamount);
	lag2=lag2(transactionamount);
	if missing(transactionamount) and missing(lag1) and missing(lag2) and counter>2 then flag=1;
	else flag=0;
	
	retain counter;
run;
ChrisNZ
Tourmaline | Level 20

Like this?


data WANT;
  set ACCOUNT;
  if      TRANSACTIONAMOUNT =. 
   & lag (TRANSACTIONAMOUNT)=. 
   & lag2(TRANSACTIONAMOUNT)=. 
   & _N_>3
   & lag (ACCOUNT) = ACCOUNT
   & lag2(ACCOUNT) = ACCOUNT
   then FLAG='Broken';
proc print noobs; run;

 

ACCOUNT TRANSACTIONAMOUNT MONTH FLAG
FORD 100 JAN  
FORD 100 FEB  
FORD 100 MAR  
FORD   APR  
FORD   MAY  
FORD   JUN Broken
FORD 100 JUL  
FORD 100 AUG  
FORD 100 SEP  
FORD 100 OCT  
FORD 100 NOV  
FORD 100 DEC  
TOY 100 JAN  
TOY 100 FEB  
TOY 100 MAR  
TOY 100 APR  
TOY 100 MAY  
TOY 100 JUN  
TOY 100 JUL  
TOY 100 AUG  
TOY 100 SEP  
TOY 100 OCT  
TOY 100 NOV  
TOY 100 DEC  

 

sameer112217
Quartz | Level 8
Data is not for single calender year but for last 3 years. By the way all replies are high class and helpful and i tested each reply from every user gave me new insight
ballardw
Super User

@sameer112217 wrote:
Data is not for single calender year but for last 3 years. By the way all replies are high class and helpful and i tested each reply from every user gave me new insight

Then you will need the year variable to sort by within the products as well as the numeric month.

Ksharp
Super User
The following gave you all the number of adjacent missing value.
It is easy to pick up the number greater than three.



 data have;
input account $ transactionamount    month $;
datalines;
FORD   100                           JAN
FORD   100                           FEB
FORD   100                           MAR
FORD   .                               APR
FORD   .                               MAY
FORD   .                              JUN
FORD   100                           JUL
FORD   100                           AUG
FORD   100                           SEP
FORD   100                           OCT
FORD   100                           NOV
FORD   100                           DEC
TOY      100                           JAN
TOY   100                           FEB
TOY   100                           MAR
TOY   100                           APR
TOY   100                           MAY
TOY   100                          JUN
TOY   100                           JUL
TOY   100                           AUG
TOY   100                           SEP
TOY   100                           OCT
TOY   100                           NOV
TOY   100                           DEC
;
run;
data temp;
 set have;
 by account transactionamount notsorted;
 group+first.transactionamount;
run;
proc sql;
create table want as
 select account,group,count(*) as n_missing
  from temp
    where transactionamount is missing
       group by account,group;
quit;


sameer112217
Quartz | Level 8

Thank you everyone. Each reply and solution was unique and high quality material

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 2010 views
  • 2 likes
  • 7 in conversation