BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
rahulsaha2127
Fluorite | Level 6

Hi,

I have the following data:

UserDateTransaction Amount
A10-Aug-21120
A11-Aug-21110
A15-Aug-21600
A20-Aug-21585
A21-Aug-21130
B7-Aug-21300
B8-Aug-21350
B8-Aug-211450
B9-Aug-2110000

 

I want to flag those users who does a transaction with value greater than 5 times the average of all the last transactions and remove these anomalous transactions from the average calculation in the next step.

 

So my table would look like:

 

UserDateTransaction AmountAverageFlag
A10-Aug-2112000
A11-Aug-211101200
A15-Aug-216001151
A20-Aug-215851151
A21-Aug-211301150
B7-Aug-2130000
B8-Aug-213503000
B8-Aug-2114503250
B9-Aug-211100021001

 

So here, For each transaction of an user,we need to look at the average of the last transactions and check if the current transaction amount > 5 times the average amount. If yes, then flag variable will take the value 1. Now, When we see such a case we will not include that particular transaction for average calculation in the next step, so that we don't miss the next anomalous transaction due to the increase in the average of the last transactions.

 

Appreciate any suggestions!

 

1 ACCEPTED SOLUTION

Accepted Solutions
AMSAS
SAS Super FREQ

This can be achieved by using FIRST. LAST. Processing

 

/* Create test input */

data have ;
	input user $ date date7. amount ;
cards ;
A 10Aug21 120
A 11Aug21 110
A 15Aug21 600
A 20Aug21 585
A 21Aug21 130
B 7Aug21 300
B 8Aug21 350
B 8Aug21 1450
B 9Aug21 10000
;
run ;

/* Sort by user and date */
proc sort data=have out=srtd ;
	by user date ;
run ;

data want ;
	/* retain value of sum and count over iterations of the data step */
	retain 
		sum  0 
		count 0 ;
	/* read the sorted dataset by user/date */
	set srtd ;
	by user date ;
	/* If it's the 1st occurrance of a user */
	if first.user then do ;
		sum=amount ;
		count=1 ;
		flag=0 ;
	end ;
	/* If it is NOT the 1st occurrance of a user */
	else do ;
		/* Calculate the average prior to this observation */
		avg=sum/count ;
		/* If amount is less than 5x the average */
		if amount<avg*5 then do ;
			/* increate the sum (total) and count of valid observations */
			sum=sum+amount ;
			count=count+1 ;
			/* Set flag to 0 */
 			flag=0 ;
		end ;
		/* If amount is greater than 5x the average */
		else do ;
			/* Set flag to 1 */
			flag=1 ;
		end ;
	end ;
	/* Output observation */
	output ;
run ;

View solution in original post

1 REPLY 1
AMSAS
SAS Super FREQ

This can be achieved by using FIRST. LAST. Processing

 

/* Create test input */

data have ;
	input user $ date date7. amount ;
cards ;
A 10Aug21 120
A 11Aug21 110
A 15Aug21 600
A 20Aug21 585
A 21Aug21 130
B 7Aug21 300
B 8Aug21 350
B 8Aug21 1450
B 9Aug21 10000
;
run ;

/* Sort by user and date */
proc sort data=have out=srtd ;
	by user date ;
run ;

data want ;
	/* retain value of sum and count over iterations of the data step */
	retain 
		sum  0 
		count 0 ;
	/* read the sorted dataset by user/date */
	set srtd ;
	by user date ;
	/* If it's the 1st occurrance of a user */
	if first.user then do ;
		sum=amount ;
		count=1 ;
		flag=0 ;
	end ;
	/* If it is NOT the 1st occurrance of a user */
	else do ;
		/* Calculate the average prior to this observation */
		avg=sum/count ;
		/* If amount is less than 5x the average */
		if amount<avg*5 then do ;
			/* increate the sum (total) and count of valid observations */
			sum=sum+amount ;
			count=count+1 ;
			/* Set flag to 0 */
 			flag=0 ;
		end ;
		/* If amount is greater than 5x the average */
		else do ;
			/* Set flag to 1 */
			flag=1 ;
		end ;
	end ;
	/* Output observation */
	output ;
run ;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 1 reply
  • 621 views
  • 2 likes
  • 2 in conversation