BookmarkSubscribeRSS Feed
Ronein
Meteorite | Level 14
Hello
I have a data set with information of income sources for each customer for 12 months.
There can be several type of incomes sources:
Ind_Cash_Deposit(Income from cash deposit)
Ind_Check_Deposit(Income from cheque deposit)
Ind_Wage_Income((Income from wage)
Ind_SocialSeurity_Income(Income from social security)
I want to check for each customer-
IF there is an income source that was "active" in first 6 months and stopped be "active" in last 6  months.
The definition of  "Active" is that it is used at least 4 months from 6 months.
For example:
Customer 111:
In First 6 months active income sources are- Wage
In last 6 months active income sources are- Nothing
So in result wanted field will write "Wage"
Customer 222:
In First 6 months active income sources are- Wage+Social Insurance 
In last 6 months active income sources are- Social Insurance 
So in result wanted field will write " Social "
Customer 333:
In First 6 months active income sources are- Wage 
In last 6 months active income sources are- Wage
So in result wanted field will write " No_Change"
 
May show code that provide the result data set?
Please not that in result data set each customer will have one row only with following fields:
Active_Income_Source_First6Mon
Active_Income_Source_Last6Mon
Removed_Income_Sources
Data have;
input cust_ID YYMM Ind_Cash_Deposit Ind_Check_Deposit Ind_Wage_Income Ind_SocialSeurity_Income;
cards;
111 2212 0 1 1 0
111 2301 0 0 1 0
111 2302 0 0 1 0
111 2303 1 1 1 0
111 2304 0 0 1 0
111 2305 0 0 1 0
111 2306 0 0 1 0
111 2307 0 0 1 0
111 2308 0 0 1 0
111 2309 0 0 0 0
111 2310 0 0 0 0
111 2311 0 0 0 0
222 2212 0 0 1 1
222 2301 0 0 1 1
222 2302 0 0 1 1
222 2303 0 0 1 1
222 2304 0 0 1 1
222 2305 0 0 1 1
222 2306 0 0 1 1
222 2307 0 0 1 1
222 2308 0 0 1 1
222 2309 0 0 0 1
222 2310 0 0 0 1
222 2311 0 0 0 1
333 2212 0 0 1 0
333 2301 0 0 1 0
333 2302 0 0 1 1
333 2303 0 0 1 1
333 2304 0 0 1 0
333 2305 0 0 1 0
333 2306 0 0 1 0
333 2307 0 0 1 0
333 2308 0 0 1 0
333 2309 0 0 1 0
333 2310 0 0 1 1
333 2311 0 0 1 0
;
Date=input(cat(YYMM),YYMMN4.):
Format date date9.;
Run;
5 REPLIES 5
Patrick
Opal | Level 21

"IF there is an income source that was "active" in first 6 months and stopped be "active" in last 6  months"

So you're not interested in a change where a source was inactive in the first 6 months and then active afterwards?

 

Is your real data also already sorted by cust_id and month_YYYYMM?

Does your real data also only contain 12 months per customer?

Ronein
Meteorite | Level 14
Always 12 months follow up for each customer. If there are added sources of income then put it in plus for example "wage+". If lost income sources than put it in minus for example " wage-"
ballardw
Super User

Yet another case of the first thing needed to deal with "6 months" is fixing the "date" value for every single record.

 

Repeatedly.

 

If you are going to ask us about intervals then at least do the step yourself to make sure there is an actual DATE value before asking for help.

You've been shown how multiple times.

Ronein
Meteorite | Level 14

Here is the raw data set.

I added date var

Data _have;
input cust_ID YYMM Ind_Cash_Deposit Ind_Check_Deposit Ind_Wage_Income Ind_SocialSeurity_Income;
cards;
111 2212 0 1 1 0
111 2301 0 0 1 0
111 2302 0 0 1 0
111 2303 1 1 1 0
111 2304 0 0 1 0
111 2305 0 0 1 0
111 2306 0 0 1 0
111 2307 0 0 1 0
111 2308 0 0 1 0
111 2309 0 0 0 0
111 2310 0 0 0 0
111 2311 0 0 0 0
222 2212 0 0 1 1
222 2301 0 0 1 1
222 2302 0 0 1 1
222 2303 0 0 1 1
222 2304 0 0 1 1
222 2305 0 0 1 1
222 2306 0 0 1 1
222 2307 0 0 1 1
222 2308 0 0 1 1
222 2309 0 0 0 1
222 2310 0 0 0 1
222 2311 0 0 0 1
333 2212 0 0 1 0
333 2301 0 0 1 0
333 2302 0 0 1 1
333 2303 0 0 1 1
333 2304 0 0 1 0
333 2305 0 0 1 0
333 2306 0 0 1 0
333 2307 0 0 1 0
333 2308 0 0 1 0
333 2309 0 0 1 0
333 2310 0 0 1 1
333 2311 0 0 1 0
444 2212 0 0 0 0
444 2301 0 0 0 0
444 2302 0 0 0 0
444 2303 0 0 0 0
444 2304 1 1 0 0
444 2305 0 0 0 0
444 2306 0 0 0 0
444 2307 0 0 1 0
444 2308 0 0 1 0
444 2309 0 0 1 0
444 2310 0 0 1 0
444 2311 0 0 1 0
;
Run;

data have;
retain cust_ID YYMM date Ind_Cash_Deposit Ind_Check_Deposit Ind_Wage_Income Ind_SocialSeurity_Income;
set _have;
Date=input(put(YYMM,4.),yymmn4.);
format date date9.;
Run;
LinusH
Tourmaline | Level 20

It still a bit unclear, your example forge" 222 seem inconsistent, shouldn't that be "Wage" in Removed_Income_Sources?

It wold also have been good if you provided test data for scenarios when more than one type of income was removed.

 

Second, I think it is clumsy to have information as concatenated strings in one filed, it makes it pretty seless for anythin else than making list reports.

That said, I created Removed_Income_Sources in the following code, but created active flags per half year and original field.

You can create Active_Income_Source_First6Mon and Active_Income_Source_Last6Mon yourself using the same startegy.

proc sql;
	create table h1_2023 as
		select 
			cust_id, 
			case when sum(Ind_Cash_Deposit) < 4 then 0 else 1 end as Ind_Cash_Deposit_F6M, 
			case when sum(Ind_Check_Deposit) < 4 then 0 else 1 end as Ind_Check_Deposit_F6M, 
			case when sum(Ind_Wage_Income) < 4 then 0 else 1 end as Ind_Wage_Income_F6M, 
			case when sum(Ind_SocialSeurity_Income) < 4 then 0 else 1 end as Ind_SocialSeurity_Income_F6M 
		from have
		where date < '01Jul2023'd
		group by cust_id
;
	create table h2_2023 as
		select 
			cust_id, 
			case when sum(Ind_Cash_Deposit) < 4 then 0 else 1 end as Ind_Cash_Deposit_L6M, 
			case when sum(Ind_Check_Deposit) < 4 then 0 else 1 end as Ind_Check_Deposit_L6M, 
			case when sum(Ind_Wage_Income) < 4 then 0 else 1 end as Ind_Wage_Income_L6M, 
			case when sum(Ind_SocialSeurity_Income) < 4 then 0 else 1 end as Ind_SocialSeurity_Income_L6M
		from have
		where date >= '01Jul2023'd
		group by cust_id
;
quit;

data want;
	merge work.h1_2023 work.h2_2023;
	by cust_id;
	length Removed_Income_Sources $50;
	if Ind_Cash_Deposit_F6M and not Ind_Cash_Deposit_L6M then Removed_Income_Sources = 'Cash';
	if Ind_Check_Deposit_F6M and not Ind_Check_Deposit_L6M then Removed_Income_Sources = cat(trimn(Removed_Income_Sources),' ','Check');
	if Ind_Wage_Income_F6M and not Ind_Wage_Income_L6M then Removed_Income_Sources = cat(trimn(Removed_Income_Sources),' ','Wage');
	if Ind_SocialSeurity_Income_F6M and not Ind_SocialSeurity_Income_L6M then Removed_Income_Sources = cat(trimn(Removed_Income_Sources),' ','Social');
	if Removed_Income_Sources = ' ' then Removed_Income_Sources = 'No_change';
	else Removed_Income_Sources = left(translate(compbl(Removed_Income_Sources),'+',' '));
run;

 

Data never sleeps

SAS Innovate 2025: Register Today!

 

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 988 views
  • 3 likes
  • 4 in conversation