Hello
For each customer there are 6 rows for scores in 6 months.
I want to calculate for each customer number of changes in score during 6 months.
for example:
ID 1 -there were 2 changes during the period (from 9 to 8 and from 8 to 9)
ID 2 -there were 3 changes during the period (from 7 to 6 and from 6 to 7 and from 7 to 😎
ID 3 -there were 2 changes during the period (from . to 9 and from 9 to 10
Data rawdata;
Input ID month score;
cards;
1 1807 9
1 1808 9
1 1809 9
1 1810 8
1 1811 9
1 1812 9
2 1807 7
2 1808 6
2 1809 7
2 1810 7
2 1811 8
2 1812 8
3 1807 .
3 1808 9
3 1809 9
3 1810 9
3 1811 10
3 1812 10
;
run;
)
Use by-group processing, the lag() function, and an automatically retained variable:
Data rawdata;
Input ID month score;
cards;
1 1807 9
1 1808 9
1 1809 9
1 1810 8
1 1811 9
1 1812 9
2 1807 7
2 1808 6
2 1809 7
2 1810 7
2 1811 8
2 1812 8
3 1807 .
3 1808 9
3 1809 9
3 1810 9
3 1811 10
3 1812 10
;
run;
data want;
set rawdata;
by id;
oldscore = lag(score);
if first.id then changes = 0;
else if score ne oldscore then changes + 1;
if last.id then output;
keep id changes;
run;
proc print data=want noobs;
run;
Result:
ID changes 1 2 2 3 3 2
Use by-group processing, the lag() function, and an automatically retained variable:
Data rawdata;
Input ID month score;
cards;
1 1807 9
1 1808 9
1 1809 9
1 1810 8
1 1811 9
1 1812 9
2 1807 7
2 1808 6
2 1809 7
2 1810 7
2 1811 8
2 1812 8
3 1807 .
3 1808 9
3 1809 9
3 1810 9
3 1811 10
3 1812 10
;
run;
data want;
set rawdata;
by id;
oldscore = lag(score);
if first.id then changes = 0;
else if score ne oldscore then changes + 1;
if last.id then output;
keep id changes;
run;
proc print data=want noobs;
run;
Result:
ID changes 1 2 2 3 3 2
Are there always exactly 6 months of data or would there be some sort of rolling window? I would suggest creating a change from previous result flag, then you can simply sum() the flags based on a 6 month window, perhaps something like:
data have; set rawdata; by id; if not first.id and score ne lag(score) then chg_flg=1; run;
In this way you can say sum chg_flg over start of period to end of period = changes in the 6 months.
Thank you both for very nice solutions that you sent.
I just summarize your solutions here.
Data rawdata;
Input ID month score;
cards;
1 1807 9
1 1808 9
1 1809 9
1 1810 8
1 1811 9
1 1812 9
2 1807 7
2 1808 6
2 1809 7
2 1810 7
2 1811 8
2 1812 8
3 1807 .
3 1808 9
3 1809 9
3 1810 9
3 1811 10
3 1812 10
;
run;
/****Way1******/
/****Way1******/
/****Way1******/
/****Way1******/
Data want;
set rawdata;
by id;
oldscore = lag(score);
if first.id then No_changes = 0;
else if score ne oldscore then No_changes + 1;
if last.id then output;
keep id No_changes;
run;
proc print data=want noobs;
run;
/****Way2******/
/****Way2******/
/****Way2******/
/****Way2******/
data HelpTbl;
set rawdata;
by id;
if not first.id and score ne lag(score) then chg_flg=1;
run;
PROC SQL;
create table want2 as
select ID,
sum( chg_flg) as No_changes
from HelpTbl
group by ID
;
QUIT;
proc print data=want2 noobs;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.