Hello
For each customer there are 6 rows for scores in 6 months.
I want to calculate for each customer number of changes in score during 6 months.
for example:
ID 1 -there were 2 changes during the period (from 9 to 8 and from 8 to 9)
ID 2 -there were 3 changes during the period (from 7 to 6 and from 6 to 7 and from 7 to 😎
ID 3 -there were 2 changes during the period (from . to 9 and from 9 to 10
Data rawdata;
Input ID month score;
cards;
1 1807 9
1 1808 9
1 1809 9
1 1810 8
1 1811 9
1 1812 9
2 1807 7
2 1808 6
2 1809 7
2 1810 7
2 1811 8
2 1812 8
3 1807 .
3 1808 9
3 1809 9
3 1810 9
3 1811 10
3 1812 10
;
run;
)
Use by-group processing, the lag() function, and an automatically retained variable:
Data rawdata;
Input ID month score;
cards;
1 1807 9
1 1808 9
1 1809 9
1 1810 8
1 1811 9
1 1812 9
2 1807 7
2 1808 6
2 1809 7
2 1810 7
2 1811 8
2 1812 8
3 1807 .
3 1808 9
3 1809 9
3 1810 9
3 1811 10
3 1812 10
;
run;
data want;
set rawdata;
by id;
oldscore = lag(score);
if first.id then changes = 0;
else if score ne oldscore then changes + 1;
if last.id then output;
keep id changes;
run;
proc print data=want noobs;
run;
Result:
ID changes 1 2 2 3 3 2
Use by-group processing, the lag() function, and an automatically retained variable:
Data rawdata;
Input ID month score;
cards;
1 1807 9
1 1808 9
1 1809 9
1 1810 8
1 1811 9
1 1812 9
2 1807 7
2 1808 6
2 1809 7
2 1810 7
2 1811 8
2 1812 8
3 1807 .
3 1808 9
3 1809 9
3 1810 9
3 1811 10
3 1812 10
;
run;
data want;
set rawdata;
by id;
oldscore = lag(score);
if first.id then changes = 0;
else if score ne oldscore then changes + 1;
if last.id then output;
keep id changes;
run;
proc print data=want noobs;
run;
Result:
ID changes 1 2 2 3 3 2
Are there always exactly 6 months of data or would there be some sort of rolling window? I would suggest creating a change from previous result flag, then you can simply sum() the flags based on a 6 month window, perhaps something like:
data have; set rawdata; by id; if not first.id and score ne lag(score) then chg_flg=1; run;
In this way you can say sum chg_flg over start of period to end of period = changes in the 6 months.
Thank you both for very nice solutions that you sent.
I just summarize your solutions here.
Data rawdata;
Input ID month score;
cards;
1 1807 9
1 1808 9
1 1809 9
1 1810 8
1 1811 9
1 1812 9
2 1807 7
2 1808 6
2 1809 7
2 1810 7
2 1811 8
2 1812 8
3 1807 .
3 1808 9
3 1809 9
3 1810 9
3 1811 10
3 1812 10
;
run;
/****Way1******/
/****Way1******/
/****Way1******/
/****Way1******/
Data want;
set rawdata;
by id;
oldscore = lag(score);
if first.id then No_changes = 0;
else if score ne oldscore then No_changes + 1;
if last.id then output;
keep id No_changes;
run;
proc print data=want noobs;
run;
/****Way2******/
/****Way2******/
/****Way2******/
/****Way2******/
data HelpTbl;
set rawdata;
by id;
if not first.id and score ne lag(score) then chg_flg=1;
run;
PROC SQL;
create table want2 as
select ID,
sum( chg_flg) as No_changes
from HelpTbl
group by ID
;
QUIT;
proc print data=want2 noobs;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.