Hello
I have multiple observations for each customer .(6 rows for each customer in months :1807,1808,1809,1810,1811,1812)
I want to calculate for each customer Number of times that he touch grade 11 or grade 12.
For ID 1 I expect to get value 3
For ID 2 I expect to get value 0
For ID 3 I expect to get value 1
May anyone show please How to do it in one step?
In the following code I show how to do it in two steps.
My challenge is to learn to do it in one step with lag or another way,
Data rawdata;
Input ID month score;
cards;
1 1807 11
1 1808 12
1 1809 9
1 1810 8
1 1811 11
1 1812 9
2 1807 7
2 1808 6
2 1809 7
2 1810 7
2 1811 8
2 1812 8
3 1807 .
3 1808 9
3 1809 9
3 1810 9
3 1811 10
3 1812 11
;
run;
/*Way1*/
Data tbl2;
set rawdata;
IF score in(11,12) then Touch_11_12_flag=1;
else Touch_11_12_flag=0;
Run;
PROC SQL;
create table want1 as
select ID,
sum(Touch_11_12_flag) as Total_Touch_11_12
from tbl2
group by ID
;
QUIT;
proc print data=want1 noobs;
run;
It's the same approach with a retained variable that worked for you in https://communities.sas.com/t5/SAS-Programming/Calculate-number-of-changes-in-score-during-6-months/...
Thank you so much!
data want3;
set rawdata;
by id;
oldscore = lag(score);
if first.id and score in(11,12) then Total_Touch_11_12 = 1;
else if first.id and score not in(11,12) then Total_Touch_11_12 = 0;
else if score in(11,12) then Total_Touch_11_12 + 1;
if last.id then output;
keep id Total_Touch_11_12;
run;
proc print data=want3 noobs;
run;
You're overcomplicating. Such a step has three discrete elements:
So the step can easily be written as
data want3;
set rawdata;
by id;
if first.id then total_touch_11_12 = 0;
if score in (11,12) then total_touch_11_12 + 1;
if last.id then output;
keep id total_touch_11_12;
run;
Note that the increment statement
total_touch_11_12 + 1;
implies an automatic retain. If you need to do another type of calculation, you will need an explicit retain statement.
Since oldscore is not needed here (there's no comparison with a previous observation (to detect a change) in your logic), the assignment with the lag() function can also be omitted.
Data rawdata;
Input ID month score;
cards;
1 1807 11
1 1808 12
1 1809 9
1 1810 8
1 1811 11
1 1812 9
2 1807 7
2 1808 6
2 1809 7
2 1810 7
2 1811 8
2 1812 8
3 1807 .
3 1808 9
3 1809 9
3 1810 9
3 1811 10
3 1812 11
;
run;
PROC SQL;
create table want1 as
select ID,
sum(score in(11,12)) as Total_Touch_11_12
from rawdata
group by ID
;
QUIT;
I see that @Kurt_Bremser noticed the same problem from you already in sas communities. Is there a reason to post it again? The usual result of this approach is to see subsequent new questions ignored - like the boy who cried wolf.
Help us help you - don't overload the people with useful answers.
regards,
MK
I'd write that as
data want;
do until (last.id);
set rawdata;
by id;
if score in (11,12) then touch11 = sum(touch11,1);
end;
touch11 = max(0,touch11); * to prevent missing values;
keep id touch11;
run;
Both the "classical" data step method (first,increment,last) and the DOW loop require sorting and need only a single sequential pass through the sorted dataset, and are therefore functionally equivalent with (probably) only marginal differences in performance.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.