Hello
Please delete my previous message and I will try to explain again my question.
The raw data contain information for each customer ID.
There is information about number of months that customer was expose to illness.
There is also follow up of 6 months with score value 1-11.
IF customer touch score 11 then he died.
For each customer there are 6 scores (in follow up period of 6 months).
For each customer there is a different number of months that he/she exposed.
For example: IF No_Mon_expose=4 then Score5 and Score6 will get null values and Ind_Expose_Mon5=0 and Ind_Expose_Mon6=0.
In months after expose period the scores are not relevant (we fix it in data set called "wanted1")
What is the way to create data set "wanted2" using array instead of multiple IF else that written manually?
Data Rawdata;
Input ID No_Mon_expose Score1 Score2 Score3 Score4 Score5 Score6;
cards;
1 6 5 5 5 5 5 4
2 3 11 11 11 11 11 11
3 3 9 8 9 11 11 11
4 4 3 3 3 3 3 3
5 2 5 5 5 5 5 5
6 1 11 11 11 11 11 11
7 6 2 2 2 2 2 2
8 3 2 2 2 2 2 2
9 2 3 3 4 2 2 2
10 6 2 2 2 2 2 2
;
run;
data wanted1;
set Rawdata;
array Ind_Expose_Mon (6);
do i=1 to dim(Ind_Expose_Mon);
if i<=No_Mon_expose then Ind_Expose_Mon(i)=1;
else Ind_Expose_Mon(i)=0;
end;
drop i;
run;
Data wanted2;
Set wanted1;
IF Score1=11 then do;
Ind_Expose_Mon2=0;
Ind_Expose_Mon3=0;
Ind_Expose_Mon4=0;
Ind_Expose_Mon5=0;
Ind_Expose_Mon6=0;
Score2=.;
Score3=.;
Score4=.;
Score5=.;
Score6=.;
end;
else IF Score2=11 then do;
Ind_Expose_Mon3=0;
Ind_Expose_Mon4=0;
Ind_Expose_Mon5=0;
Ind_Expose_Mon6=0;
Score3=.;
Score4=.;
Score5=.;
Score6=.;
end;
else IF Score3=11 then do;
Ind_Expose_Mon4=0;
Ind_Expose_Mon5=0;
Ind_Expose_Mon6=0;
Score4=.;
Score5=.;
Score6=.;
end;
else IF Score4=11 then do;
Ind_Expose_Mon5=0;
Ind_Expose_Mon6=0;
Score5=.;
Score6=.;
end;
else IF Score5=11 then do;
Ind_Expose_Mon6=0;
Score6=.;
end;
Run;
PROC SQL;
create table summaryInfo as
select
sum(case when Score1=11 then 1 else 0 end ) as No_Failures_Mon1,
sum( Ind_Expose_Mon1) as No_Customers_Expose_Mon1,
calculated No_Failures_Mon1/calculated No_Customers_Expose_Mon1 as q1,
sum(case when Score2=11 then 1 else 0 end ) as No_Failures_Mon2,
sum( Ind_Expose_Mon2) as No_Customers_Expose_Mon2,
calculated No_Failures_Mon2/calculated No_Customers_Expose_Mon2 as q2,
sum(case when Score3=11 then 1 else 0 end ) as No_Failures_Mon3,
sum( Ind_Expose_Mon3) as No_Customers_Expose_Mon3,
calculated No_Failures_Mon3/calculated No_Customers_Expose_Mon3 as q3,
sum(case when Score4=11 then 1 else 0 end ) as No_Failures_Mon4,
sum( Ind_Expose_Mon4) as No_Customers_Expose_Mon4,
calculated No_Failures_Mon4/calculated No_Customers_Expose_Mon4 as q4,
sum(case when Score5=11 then 1 else 0 end ) as No_Failures_Mon5,
sum( Ind_Expose_Mon5) as No_Customers_Expose_Mon5,
calculated No_Failures_Mon5/calculated No_Customers_Expose_Mon5 as q5,
sum(case when Score6=11 then 1 else 0 end ) as No_Failures_Mon6,
sum( Ind_Expose_Mon6) as No_Customers_Expose_Mon6,
calculated No_Failures_Mon6/calculated No_Customers_Expose_Mon6 as q6,
(1-calculated q1)*(1-calculated q2)*(1-calculated q3)*(1-calculated q4)*(1-calculated q5)*(1-calculated q6) as prob_survive
from wanted2
;
QUIT;
Instead of showing the code (that you don't want to use) for WANT show the data.
Sorry ,I didn't understand what you mean
Post the dataset that you want to get from the input you posted. Post it in the form of a data step like you did with the input data.
@Ronein wrote:
Hello
Please delete my previous message and I will try to explain again my question.
The raw data contain information for each customer ID.
There is information about number of months that customer was expose to illness.
There is also follow up of 6 months with score value 1-11.
IF customer touch score 11 then he died.For each customer there are 6 scores (in follow up period of 6 months).
So if the customer died at month 2 they continued to check if the customer was still dead at month 3, 4, 5 and 6????
You might actually show what "wanted2" should look like.
It might be appropriate to describe a bit of what is done with the data later. The content of your description makes this sound like you may be going for some sort of survival analysis. And any of those will likely be better off with one record per observation including the month and the score and then not have any records following "death".
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.