ID | Event | x_tpoint | x_date | x_score | dis_date | y_tpoint | y_score | y_date |
1 | a | 3 | 14thfeb | 4 | 5thmarch | |||
1 | a | 3 | 20thfeb | 5 | 5thmarch | |||
1 | b | 5thmarch | 3 | 6 | 1stmarch | |||
2 | a | 3 | 1stmarch | 5 | 1stapril | |||
2 | b | 1stapril | 3 | 4 | 15thmarch | |||
3 | a | 3 | 1st April | 6 | 1stmay | |||
3 | a | 3 | 7thapril | 7 | 1stmay |
I want to get the following table, please assist?
ID | x_tpoint | x_date1 | x_date2 | x_score1 | x_score2 | dis_date | y_tpoint | y_score | y_date |
1 | 3 | 14thfeb | 20thfeb | 4 | 5 | 5thmarch | 3 | 6 | 1stmarch |
2 | 3 | 1stmarch | 5 | 1stapril | 3 | 4 | 15thmarch | ||
3 | 3 | 1stapril | 7thapril | 6 | 7 | 1stmay | 3 |
Here is one way :
data want(rename=(x_tpoint_=x_tpoint));
array x_date_[3] $ x_date1- x_date3;
array x_score_[3] $ x_score1-x_score3;
x_tpoint_=.;
do _n_=1 by 1 until(last.id);
set have;
by id;
x_date_[_n_]=x_date;
x_score_[_n_]=x_score;
if x_tpoint ^=. then x_tpoint_=x_tpoint;
end;
keep ID x_tpoint_ x_date1 x_date2 x_score1 x_score2 dis_date y_tpoint y_score y_date;
run;
Should ask @Nrjn7 what the maximum number of dates or scores will ever appear within a value of ID.
Just because a small shown example only shows 3 as likely there may actually be 5, 10, 15 or some other number.
@ballardw @r_behata That's a really good question. For a given ID, the events are repeated instruments. Meaning, For study ID 1, they could have more than one Event a (in this example: there's only 2 event a for studyid 1 but there could be more). would the code work if the event "a" or "b" were more for given studyid. for every event they would have dates.
Thank you
@Nrjn7 The code can be slightly modified to adjust to your needs. Have you tried running the code with your actual data to see what happens ?
The sample output you have shown has fixed columns such as : x_score1,x_score2. If your data has more than 2 events , say 5 or 10 does the output columns needs to change accordingly. If yes, show us some samples.
@r_behata thanks for responding.
Using the code, this is the result I'm currently getting.
id | x_date1 | x_datet2 | x_score1 | x_score2 | x_tpoint | dis_date | y_tpoint | y_score | y_date |
1 | 14thfeb | 20thfeb | 4 | 5 | . | . | 3 | 6 | 1stmarch |
2 | 1stmarch | 5 | . | . | . | 3 | 4 | 15thmarch | |
3 | 1stapril | 7thapril | 6 | 7 | 3 | . | . | . |
So, the columns for score and date are dependent on how many event per study ID there is.
For example, for StudyID 1 he could have 'n' number of event 'a' or 'n' number of event b, (x_tpoint, y_tpoint and dis_date) is always fixed)
so, I have this table
id | event | x_tpoint | x_date | x_score | dis_date | y_tpoint | y_date | y_score |
1 | a | 3 | 1stjan | 4 | 8 | |||
1 | a | 3 | 2ndjan | 3 | 8 | |||
1 | a | 3 | 3rdjan | 5 | 8 | |||
1 | 'n' th a | |||||||
1 | b | 8 | 3 | 4thjan | 6 | |||
1 | b | 8 | 3 | 5thjan | 7 | |||
1 | 'n' th b |
The events are repeated instruments so ID 1 can have 'n' number of a's and b's.
I want the table below:
id | event | x_tpoint | x_date1 | x_date2 | x_date3 | x_date'n' | x_score1 | x_score2 | x_score3 | x_score'n' |
|
I couldn't fit the y_score y_date but the idea is the same.
Really appreciate the help.
I am not quite sure how you want these rows to be represented in your output. They seem to be having different dates unlike the first sample you have shown.
1 | b | 8 | 3 | 4thjan | 6 | |||
1 | b | 8 | 3 | 5thjan | 7 |
Would you like the y_score, y_date to be converted as Y_Score1,Y_Score2...YScoren and Y_date1,Y_Date2...Y_Daten ?
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.