Hello,
I have the following task:
For each customer ID there are multiple columns with information on varaible X and varaible W during the months Jan 2021 till Jun 2021
( The names of the columns X are in format XYYMM ,for example:X2101 is varaible X in Jan 2021)
( The names of the columns W are in format XYYMM,for example:W2104 is varaible X in April 2021).
There is also another column called month (numeric ) with values structure YYMM
(for example: 2103 is March 2021).
There are 3 versions for the soure data set (Have1 ,Have2,Have3).
Task1 is to convert values to null in columns that lower than month value.
For example:
For Customer ID=1 ,month=2102 so X2101 and X2102 and W2101 and W2102 should be converted to null values.
Task2 is to create new varaibles: X1,X2,X3,X4,X5,X6,W1,W2,W3,W4,W5,W6 with information by number of months that passes since the month value.
May anyone show a code that create "Wanted1" and "wanted2" data sets?
What source data set is the best structure to work with? Have1 or Have2 or Have3?
The task is to follow up the customer since he recieved a drug and it is important for us to look on the values by the times passes since he recieved the treatment ...
Can you please provide the data in a SAS data step (instead of an Excel file)?
Can you please provide the data in a SAS data step (instead of an Excel file)?
/***I am sending two ways to represet source data set so you can work on each of them in order to get the desired wanted data set***/
Data Have1;
Input ID month $ X2101 X2102 X2103 X2104 X2105 X2106 W2101 W2102 W2103 W2104 W2105 W2106
cards;
1 2102 10 8 42 43 19 32 9 7 38 39 17 29
2 2104 14 4 32 54 26 42 13 4 29 49 23 38
3 2101 25 5 13 65 73 43 23 5 12 59 66 39
4 2102 62 77 76 34 32 2 56 69 68 31 29 2
5 2103 42 34 54 54 65 34 38 31 49 49 59 31
6 2104 34 54 34 65 65 98 31 49 31 59 59 88
;
Run;
Data Have2;
Input ID mon $ X W month $;
cards;
1 2101 10 9 2102
1 2102 8 7 2102
1 2103 42 38 2102
1 2104 43 39 2102
1 2105 19 17 2102
1 2106 32 29 2102
2 2101 14 13 2104
2 2102 4 4 2104
2 2103 32 29 2104
2 2104 54 49 2104
2 2105 26 23 2104
2 2106 42 38 2104
3 2101 25 23 2101
3 2102 5 5 2101
3 2103 13 12 2101
3 2104 65 59 2101
3 2105 73 66 2101
3 2106 43 39 2101
4 2101 62 56 2102
4 2102 77 69 2102
4 2103 76 68 2102
4 2104 34 31 2102
4 2105 32 29 2102
4 2106 2 2 2102
5 2101 42 38 2103
5 2102 34 31 2103
5 2103 54 49 2103
5 2104 54 49 2103
5 2105 65 59 2103
5 2106 34 31 2103
6 2101 34 31 2104
6 2102 54 49 2104
6 2103 34 31 2104
6 2104 65 59 2104
6 2105 65 59 2104
6 2106 98 88 2104
;
Run;
I know we have discussed this before. Maxim 19, Long beats Wide.
You need to transpose the data set to a long data set, then leave it as long; and then most of what you describe will be easy, and furthermore, the subsequent analysis will also be easier.
I don't have time to write code for you right now. Why don't you see if you can use PROC TRANSPOSE to obtain a long data set and then make missing values as needed.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.