I have a dataset that has four variables: ID, Day, Status, and Var1.
ID | Day | Status | Var1 |
1 | 1 | 0 | 1 |
2 | 1284 | 0 | 2 |
3 | 28 | 0 | 2 |
4 | 432 | 1 | 1 |
5 | 1018 | 0 | 2 |
6 | 85 | 0 | 1 |
7 | 1007 | 0 | 2 |
8 | 824 | 0 | 2 |
9 | 907 | 0 | 2 |
10 | 191 | 1 | 2 |
10 | 392 | 1 | 2 |
10 | 433 | 1 | 2 |
11 | 819 | 1 | 2 |
11 | 1004 | 1 | 2 |
However, IDs 10 and 11 have multiple events, so I want to recode "Day" as two variables, "Start" and "End", as in the table that follows. Many thanks!
ID | Start | End | Status | Var1 |
1 | 0 | 1 | 0 | 1 |
2 | 0 | 1284 | 0 | 2 |
3 | 0 | 28 | 0 | 2 |
4 | 0 | 432 | 1 | 1 |
5 | 0 | 1018 | 0 | 2 |
6 | 0 | 85 | 0 | 1 |
7 | 0 | 1007 | 0 | 2 |
8 | 0 | 824 | 0 | 2 |
9 | 0 | 907 | 0 | 2 |
10 | 0 | 191 | 1 | 2 |
10 | 191 | 392 | 1 | 2 |
10 | 392 | 433 | 1 | 2 |
11 | 0 | 819 | 1 | 2 |
11 | 819 | 1004 | 1 | 2 |
Is your data actually sorted as shown?
Please see:
data have;
  input ID Day Status Var;
datalines;
1 1 0 1
2 1284 0 2
3 28 0 2
4 432 1 1
5 1018 0 2
6 85 0 1
7 1007 0 2
8 824 0 2
9 907 0 2
10 191 1 2
10 392 1 2
10 433 1 2
11 819 1 2
11 1004 1 2
;

data want;
  set have;
  by id;
  l_day=lag(day);
  if first.id then do;
    start=0;
    end=day;
  end;
  else do;
    start=l_day;
    end=day;
  end;
  drop day l_day;
run;
Note the first data step creates a usable data set as shown. This is the preferred manner of sharing data on this forum as then we know the variable names, types and any properties set such as formats or labels.
The second data step assumes the values are sorted by Id and Day. This allows use of a simple By Id. When you use a By statement in the data step, SAS creates automatic variables named First.variable and Last.variable that have values of 1/0 (true/false) indicating whether the current observation is the first or the last of a by group.
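To see what the First. and Last. flags look like, the following minimal sketch copies them into ordinary variables, because the automatic variables themselves are not written to the output data set. The names flags, is_first and is_last are made up for illustration, and the PROC SORT is only needed if the data are not already in ID/Day order.

```sas
/* Sketch only: FLAGS, is_first and is_last are hypothetical names. */
/* HAVE is the data set created by the first data step above.       */
proc sort data=have;
  by id day;
run;

data flags;
  set have;
  by id;
  is_first = first.id;  /* 1 on the first observation of each ID, otherwise 0 */
  is_last  = last.id;   /* 1 on the last observation of each ID, otherwise 0  */
run;

proc print data=flags noobs;
run;
```

With the sample data, IDs 1 through 9 each have a single observation, so both flags are 1 there. For ID 10 the three rows have is_first = 1, 0, 0 and is_last = 0, 0, 1.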
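Since you need a value from the previous observation, we use the LAG function to get it. Warning: LAG is a queuing function. It returns the value stored the last time that same LAG call executed, which is not necessarily the value from the previous observation, so it seldom returns the wanted result when used conditionally (in an IF <condition> then do; <statements>; end; block). That is why l_day=lag(day) is executed on every observation, before the IF.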
Then knowing whether the observation is the first or not we know which value to use for Start and End. Then drop the no longer needed variables.
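The warning about conditional LAG can be illustrated with a small sketch. It assumes HAVE is sorted by ID and Day; lag_pitfall, start_bad, start_ok and prev_day are hypothetical names used only for this comparison.

```sas
/* Sketch of the conditional-LAG pitfall; all new names are hypothetical. */
data lag_pitfall;
  set have;
  by id;

  /* Conditional call: this LAG queue is updated only on non-first rows */
  if first.id then start_bad = 0;
  else start_bad = lag(day);

  /* Unconditional call: this LAG queue is updated on every row */
  prev_day = lag(day);
  if first.id then start_ok = 0;
  else start_ok = prev_day;
run;

proc print data=lag_pitfall noobs;
  var id day start_bad start_ok;
run;
```

With the sample data, start_bad is missing on the second row of ID 10 (Day 392) and 433 on the second row of ID 11 (Day 1004), because the conditional LAG only remembers values from rows where it actually ran. start_ok gives 191 and 819 for those rows, which matches the wanted Start values.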
Many thanks for your help. It is easily understood and works well.
Assuming the data are sorted by ID/DAY, then:
data have;
input ID Day Status Var ;
datalines;
1 1 0 1
2 1284 0 2
3 28 0 2
4 432 1 1
5 1018 0 2
6 85 0 1
7 1007 0 2
8 824 0 2
9 907 0 2
10 191 1 2
10 392 1 2
10 433 1 2
11 819 1 2
11 1004 1 2
;
data want;
set have;
by id;
end=day;
/* IFN evaluates all of its arguments, so LAG(end) runs on every observation */
start=ifn(first.id,0,lag(end));
run;
Thank you for your help.