Hi,
I have a question on how to assign 999 in the column 'desired', if the cell in 'value' column is missing and rows before and after the missing cell has some value (0 or 1 or 2) then we should assign 999 under desired column in that row.
999 must be assigned based on missing values under each ID and section. Please advise.
ID | Sequence | section | value | desired |
1 | 1 | a | 0 | 0 |
1 | 2 | b | 999 | |
1 | 3 | c | 1 | 1 |
1 | 4 | d | 0 | 0 |
1 | 5 | e | 0 | 0 |
2 | 1 | a | 1 | 1 |
2 | 2 | b | 1 | 1 |
2 | 3 | c | 999 | |
2 | 4 | d | 2 | 2 |
2 | 5 | e | 2 | 2 |
Thank you
Run this:
data have;
infile datalines dsd dlm=" " truncover;
input ID $ Sequence section $ value;
datalines;
1 1 a 0
1 2 b
1 3 c 1
1 4 d 0
1 5 e 0
2 1 a 1
2 2 b 1
2 3 c
2 4 d 2
2 5 e 2
3 1 a .
3 2 b 1
4 1 a 0
4 2 b
;
data want;
set have;
by id;
set /* create a "look-ahead" */
have (
firstobs=2
keep=id value
rename=(id=_id value=_value)
)
have ( /* this is needed to prevent a premature end of the data step */
obs=1
keep=id value
rename=(id=_id value=_value)
)
;
_lvalue = lag(value);
if
not (first.id or last.id) and
value = . and _lvalue ne . and _value ne .
then desired = 999;
else desired = value;
drop _:;
run;
Note how I create the dataset with a data step with DATALINES; by doing this in the future, you will make it MUCH easier for us to help you, as we will know exactly what we have to deal with.
I added two groups that illustrate the first/last behavior.
What if the value column has another value, say 3 for some ID?
There is no chance for another value as it is predefined to have either 0 or 1 or 2 in this case.
Ok, so that part really doesn't matter or?
So you simply want to set desired = 999 if value is missing, correct?
I want to assign 999 if the rows before and after the missing rows has any value (0 or 1 or 2).
For each ID.
What if the missing value appears as first or last for a given ID?
Then we can ignore them. It is those missing values inbetween the rows with any value (0 or 1 or 2) must be assigned as 999.
If first or last is blank then we should ignore them.
Run this:
data have;
infile datalines dsd dlm=" " truncover;
input ID $ Sequence section $ value;
datalines;
1 1 a 0
1 2 b
1 3 c 1
1 4 d 0
1 5 e 0
2 1 a 1
2 2 b 1
2 3 c
2 4 d 2
2 5 e 2
3 1 a .
3 2 b 1
4 1 a 0
4 2 b
;
data want;
set have;
by id;
set /* create a "look-ahead" */
have (
firstobs=2
keep=id value
rename=(id=_id value=_value)
)
have ( /* this is needed to prevent a premature end of the data step */
obs=1
keep=id value
rename=(id=_id value=_value)
)
;
_lvalue = lag(value);
if
not (first.id or last.id) and
value = . and _lvalue ne . and _value ne .
then desired = 999;
else desired = value;
drop _:;
run;
Note how I create the dataset with a data step with DATALINES; by doing this in the future, you will make it MUCH easier for us to help you, as we will know exactly what we have to deal with.
I added two groups that illustrate the first/last behavior.
Please see this example. I have inserted 2 additional blank rows under ID=2. This fails when it has 2 or more consequtive blank rows.
I tried modifying firstobs value but it didnt work for second consequtinve blank value.
data have;
infile datalines dsd dlm=" " truncover;
input ID $ Sequence section $ value;
datalines;
1 1 a 0
1 2 b
1 3 c 1
1 4 d 0
1 5 e 0
2 1 a 1
2 2 b 1
2 3 c
2 4 d 2
2 5 e 2
2 6 f
2 7 g
2 8 h 0
3 1 a .
3 2 b 1
4 1 a 0
4 2 b
;
run;
data want;
set have;
by id;
set /* create a "look-ahead" */
have (
firstobs=2
keep=id value
rename=(id=_id value=_value)
)
have ( /* this is needed to prevent a premature end of the data step */
obs=1
keep=id value
rename=(id=_id value=_value)
)
;
_lvalue = lag(value);
if
not (first.id or last.id) and
value = . and _lvalue ne . and _value ne .
then desired = 999;
else desired = value;
drop _:;
run;
Same as Kurt's code :
data have;
infile datalines dsd dlm=" " truncover;
input ID $ Sequence section $ value;
datalines;
1 1 a 0
1 2 b
1 3 c 1
1 4 d 0
1 5 e 0
2 1 a 1
2 2 b 1
2 3 c
2 4 d 2
2 5 e 2
3 1 a .
3 2 b 1
4 1 a 0
4 2 b
;
data want;
merge have have(firstobs=2 keep=id value rename=(id=_id value=_value));
desired=value;
if missing(value) and
(id=lag(id) and not missing(lag(value))) and
(id=_id and not missing(_value)) then desired=999;
drop _: ;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.