Hi all,
I have some question for assigning value with condition. Given an example below.
I create table with missing values at the end of table, which is aging12.
data temp;
input aging aging1 aging2 aging3 aging4 aging5 aging6 aging7 aging8 aging9 aging10 aging11 aging12;
datalines;
1 2 3 4 4 4 4 4 4 4 4 4 4
2 3 4 4 4 4 4 4 4 4 4 3 .
3 4 4 4 4 4 4 4 4 4 2 . .
4 4 4 4 4 4 4 4 4 1 . . .
;
run;
Next I would like to assign a new column, called status with the latest value available. In the second row, it would be 3, in the third row, it would be 2 and in the last row, it would be 1.
I have written down the code below.
data temp;
set temp;
array aging_a{*} aging1-aging12;
do i=1 to 12;
if cmiss(of aging1--aging12) ne 0 then do;
if missing(aging_a{i}) then status=aging_a{i-1};
else status=aging12;
end;
end;
run;
The output works only the second row and it returns with value equal to 3. However, the rest are returned with missing values.
Any suggestion on this?
Thanks
Not sure what you want as output.
By any chance, is this what you want?
data temp2;
set temp;
array aging_a{*} aging1-aging12;
do i=1 to 12;
if cmiss(of aging1--aging12) ne 0 then do;
if missing(aging_a{i}) then do; status=aging_a{i-1};leave;end;
else status=aging12;
end;
end;
run;
It sounds like you could simplify the DO loop considerably:
do i=12 to 1 by -1 until (status > .);
status = aging{i};
end;
Just look backwards through the list and take the first nonmissing value you find. Is that the correct logic?
Try this if your missing values are to the end .
data temp;
set temp;
array aging_a{*} aging1-aging12;
do i=1 to dim(aging_a);
if aging_a(i)^=. then new_col=aging_a(i);
end;
run;
Hi @LengYi Please clarify whether you want the last non missing value as the value of status for records that have at least one missing value?
If the above assumption of mine is correct, you don't need a loop
data temp;
input aging aging1 aging2 aging3 aging4 aging5 aging6 aging7 aging8 aging9 aging10 aging11 aging12;
datalines;
1 2 3 4 4 4 4 4 4 4 4 4 4
2 3 4 4 4 4 4 4 4 4 4 3 .
3 4 4 4 4 4 4 4 4 4 2 . .
4 4 4 4 4 4 4 4 4 1 . . .
;
run;
data want;
set temp;
array aging_a{*} aging12-aging1;
if nmiss(of aging_a{*} )>0 then do;
_iorc_=whichn(coalesce(of aging_a{*}),of aging_a{*});
status=aging_a(_iorc_);
end;
run;
One point using the code structure of
data temp; set temp;
can be very dangerous as it completely replaces the source data set with a new data set. An error in logic or even a type could destroy data and require you to go back to early steps to start over. Debugging data issues that occur with this type of code, especially if used repeatedly can sometimes take a great deal of time;
If there are no "gaps" in the data and you always want the rightmost value of the aging array this would be one way:
data temp2; set temp; array aging_a{*} aging1-aging12; status = aging_a[n(of aging_a(*))]; run;
the N function returns how many of the values are not missing. If there aren't any gaps then that would be the last.
If there might be gaps it may be more efficient to work from the right to left with
do I = dim(aging_a) to 1 by -1;
if not missing (aging_a[I]) then do;
status= aging_a[I];
leave;
end;
end;
The LEAVE instruction would exit the do loop when the first non-missing value is find on the right of the array.
You should specify a rule for if the array is empty unless that can never happen.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.