Hello,
Basically, I am trying to see if Variable X, for which I have monthly values, changes over time for participants. I have created a dataset where for each subject their values for Variable X are printed for each month in the time period. Unfortunately, sometimes Variable X is missing for a subject.
I tried to insert an array as follows:
data mydata;
array varx value1--value60;
do i=1-70;
if varx(i)=. then varx(i)=varx(i-1);
end;
run;
Unfortunately this creates two problems. One is what to do if participants' first value for Variable X is missing. The second is that I keep getting an error message that all variables in array list must be the same type. Considering that all the variables in the array list are numeric, I'm not sure why I'm getting this error message or how to fix it.
Any assistance is much appreciated.
Try this (note the single dash between value1 and value60) :
data mydata;
set mydata; /* read the existing data; assumes the input dataset is also named mydata */
array varx value1-value60;
if missing(value1) then value1 = 0; /* Replace 0 with whatever is appropriate */
do i = 2 to 60;
if missing(varx(i)) then varx(i)=varx(i-1);
end;
run;
PG
Note that your do loop goes to 70, but you only have 60 elements in the array.
The LOCF (Last Observation Carried Forward) approach to data imputation has a long, mostly negative, history. It basically assumes the response surface is flat, and that is the very assumption you are trying to examine.
I'll point out some additional problems you have with the code you posted:
1. You are attempting to load the array with the -- operator when, in fact, you only want the - operator. With -- you are including all variables positioned between value1 and value60 in the dataset, whatever their names or types, while with the - operator you are only including those that begin with the prefix 'value' and end with a number between 1 and 60.
2. Your datastep doesn't include a set statement, and thus doesn't bring in ANY data.
3. You specify your loop as 1-70. Like Doc mentioned, it should be 60 rather than 70 but, additionally, it should be defined as 1 to 60.
4. If you are only trying to identify whether there is a difference between the values, not populate the missing values, why not approach the task a bit differently? E.g.:
data mydata_want;
set mydata_have;
same=ifc(max(of value1-value60)*
(60-nmiss(of value1-value60))=
sum(of value1-value60),'NoChange','Change');
run;
Thank you for your response. Could you explain what the code you provided will do? Thanks!
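The code creates a variable called "same". It ignores missing values but, for the remaining values, if at least one is different from the others, it assigns a value of "Change" to the variable same. If all non-missing values are the same, it assigns a value of "NoChange". The test works because the maximum multiplied by the number of non-missing values equals the sum only when every non-missing value equals the maximum.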
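On the earlier point about -- versus -, here is a minimal sketch of why the "same type" error can appear. The dataset demo and the character variable note are made up for illustration; the assumption is that the real data has some character variable stored between value1 and value60.

```sas
/* Hypothetical data: a character variable sits between value1 and value3 */
data demo;
   value1 = 1; note = 'abc'; value2 = .; value3 = 3;
run;

data demo_filled;
   set demo;
   /* value1--value3 would also pick up NOTE and trigger the type error */
   array varx value1-value3;   /* numbered range list: value1, value2, value3 */
   /* Alternative: value1-numeric-value3 keeps the positional list
      but restricts it to numeric variables */
   do i = 2 to dim(varx);
      if missing(varx(i)) then varx(i) = varx(i-1);   /* carry the previous value forward */
   end;
   drop i;
run;
```

Using dim(varx) as the loop bound also avoids running past the end of the array, as happened with the loop to 70.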
Hi,
How about the code below:
data one;
input v1-v5;
cards;
20 . 30 35 15
10 80 . . 20
20 . 20 20 20
88 88 88 88 88
18 18 18 18 18
20 15 10 8 6
;
data two;
set one;
flag=ifn(largest(1,of v:)=largest(5-nmiss(of v:),of v:),0,1);
proc print;run;
Obs v1 v2 v3 v4 v5 flag
1 20 . 30 35 15 1
2 10 80 . . 20 1
3 20 . 20 20 20 0
4 88 88 88 88 88 0
5 18 18 18 18 18 0
6 20 15 10 8 6 1
Better!
That worked! Could you please explain what the coding means?
Which one worked?
Sorry, the most recent one from Mr. Tabachneck.
largest(1,of v:) returns the largest value of any variable that begins with the letter v. If they are all missing it returns a missing value.
smallest(1,of v:) returns the smallest value of any variable that begins with the letter v. If they are all missing it returns a missing value.
ifn compares the results of the two functions. If both functions return the same value (regardless of whether the value is a number or a missing value), flag gets set to 0. If they return different values, flag gets set to 1.
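The posted code uses largest(5-nmiss(of v:),of v:), which picks the smallest non-missing value and so should give the same result as smallest(1,of v:). A small sketch of the comparison described above, with a made-up dataset name (check) and helper variables hi and lo:

```sas
data check;
   input v1-v5;
   hi   = largest(1, of v1-v5);    /* largest non-missing value */
   lo   = smallest(1, of v1-v5);   /* smallest non-missing value */
   flag = ifn(hi = lo, 0, 1);      /* 0 = no change, 1 = at least one value differs */
   datalines;
20 . 30 35 15
20 . 20 20 20
. . . . .
;
run;

proc print data=check;
run;
```

The third row is all missing, so both functions return a missing value and flag is 0. SAS may write a note to the log about missing values being generated for that row.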
Hi Art,
An English question: is "set" a verb or something else in your "If they return different values, flag gets set to 1."?
It is a verb in that sentence. It is more of a computer programming usage than normal English. A similar computer programming verb would be ASSIGN. The closest normal English usage for SET as a verb would be in something like: "I'm cold, could you please set the thermostat to a higher temperature?"