I receive the error "variable ... in list does not match type prescribed for this list" when using PROC EXPAND to get leads and lags of a variable.
Is this because the variable is of character type? That seems counterintuitive if so. My code is:
proc expand data=input out=output method=none;
by fund;
id date;
convert name = name_lag1 / transformout=(lag 1);
convert name = name_lag2 / transformout=(lag 2);
convert name = name_lag3 / transformout=(lag 3);
convert name = name_lag4 / transformout=(lag 4);
convert name = name_lag5 / transformout=(lag 5);
convert name = name_lead1 / transformout=(lead 1);
convert name = name_lead2 / transformout=(lead 2);
convert name = name_lead3 / transformout=(lead 3);
convert name = name_lead4 / transformout=(lead 4);
convert name = name_lead5 / transformout=(lead 5);
quit;
Also, how can I get the LEAD values without having to sort the data in descending order and run a DATA step?
Regarding your second question: do you want a way to do this with a DATA step or without one?
You can also do something like this. I just made up some data.
data have;
array names {19} $ 10 _temporary_ ('Alice', 'Barbara', 'Carol', 'Jane', 'Janet', 'Joyce', 'Judy', 'Louise', 'Mary', 'Alfred',
'Henry', 'James', 'Jeffrey', 'John', 'Philip', 'Robert', 'Ronald', 'Thomas', 'William');
do id = 1 to 3;
do date = '01jun2020'd to '10jun2020'd;
name = names[ceil(rand("Uniform")*19)];
output;
end;
end;
format date ddmmyy10.;
run;
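data want(drop=rc seq m i lname);
   length lname $ 10;
   /* Load every row into a hash keyed by id and a running row number */
   if _N_=1 then do;
      declare hash h ();
      h.definekey('id', 'seq');
      h.definedata('lname');
      h.definedone();
      do seq = 1 by 1 until (lr1);
         set have end=lr1;
         h.replace(key : id, key : seq, data : name);
      end;
   end;
   /* l[-k] holds the k-th lag and l[k] the k-th lead; m is a placeholder for offset 0 */
   array l {-3 : 3} $ 10 lag3 lag2 lag1 m lead1 lead2 lead3;
   do seq = 1 by 1 until (lr2);
      set have end=lr2;
      do i = -3 to 3;
         call missing(lname);
         /* A failed lookup (before the start or past the end of an id group) leaves lname blank */
         rc = h.find (key : id, key : seq + i);
         l [i] = lname;
      end;
      output;
   end;
run;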
Regarding your first question, the CONVERT statement documentation is pretty clear:
"The CONVERT statement lists the variables to be processed. Only numeric variables can be processed."
So yes, the error occurs because name is a character variable.
You do not have to sort the data twice. You only need to sort it once, using fund and date as the sort fields, and then use arrays in a DATA step.
proc sort data=inpdata; by fund date;
run;
If you have 3 years of data you will have at most 1096 dates (3 x 365 plus one leap day), so define an array of size 1096. This is overkill, but it helps.
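data out_data;
   /* Declare the array elements as character before SET */
   length _name1-_name1096 $32;
   set inpdata;
   by fund;
   array _names{1096} _name1-_name1096;
   retain _name1-_name1096 '';
   /* Restart the counter for each fund; the sum statement retains count */
   if first.fund then count=0;
   count+1;
   _names[count]=name;
   if last.fund then do;
      /* look forward and backward in the array depending on the number of lags and leads */
      /* explicit OUTPUT statements go here */
   end;
run;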
The code becomes hairier as the number of variables you have to maintain for output increases. Alternatively, keep only the variables that you need to process and leave the others out; they can be joined back later.
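For illustration, here is a minimal sketch of the array approach described above, filled in for two lags and two leads. It is not the exact code from this reply: it swaps the retained _name1-_name1096 variables for _temporary_ arrays and adds a second buffer for date. The sample inpdata rows and the names _dates, _count, _i and _k are assumptions made up for the example. It also assumes that no fund has more than 1096 rows.

```sas
/* Hypothetical sample data standing in for inpdata */
data inpdata;
   length fund $ 8 name $ 32;
   format date date9.;
   input fund $ date :date9. name $;
   datalines;
A 01JUN2020 Alice
A 02JUN2020 Barbara
A 03JUN2020 Carol
A 04JUN2020 Jane
B 01JUN2020 Henry
B 02JUN2020 James
;
run;

/* One ascending sort is enough for both lags and leads */
proc sort data=inpdata;
   by fund date;
run;

data out_data(drop=_i _k _count);
   set inpdata;
   by fund;

   /* Buffers for one fund; _temporary_ arrays are retained automatically */
   array _names{1096} $ 32 _temporary_;
   array _dates{1096} _temporary_;

   /* Output variables: two lags and two leads, to keep the sketch short */
   array lags{2}  $ 32 name_lag1-name_lag2;
   array leads{2} $ 32 name_lead1-name_lead2;

   /* Buffer the current row; the sum statement retains _count */
   if first.fund then _count = 0;
   _count + 1;
   _names[_count] = name;
   _dates[_count] = date;

   /* Once the whole fund is buffered, replay it row by row */
   if last.fund then do _i = 1 to _count;
      name = _names[_i];
      date = _dates[_i];
      do _k = 1 to dim(lags);
         /* Leave the value blank when the offset falls outside the fund */
         if _i - _k >= 1 then lags[_k] = _names[_i - _k];
         else lags[_k] = ' ';
         if _i + _k <= _count then leads[_k] = _names[_i + _k];
         else leads[_k] = ' ';
      end;
      output;
   end;
run;
```

Every additional variable that has to survive into the output needs its own buffer array, like _dates here, which is why keeping only the key variables and joining the rest back later may be simpler.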