The substr function was used to extract the first three letters of the season name.
Actualid does indeed seem to be redundant with the Actual variable. If your dataset is large,
this can mean storing a lot of unnecessary information.
To save space, it may be better to store your season codes and use formats
to get the names whenever needed. Here is an example using the character codes '1' and '2' for Summer and Winter.
proc format;
invalue $season 'Summer'='1'
'Winter'='2'
other=' '; /* character informat: map anything else to a blank */
value $seas_name '1'="Summer"
'2'="Winter";
value $seas_short '1'="Sum"
'2'="Win";
run;
data have;
infile datalines dlm=',';
format Subject $3. Projectid Actual $1.;
informat Actual $season.;
input Subject Projectid Actual;
/* Summer and Winter will be replaced by their codes in the resulting dataset */
datalines;
001,1,Summer
002,2,Winter
003,1,Winter
004,1,.
;
run;
data want;
set have;
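/* Actual is character, so test for a blank (or a literal '.') instead of numeric missing */
if missing(Actual) or Actual='.' then Actual=Projectid;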
run;
proc print data=want;
format Projectid Actual $seas_name.;
run;
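If the three-letter abbreviation is needed as a real variable rather than only for display, the $seas_short format defined above can probably replace the substr call. The following is just a sketch: the names want_short and Actual_short are made up for illustration and are not in your original code.
data want_short;
set want;
length Actual_short $3;
/* put() applies the format to the stored code: '1' -> 'Sum', '2' -> 'Win' */
Actual_short = put(Actual, $seas_short.);
run;
proc print data=want_short;
format Projectid Actual $seas_name.;
run;
This way the dataset keeps only the one-character codes, and both the long and the short names are derived from them on demand.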