The names are not random. They are derived based on a formula. The list will be Fall 2008 - Fall 2020 after being derived.
@DavidPhillips2 wrote:
The names are not random. They are derived based on a formula. The list will be Fall 2008 - Fall 2020 after being derived.
In that case do NOT use dynamic names. Fix the names and use an array. Then you can use a calcalated YEAR number to index into the array and decide which variable to use to store the value (or which value to read).
So something like:
data want;
set have;
array fall [2008:2020] fall2008-fall2020;
year = ** formula1 **;
fall[year] = ** formula 2 ** ;
run;
I follow the logic in general. I'm trying to follow the details. The idea is to populate the fields of the Fall 2008- Fall2020 after identifying which field the measure relates to.
proc sql;
create table temp (time_variable varchar(50),
acaedemic_period_base varchar(6),
want varchar (50),
measure1 num(1)
);
quit;
proc sql;insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year1', '200810', 'Fall _2008 =1', 1);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year2', '200710', 'Fall _2008 =1', 1);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year3', '200610', 'Fall _2008 =1', 1);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year4', '200510', 'Fall _2008 =1', 1);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year5', '200410', 'Fall _2008 =1', 1);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year6', '200310', 'Fall _2008 =1', 0);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year1', '200910', 'Fall_2009 = 1', 1);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year2', '200810', 'Fall_2009 = 1', 1);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year3', '200710', 'Fall_2009 = 1', 1);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year4', '200610', 'Fall_2009 = 1', 0);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year5', '200510', 'Fall_2009 = 1', 0);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year6', '200410', 'Fall_2009 = 1', 0);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year1', '201010', 'Fall_2010 = 1', 1);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year2', '200910', 'Fall_2010 = 1', 0);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year3', '200810', 'Fall_2010 = 1', 0);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year4', '200710', 'Fall_2010 = 1', 0);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year5', '200610', 'Fall_2010 = 1', 0);
insert into temp(time_variable, acaedemic_period_base, want, measure1) values ('Year6', '200510', 'Fall_2010 = 1', 0);
quit;
data want;
set temp;
array fall [2008:2020] Fall_2008-Fall_2020;
/*find the year referenced*/
year = catx(' ', 'Fall', sum(input(substr(acaedemic_period_base, 1, 4), 4.), 0));
/*record the 1 or 0 of the referenced year to the Fall_2008-Fall_2020 array slot*/
run;
Thanks for posting data. Note here is a much easier way to create such a dataset that is also easier to review as text.
data temp;
infile cards dsd truncover ;
input time_variable :$50. acaedemic_period_base :$6. want :$50. measure1 ;
cards;
Year1,200810,Fall _2008 =1,1
Year2,200710,Fall _2008 =1,1
Year3,200610,Fall _2008 =1,1
Year4,200510,Fall _2008 =1,1
Year5,200410,Fall _2008 =1,1
Year6,200310,Fall _2008 =1,0
Year1,200910,Fall_2009 = 1,1
Year2,200810,Fall_2009 = 1,1
Year3,200710,Fall_2009 = 1,1
Year4,200610,Fall_2009 = 1,0
Year5,200510,Fall_2009 = 1,0
Year6,200410,Fall_2009 = 1,0
Year1,201010,Fall_2010 = 1,1
Year2,200910,Fall_2010 = 1,0
Year3,200810,Fall_2010 = 1,0
Year4,200710,Fall_2010 = 1,0
Year5,200610,Fall_2010 = 1,0
Year6,200510,Fall_2010 = 1,0
;
So you can use SCAN() to pull out "words" from your WANT string and use INPUT() to convert those strings into numbers.
data want;
set temp;
year = input(scan(want,-2,'_='),32.);
array fall [2008:2010] Fall_2008-Fall_2010;
if 2008 <= year <= 2010 then fall[year]=input(scan(want,-1,'='),32.);
run;
proc print;
run;
acaedemic_ time_ period_ Fall_ Fall_ Fall_ Obs variable base want measure1 year 2008 2009 2010 1 Year1 200810 Fall _2008 =1 1 2008 1 . . 2 Year2 200710 Fall _2008 =1 1 2008 1 . . 3 Year3 200610 Fall _2008 =1 1 2008 1 . . 4 Year4 200510 Fall _2008 =1 1 2008 1 . . 5 Year5 200410 Fall _2008 =1 1 2008 1 . . 6 Year6 200310 Fall _2008 =1 0 2008 1 . . 7 Year1 200910 Fall_2009 = 1 1 2009 . 1 . 8 Year2 200810 Fall_2009 = 1 1 2009 . 1 . 9 Year3 200710 Fall_2009 = 1 1 2009 . 1 . 10 Year4 200610 Fall_2009 = 1 0 2009 . 1 . 11 Year5 200510 Fall_2009 = 1 0 2009 . 1 . 12 Year6 200410 Fall_2009 = 1 0 2009 . 1 . 13 Year1 201010 Fall_2010 = 1 1 2010 . . 1 14 Year2 200910 Fall_2010 = 1 0 2010 . . 1 15 Year3 200810 Fall_2010 = 1 0 2010 . . 1 16 Year4 200710 Fall_2010 = 1 0 2010 . . 1 17 Year5 200610 Fall_2010 = 1 0 2010 . . 1 18 Year6 200510 Fall_2010 = 1 0 2010 . . 1
But that seems really silly. Don't you just want to pull the year from the academic variable instead?
If so then how do you get from academic year of "200710" or "200610" to FALL_2010?
Perhaps by adding the digit in the TIME variable?
data want;
set temp;
base_year=input(acaedemic_period_base,4.);
offset=input(substr(time_variable,5),32.);
array fall [2008:2010] Fall_2008-Fall_2010;
fall[base_year+offset-1]=1;
run;
To make it truly dynamic use PROC TRANSPOSE.
data step1;
set temp;
calendar_year=input(acaedemic_period_base,4.)+input(substr(time_variable,5),32.)-1;
run;
proc transpose data=step1 prefix=FALL_ out=want(drop=_name_);
by time_variable acaedemic_period_base notsorted;
id calendar_year;
var measure1;
run;
acaedemic_ time_ period_ FALL_ FALL_ FALL_ Obs variable base 2008 2009 2010 1 Year1 200810 1 . . 2 Year2 200710 1 . . 3 Year3 200610 1 . . 4 Year4 200510 1 . . 5 Year5 200410 1 . . 6 Year6 200310 0 . . 7 Year1 200910 . 1 . 8 Year2 200810 . 1 . 9 Year3 200710 . 1 . 10 Year4 200610 . 0 . 11 Year5 200510 . 0 . 12 Year6 200410 . 0 . 13 Year1 201010 . . 1 14 Year2 200910 . . 0 15 Year3 200810 . . 0 16 Year4 200710 . . 0 17 Year5 200610 . . 0 18 Year6 200510 . . 0
Tom,
Thanks, for posting this. I was having a hard time solving this one.
I posted a simplification of what I am looking for. I have a semi-wide table that I am trying to convert to a wide table. Academic_period_base is the starting year. Year1 is one year from the base year. Year2 is two years from the base year. If the student enrolled then 1st year then year1 is 1. If the student enrolled the second year year2 is 1. The formula below adds the 1 or 2 from the year fields to academic_period_base and creates calendar_year1 and calendar_year2. The goal is to create an output that contains variables 2005-2008 that correlate to the values year 1 and year2 in a wide format. The actual wide table will have about 250 wide fields.
proc sql;
create table temp2
(
acaedemic_period_base varchar(6),
year1 num(1),
year2 num(1)
);
quit;
proc sql;
insert into temp2(acaedemic_period_base, Year1, Year2) values ('200810', 1, 1);
insert into temp2(acaedemic_period_base, Year1, Year2) values ('200710', 1, 1);
insert into temp2(acaedemic_period_base, Year1, Year2) values ('200810', 0, 0);
insert into temp2(acaedemic_period_base, Year1, Year2) values ('200710', 1, 0);
insert into temp2(acaedemic_period_base, Year1, Year2) values ('200810', 1, 1);
insert into temp2(acaedemic_period_base, Year1, Year2) values ('200710', 1, 1);
quit;
data step1;
set temp2;
calendar_year1=input(acaedemic_period_base,4.)+input(substr(year1,5),32.)-1;
calendar_year2=input(acaedemic_period_base,4.)+input(substr(year2,5),32.)-2;
run;
This will work for the full solution.
proc sql;
create table temp2
(
acaedemic_period_base varchar(6),
year1 num(1),
year2 num(1)
);
quit;
proc sql;
insert into temp2(acaedemic_period_base, Year1, Year2) values ('200810', 1, 1);
insert into temp2(acaedemic_period_base, Year1, Year2) values ('200710', 1, 1);
insert into temp2(acaedemic_period_base, Year1, Year2) values ('200810', 0, 0);
insert into temp2(acaedemic_period_base, Year1, Year2) values ('200710', 1, 0);
insert into temp2(acaedemic_period_base, Year1, Year2) values ('200810', 1, 1);
insert into temp2(acaedemic_period_base, Year1, Year2) values ('200710', 1, 1);
quit;
data want;
set temp2;
base_year=input(acaedemic_period_base,4.)-1;
array fall [2007:2010] Fall_2007-Fall_2010;
fall[base_year+1]=year1;
fall[base_year+2]=year2;
if Fall_2007 = . then Fall_2008 =0;
if Fall_2008 = . then Fall_2008 =0;
if Fall_2009 = . then Fall_2009 =0;
if Fall_2010 = . then Fall_2010 =0;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.