Is there a way to automatically do the below;
if Year1 = 1 and academic_period = '200810' then Fall _2008 =1; | Year1 | 1 | 200810 | Fall _2008 |
if Year2 = 1 and academic_period = '200710' then Fall _2008 =1; | Year2 | 1 | 200710 | Fall _2008 |
if Year3 = 1 and academic_period = '200610' then Fall _2008 =1; | Year3 | 1 | 200610 | Fall _2008 |
if Year4 = 1 and academic_period = '200510' then Fall _2008 =1; | Year4 | 1 | 200510 | Fall _2008 |
if Year5 = 1 and academic_period = '200410' then Fall _2008 =1; | Year5 | 1 | 200410 | Fall _2008 |
if Year6 = 1 and academic_period = '200310' then Fall _2008 =1; | Year6 | 1 | 200310 | Fall _2008 |
if Year1 = 1 and academic_period = '200910' then Fall_2009 =1; | Year1 | 1 | 200910 | Fall_2009 |
if Year2 = 1 and academic_period = '200810' then Fall_2009 =1; | Year2 | 1 | 200810 | Fall_2009 |
if Year3 = 1 and academic_period = '200710' then Fall_2009 =1; | Year3 | 1 | 200710 | Fall_2009 |
if Year4 = 1 and academic_period = '200610' then Fall_2009 =1; | Year4 | 1 | 200610 | Fall_2009 |
if Year5 = 1 and academic_period = '200510' then Fall_2009 =1; | Year5 | 1 | 200510 | Fall_2009 |
if Year6 = 1 and academic_period = '200410' then Fall_2009 =1; | Year6 | 1 | 200410 | Fall_2009 |
if Year1 = 1 and academic_period = '201010' then Fall_2010 =1; | Year1 | 1 | 201010 | Fall_2010 |
if Year2 = 1 and academic_period = '200910' then Fall_2010 =1; | Year2 | 1 | 200910 | Fall_2010 |
if Year3 = 1 and academic_period = '200810' then Fall_2010 =1; | Year3 | 1 | 200810 | Fall_2010 |
if Year4 = 1 and academic_period = '200710' then Fall_2010 =1; | Year4 | 1 | 200710 | Fall_2010 |
if Year5 = 1 and academic_period = '200610' then Fall_2010 =1; | Year5 | 1 | 200610 | Fall_2010 |
if Year6 = 1 and academic_period = '200510' then Fall_2010 =1; | Year6 | 1 | 200510 | Fall_2010 |
I'm trying to dynamically create fields based on a formula. The probability of human error in the future is too high if I just use one SAS line per statement here.
The part that I want is the dynamic Fall_XXXX = 1 part based on a formula.
I experimented with *catx(' ', 'Fall', sum(input(substr(academic_period, 1, 4), 4.), 0)) = 1; But I'm missing the syntax to make this line work.
To make it truly dynamic use PROC TRANSPOSE.
data step1;
set temp;
calendar_year=input(acaedemic_period_base,4.)+input(substr(time_variable,5),32.)-1;
run;
proc transpose data=step1 prefix=FALL_ out=want(drop=_name_);
by time_variable acaedemic_period_base notsorted;
id calendar_year;
var measure1;
run;
acaedemic_ time_ period_ FALL_ FALL_ FALL_ Obs variable base 2008 2009 2010 1 Year1 200810 1 . . 2 Year2 200710 1 . . 3 Year3 200610 1 . . 4 Year4 200510 1 . . 5 Year5 200410 1 . . 6 Year6 200310 0 . . 7 Year1 200910 . 1 . 8 Year2 200810 . 1 . 9 Year3 200710 . 1 . 10 Year4 200610 . 0 . 11 Year5 200510 . 0 . 12 Year6 200410 . 0 . 13 Year1 201010 . . 1 14 Year2 200910 . . 0 15 Year3 200810 . . 0 16 Year4 200710 . . 0 17 Year5 200610 . . 0 18 Year6 200510 . . 0
I am not sure exactly what your STARTING data looks like.
I am very surprised that "academic_period" values from what appear to be 6 calendar year like values all get assigned the same Fall_XXXX value. Are you sure there are no typos in there?
What was this supposed to do?
sum(input(substr(academic_period, 1, 4), 4.), 0))
You are summing some value with zero. So why bother?
It may help to tell why you have "year1" etc to begin with and what the underlying rules are. This is one case where a working example may not actually provide sufficient information for a general algorithm such as what to do "next year" when Year7=1 and some academic period.
I suspect that there may be a possible solution with Intnx, your academic period and something extracted from the Year value but I am not going to expend much effort trying to derive obscure rules from an example.
I believe we have indicated a preference for starting data as data step code to give us something to work with before this.
proc sql;
create table temp (time_variable varchar(50),
acaedemic_period_base varchar(6),
want varchar (50)
);
quit;
proc sql;insert into temp(time_variable, acaedemic_period_base, want) values ('Year1', '200810', 'Fall _2008 =1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year2', '200710', 'Fall _2008 =1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year3', '200610', 'Fall _2008 =1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year4', '200510', 'Fall _2008 =1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year5', '200410', 'Fall _2008 =1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year6', '200310', 'Fall _2008 =1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year1', '200910', 'Fall_2009 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year2', '200810', 'Fall_2009 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year3', '200710', 'Fall_2009 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year4', '200610', 'Fall_2009 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year5', '200510', 'Fall_2009 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year6', '200410', 'Fall_2009 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year1', '201010', 'Fall_2010 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year2', '200910', 'Fall_2010 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year3', '200810', 'Fall_2010 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year4', '200710', 'Fall_2010 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year5', '200610', 'Fall_2010 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year6', '200510', 'Fall_2010 = 1');
quit;
The formula I pasted was to dynamically create fields like Fall 2010 and set the value equal to 1 based on a formula; I'm trying to figure out how to use any formula to create a dynamically named field. The fields will be used in a wide table.
You need to describe better what you are trying to do. If you want to dynamically name a variable you have two choices. Use PROC TABULATE with its ID statement to tell it which variable to use as the name. Or use some code generation.
But perhaps you just want to keep the data in a tall skinny structure instead of creating new variables. You could always create a report that converts it to a more rectangular layoff for viewing.
I'm trying to dynamically create a wide table off of a formula that defines many variables.
@DavidPhillips2 wrote:
I'm trying to dynamically create a wide table off of a formula that defines many variables.
PLEASE PROVIDE EXAMPLE DATA.
Show what SAS code you want to create from that data.
If you want to generate code from DATA the easiest way is to use the PUT statement in a data step to write the code to a file. You can then use %INCLUDE to execute the generated code.
Since you haven't really provided any workable example data or code let's just make a test case.
Say you have a dataset name META with two variables: VAR and CODE. The first has the name to use for a new variable and the second has the expression to use to assign a value to that new variable. Then you can generate a series of assignment statement to create the new variables.
filename code temp;
data _null_;
set meta;
file print;
put var '=(' code ');' ;
run;
Now if have another dataset named HAVE that contains the actual data (the variables that are referenced by the CODE values) then you can use the generated code to create new variables by including it into a data step.
data want;
set have;
%include code / source2;
run;
> You are summing some value with zero. So why bother?
This prevents returning missing values. Whether that's a good thing here remains to be seen.
> I experimented with *catx(' ', 'Fall', sum(input(substr(academic_period, 1, 4), 4.), 0)) = 1; But I'm missing the syntax to make this line work.
This line will not work. Variable names are all defined when the data step is compiled. You can't have functions to the left of the = sign (exceptions apply).
Your choices are:
- A macro that writes the code creating the variables you want. This code is then compiled as part of the data step.
- Creating variables that contain the values you need and then transposing the table to convert these values into variable names.
Do you mean I need to create intermediary variables and then save them to the final variable. I'm trying to save the result of the formula string to a data step variable. The logic will create 200 variables for a wide table.
I am saying all the variable names must be known when the data step compiles, so before it runs. You can't add variables after compilation.
Can I define them with a length statement and then populate them dynamically like with the catx statement?
If you want to go this route (rather than transposing) then you can define the variables in an array statement, and then refer to either the array index (position) or the variable name associated with that index. Like this:
data t;
array A[3] VAR1 VAR2 VAR3;
A[1]=1;
VAR2=2;
run;
I'm trying to identify the name of the variable backward, the array method predefines the name. Since I'm working backward it might make sense to transpose. I've used transpose before, I'll look into options with it tomorrow.
If you don't know the variable names in advance, then transposing is probably better. The other option would be to build the names in a macro and then use the macro to define the array. That's more convoluted.
Having said that, if the names are random, how can you build a program that uses them?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.