BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
DavidPhillips2
Rhodochrosite | Level 12

Is there a way to automatically do the below;

if Year1 = 1 and academic_period = '200810' then Fall _2008 =1; Year1 1 200810 Fall _2008
if Year2 = 1 and academic_period = '200710' then Fall _2008 =1; Year2 1 200710 Fall _2008
if Year3 = 1 and academic_period = '200610' then Fall _2008 =1; Year3 1 200610 Fall _2008
if Year4 = 1 and academic_period = '200510' then Fall _2008 =1; Year4 1 200510 Fall _2008
if Year5 = 1 and academic_period = '200410' then Fall _2008 =1; Year5 1 200410 Fall _2008
if Year6 = 1 and academic_period = '200310' then Fall _2008 =1; Year6 1 200310 Fall _2008
if Year1 = 1 and academic_period = '200910' then Fall_2009 =1; Year1 1 200910 Fall_2009
if Year2 = 1 and academic_period = '200810' then Fall_2009 =1; Year2 1 200810 Fall_2009
if Year3 = 1 and academic_period = '200710' then Fall_2009 =1; Year3 1 200710 Fall_2009
if Year4 = 1 and academic_period = '200610' then Fall_2009 =1; Year4 1 200610 Fall_2009
if Year5 = 1 and academic_period = '200510' then Fall_2009 =1; Year5 1 200510 Fall_2009
if Year6 = 1 and academic_period = '200410' then Fall_2009 =1; Year6 1 200410 Fall_2009
if Year1 = 1 and academic_period = '201010' then Fall_2010 =1; Year1 1 201010 Fall_2010
if Year2 = 1 and academic_period = '200910' then Fall_2010 =1; Year2 1 200910 Fall_2010
if Year3 = 1 and academic_period = '200810' then Fall_2010 =1; Year3 1 200810 Fall_2010
if Year4 = 1 and academic_period = '200710' then Fall_2010 =1; Year4 1 200710 Fall_2010
if Year5 = 1 and academic_period = '200610' then Fall_2010 =1; Year5 1 200610 Fall_2010
if Year6 = 1 and academic_period = '200510' then Fall_2010 =1; Year6 1 200510 Fall_2010

I'm trying to dynamically create fields based on a formula.  The probability of human error in the future is too high if I just use one SAS line per statement here.

 

The part that I want is the dynamic  Fall_XXXX = 1 part based on a formula.

 

I experimented with *catx(' ', 'Fall', sum(input(substr(academic_period, 1, 4), 4.), 0)) = 1;  But I'm missing the syntax to make this line work.

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

To make it truly dynamic use PROC TRANSPOSE.

data step1;
  set temp;
  calendar_year=input(acaedemic_period_base,4.)+input(substr(time_variable,5),32.)-1;
run;

proc transpose data=step1 prefix=FALL_ out=want(drop=_name_);
  by time_variable acaedemic_period_base notsorted;
  id calendar_year;
  var measure1;
run;
                   acaedemic_
        time_       period_      FALL_    FALL_    FALL_
Obs    variable       base        2008     2009     2010

  1     Year1        200810        1        .        .
  2     Year2        200710        1        .        .
  3     Year3        200610        1        .        .
  4     Year4        200510        1        .        .
  5     Year5        200410        1        .        .
  6     Year6        200310        0        .        .
  7     Year1        200910        .        1        .
  8     Year2        200810        .        1        .
  9     Year3        200710        .        1        .
 10     Year4        200610        .        0        .
 11     Year5        200510        .        0        .
 12     Year6        200410        .        0        .
 13     Year1        201010        .        .        1
 14     Year2        200910        .        .        0
 15     Year3        200810        .        .        0
 16     Year4        200710        .        .        0
 17     Year5        200610        .        .        0
 18     Year6        200510        .        .        0

View solution in original post

22 REPLIES 22
ballardw
Super User

I am not sure exactly what your STARTING data looks like.

 

I am very surprised that "academic_period" values from what appear to be 6 calendar year like values all get assigned the same Fall_XXXX value. Are you sure there are no typos in there?

 

What was this supposed to do?

sum(input(substr(academic_period, 1, 4), 4.), 0))

You are summing some value with zero. So why bother?

 

It may help to tell why you have "year1" etc to begin with and what the underlying rules are. This is one case where a working example may not actually provide sufficient information for a general algorithm such as what to do "next year" when Year7=1 and some academic period.

 

I suspect that there may be a possible solution with Intnx, your academic period and something extracted from the Year value but I am not going to expend much effort trying to derive obscure rules from an example.

 

I believe we have indicated a preference for starting data as data step code to give us something to work with before this.

DavidPhillips2
Rhodochrosite | Level 12

proc sql;
	create table temp (time_variable varchar(50),
	acaedemic_period_base varchar(6),
	want varchar (50)
);
quit;
proc sql;insert into temp(time_variable, acaedemic_period_base, want) values ('Year1', '200810', 'Fall _2008 =1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year2', '200710', 'Fall _2008 =1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year3', '200610', 'Fall _2008 =1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year4', '200510', 'Fall _2008 =1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year5', '200410', 'Fall _2008 =1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year6', '200310', 'Fall _2008 =1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year1', '200910', 'Fall_2009 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year2', '200810', 'Fall_2009 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year3', '200710', 'Fall_2009 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year4', '200610', 'Fall_2009 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year5', '200510', 'Fall_2009 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year6', '200410', 'Fall_2009 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year1', '201010', 'Fall_2010 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year2', '200910', 'Fall_2010 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year3', '200810', 'Fall_2010 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year4', '200710', 'Fall_2010 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year5', '200610', 'Fall_2010 = 1');
insert into temp(time_variable, acaedemic_period_base, want) values ('Year6', '200510', 'Fall_2010 = 1');

quit;
DavidPhillips2
Rhodochrosite | Level 12

The formula I pasted was to dynamically create fields like Fall 2010 and set the value equal to 1 based on a formula;  I'm trying to figure out how to use any formula to create a dynamically named field.  The fields will be used in a wide table.

Tom
Super User Tom
Super User

You need to describe better what you are trying to do.  If you want to dynamically name a variable you have two choices.  Use PROC TABULATE with its ID statement to tell it which variable to use as the name.  Or use some code generation.

 

But perhaps you just want to keep the data in a tall skinny structure instead of creating new variables.  You could always create a report that converts it to a more rectangular layoff for viewing.

 

DavidPhillips2
Rhodochrosite | Level 12

I'm trying to dynamically create a wide table off of a formula that defines many variables.

Tom
Super User Tom
Super User

@DavidPhillips2 wrote:

I'm trying to dynamically create a wide table off of a formula that defines many variables.


PLEASE PROVIDE EXAMPLE DATA.

Show what SAS code you want to create from that data.

If you want to generate code from DATA the easiest way is to use the PUT statement in a data step to write the code to a file. You can then use %INCLUDE to execute the generated code.

 

Since you haven't really provided any workable example data or code let's just make a test case.

Say you have a dataset name META with two variables:  VAR and CODE.  The first has the name to use for a new variable and the second has the expression to use to assign a value to that new variable.  Then you can generate a series of assignment statement to create the new variables.

filename code temp;
data _null_;
  set meta;
   file print;
   put var '=(' code ');' ;
run;

Now if have another dataset named HAVE that contains the actual data (the variables that are referenced by the CODE values) then you can use the generated code to create new variables by including it into a data step.

data want;
  set have;
%include code / source2;
run;

 

ChrisNZ
Tourmaline | Level 20

> You are summing some value with zero. So why bother?

This prevents returning missing values. Whether that's a good thing here remains to be seen. 

ChrisNZ
Tourmaline | Level 20

> I experimented with *catx(' ', 'Fall', sum(input(substr(academic_period, 1, 4), 4.), 0)) = 1;  But I'm missing the syntax to make this line work.

 

This line will not work. Variable names are all defined when the data step is compiled. You can't have functions to the left of the = sign (exceptions apply).

 

Your choices are:

- A macro that writes the code creating the variables you want. This code is then compiled as part of the data step.

- Creating variables that contain the values you need and then transposing the table to convert these values into variable names. 

 

DavidPhillips2
Rhodochrosite | Level 12

Do you mean I need to create intermediary variables and then save them to the final variable.  I'm trying to save the result of the formula string to a data step variable.  The logic will create 200 variables for a wide table.

ChrisNZ
Tourmaline | Level 20

I am saying all the variable names must be known when the data step compiles, so before it runs. You can't add variables after compilation.

DavidPhillips2
Rhodochrosite | Level 12

Can I define them with a length statement and then populate them dynamically like with the catx statement?

ChrisNZ
Tourmaline | Level 20

If you want to go this route (rather than transposing) then you can define the variables in an array statement, and then refer to either the array index (position) or the variable name associated with that index. Like this:

data t;
  array A[3] VAR1 VAR2 VAR3;
  A[1]=1;
  VAR2=2;
run;

 

DavidPhillips2
Rhodochrosite | Level 12

I'm trying to identify the name of the variable backward, the array method predefines the name.  Since I'm working backward it might make sense to transpose.  I've used transpose before, I'll look into options with it tomorrow.

ChrisNZ
Tourmaline | Level 20

If you don't know the variable names in advance, then transposing is probably better. The other option would be to build the names in a macro and then use the macro to define the array. That's more convoluted.

Having said that, if the names are random, how can you build a program that uses them?

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 22 replies
  • 1791 views
  • 2 likes
  • 4 in conversation