DATA Step, Macro, Functions and more

How to use date formats to reference columns in conditional statements?

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 13
Accepted Solution

How to use date formats to reference columns in conditional statements?

Hi Community,

 

All I really need is to be able to figure out how to use the SAS date format as a string to reference a column name. 

 

I have a SAS date column that that is in the monyy7. (ex. jan20015) format . For each ID, there may be multiple observations with the only difference being different month values for the SAS_Date column. My data also contains a column for every month that is named in the same format as the dates in the SAS_Date column. What I am trying to do is pull the label from the SAS date format and use that to check what the corresponding value is in the column with the same name. 

 

 Here is an example of what my data looks like

ID     Jan2015     Feb2015     Mar2015    Apr2015     May2015     SAS_Date

1            1                  1                 2               5                   .            Feb2015

1            1                  1                 2               5                   .            Mar2015

1            1                  1                 2               5                   .            Apr2015

2            .                   2                 1               .                   5            Mar2015

2            .                   2                 1               .                   5            May2015

 

What I would like to do with this is create one row for each ID. It would take the date from the SAS_Date column and would compare the value from the corresponding row to the previous column. If they are the same then put a blank in that column, if different keep the value of the current month unless the previous month is blank then also put a blank in that current column.

 

What I would like my data to look like is

ID     Jan2015     Feb2015     Mar2015    Apr2015     May2015     

1            .                  .                 2               5                   .           

2            .                  .                 1               .                    .

 

I have already created a previous month column with

format prev_month monyy7.;
prev_month = Intnx('MONTH',SAS_date,-1);

If I can use the label of the date variables I think I will be able to figure out how to do the rest by using IF statements, the concatenate function, and a keep statements and then joining the data. There may be a more efficient way, but this is a way that I figured out how to handle a similar problem in the past.

 

If there is anything that is unclear please let me know and I will try to clarify.

 

Thanks for the help,

Tom

 


Accepted Solutions
Solution
‎04-13-2018 09:53 AM
Super User
Super User
Posts: 7,845

Re: How to use date formats to reference columns in conditional statements?

It was picking the values from the previous month because that is what you said you wanted.

 

To get it to search for a date you could convert the date to a string using a format.

i=whichc(put(sasdate,monyy7.),of names(*));

Make sure to change the values used in the NAMES() array to match the values that MONYY. format generates.  

 

If you know that your variables represent contiguous months then you might be able to generate the index using INTCK() function instead and just eliminate the NAMES array.

 i=intck('month','01JAN2015'd,sas_date)+1;
 if 1 <= i <= dim(vals) then want=vals(i);

View solution in original post


All Replies
Super User
Posts: 13,008

Re: How to use date formats to reference columns in conditional statements?


@Tommy1 wrote:

Hi Community,

 

All I really need is to be able to figure out how to use the SAS date format as a string to reference a column name. 

 

I have a SAS date column that that is in the monyy7. (ex. jan20015) format . For each ID, there may be multiple observations with the only difference being different month values for the SAS_Date column. My data also contains a column for every month that is named in the same format as the dates in the SAS_Date column. What I am trying to do is pull the label from the SAS date format and use that to check what the corresponding value is in the column with the same name. 

 

 Here is an example of what my data looks like

ID     Jan2015     Feb2015     Mar2015    Apr2015     May2015     SAS_Date

1            1                  1                 2               5                   .            Feb2015

1            1                  1                 2               5                   .            Mar2015

1            1                  1                 2               5                   .            Apr2015

2            .                   2                 1               .                   5            Mar2015

2            .                   2                 1               .                   5            May2015

 

What I would like to do with this is create one row for each ID. It would take the date from the SAS_Date column and would compare the value from the corresponding row to the previous column. If they are the same then put a blank in that column, if different keep the value of the current month unless the previous month is blank then also put a blank in that current column.

 

What I would like my data to look like is

ID     Jan2015     Feb2015     Mar2015    Apr2015     May2015     

1            .                  .                 2               5                   .           

2            .                  .                 1               .                    .

 

I have already created a previous month column with

format prev_month monyy7.;
prev_month = Intnx('MONTH',SAS_date,-1);

If I can use the label of the date variables I think I will be able to figure out how to do the rest by using IF statements, the concatenate function, and a keep statements and then joining the data. There may be a more efficient way, but this is a way that I figured out how to handle a similar problem in the past.

 

If there is anything that is unclear please let me know and I will try to clarify.

 

Thanks for the help,

Tom

 


Why does your result have missing value for Jan2015 and Feb2015 on the ID 1 row? "Previous column" does NOT have an obvious meaning. Do mean the variable name that corresponds the previous month of the date stored in a given column name?

 

Since your data seems to have duplicate values except for the SAS_DATE variable value I might suspect an earlier step that created the duplicates and perhaps this could be resolve earlier.

Occasional Contributor
Posts: 13

Re: How to use date formats to reference columns in conditional statements?

@ballardw Yes you are correct, the previous month corresponds to the column to the left of the one that is being referenced in the SAS_date column. You are also correct that the duplication is a result of an inner join between two data tables using proc sql in a previous step. I could not think of a more efficient way to merge the data and still capture all the data that I need. Thank you for responding.

Contributor
Posts: 47

Re: How to use date formats to reference columns in conditional statements?

Hi

Here is another solution, that works on actual data without the need to specify month variables in the code.

 

In this version it expects the data to be as shown, i.e. the first variable is a numeric ID variable, the last a sas-date, and all between are month variables. If real data is not like that, the code should be modified. Also, if more than one sas-date in an ID-group resolves to the same month, the code should be modified to make a summarization, where the out-variable is set.

 


data have; 
	ID = 1; Jan2015 = 1; Feb2015 = 2; Mar2015 = 3; SAS_date = '15feb2015'd; output; 
	ID = 1; Jan2015 = 1; Feb2015 = 2; Mar2015 = 3; SAS_date = '15mar2015'd; output; 
	ID = 1; Jan2015 = 1; Feb2015 = 2; Mar2015 = 3; SAS_date = '15jan2015'd; output; 
	ID = 2; Jan2015 = 1; Feb2015 = 2; Mar2015 = 3; SAS_date = '15feb2015'd; output; 
	ID = 3; Jan2015 = 1; Feb2015 = 2; Mar2015 = 3; SAS_date = '15jan2015'd; output; 
	ID = 3; Jan2015 = 1; Feb2015 = 2; Mar2015 = 3; SAS_date = '15mar2015'd; output; 
run;

* Set macro variable with number of months;
data _null_; set have (obs=1);
	array varin _numeric_;
	call symputx('monthcnt',dim(varin)-2);
run;
%put &=monthcnt;

* Works only if input data set is sorted on ID and a given SAS-date 
	(expressed as month) occurs only once within each ID group;
data want (drop=i SAS_date q1-q%eval(&monthcnt)); 
	set have; by ID;
	keep ID _numeric_;
	array in _numeric_;
	chk = put(SAS_date,monyy7.);

	* Declare tamp-array with retained variables to hold values for all
		records in ID group;
	array out q1-q%eval(&monthcnt);
	retain q1-q%eval(&monthcnt);

	* Set all months to missing in temp-array at start of each ID group;
	if first.ID then do i = 1 to &monthcnt; out{i} = .; end;
		
	* Transfer value for actual month to temp-array;
	do i = 1 to &monthcnt;
		a = vname(in{i+1});
		if upcase(vname(in{i+1})) = chk then out{i} = in{i+1};
	end;

	* Transfer values from temp_array to original variables and 
		output at end of each ID group;
	if last.ID then do;
		do i = 1 to &monthcnt; in{i+1} = out{i}; end;
		output;
	end;
run;

 

PROC Star
Posts: 1,296

Re: How to use date formats to reference columns in conditional statements?

Please explain your output with the logic exclusively. Thank you!

Occasional Contributor
Posts: 13

Re: How to use date formats to reference columns in conditional statements?

Posted in reply to novinosrin

@novinosrin The logic for the data would be;

 

 

/*In this code the SAS_Date is referring to column that is referenced by the 
value in the SAS_Date column itself since the way to obtain the label
is what I am looking for in the first place*/ if SAS_Date=Prev_month then SAS_Date=. if SAS_Date^=Prev_month and Prev_month=. then SAS_Date=.; if SAS_Date^=Prev_month and Prev_month^=. then SAS_Date= Sas_Date.;
 An example of what I mean by this is,
ID Jan2015    Feb2015   SAS_Date  Prev_month
  1     1                2        Feb2015      Jan2015
 
then the if statement that would be used in this example would be ;
if feb2015^= jan2015 and jan2015^=. then feb2015=feb2015;
giving
ID Jan2015    Feb2015
 1     .                 2
Super User
Super User
Posts: 7,845

Re: How to use date formats to reference columns in conditional statements?

You could literally do what you want if you had a list of the variable names.  One version as a normal variable list and the other as a quoted string.  Note that this method has nothing to do with dates. Just picking the column before the one named in the input character variable.

data have ;
  input ID Jan2015 Feb2015 Mar2015 Apr2015 May2015 SAS_Date $ ;
cards;
1 1 1 2 5 . Feb2015
1 1 1 2 5 . Mar2015
1 1 1 2 5 . Apr2015
2 . 2 1 . 5 Mar2015
2 . 2 1 . 5 May2015
;

data tall ;
 set have ;
 array vals Jan2015 Feb2015 Mar2015 Apr2015 May2015 ;
 array names (5) $7 _temporary_ ('Jan2015' 'Feb2015' 'Mar2015' 'Apr2015' 'May2015' )  ;
 i=whichc(sas_date,of names(*));
 if i>0 then want=vals(i-1);
run;
proc transpose data=tall out=want(drop=_name_);
  by id ;
  id sas_date ;
  var want ;
run;
Occasional Contributor
Posts: 13

Re: How to use date formats to reference columns in conditional statements?

@Tom Thanks for the reply and for trying to help, this is very close to what I need. I have never used the whichc function before. When I run your code in your example I get

ID  Feb2015   Mar2015   Apr2015  May2015

1           1              1              2              .

2           .                2              .             .

 

This didn't really make sense to me because it is showing the value of the of the previous month when I need the value of the current month.

 

I changed the code so that it was 

if i>0 then want=vals(i);

This seemed to work much better and corresponded to the to the values in the SAS_date column. 

 

ID  Feb2015   Mar2015   Apr2015  May2015

1           1              2              5              .

2           .                2              .             5

Now the only thing I am still confused about this is that my last column is in fact a date and not a character value. How do I change it so that I would be able to work with a date column? Thanks so much for your help.

 

Solution
‎04-13-2018 09:53 AM
Super User
Super User
Posts: 7,845

Re: How to use date formats to reference columns in conditional statements?

It was picking the values from the previous month because that is what you said you wanted.

 

To get it to search for a date you could convert the date to a string using a format.

i=whichc(put(sasdate,monyy7.),of names(*));

Make sure to change the values used in the NAMES() array to match the values that MONYY. format generates.  

 

If you know that your variables represent contiguous months then you might be able to generate the index using INTCK() function instead and just eliminate the NAMES array.

 i=intck('month','01JAN2015'd,sas_date)+1;
 if 1 <= i <= dim(vals) then want=vals(i);
Occasional Contributor
Posts: 13

Re: How to use date formats to reference columns in conditional statements?

@ErikLund_Jensen Thanks for the reply and for trying to help me out. Unfortunately, I was not able to figure out how generalize your code to my data. I spent a while very confused with the Have data set because it wasn't the example that I provided, so I thought it was doing something different. 

 

 @Tom Thanks for the help, I was able to get your original code to work once I added in the SAS dates!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 142 views
  • 1 like
  • 5 in conversation