BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AntonyBlunt
Calcite | Level 5

Hi;

 

I've been asked to write some code which will deal with date variables and date ranges within a dataset (in SAS 9,3.)

 

The specification details that the code should be able to handle an unknown number of date variables (in ddmmyy10. format) , with unspecified names.  It should also remove any dates in a variable if they fall outside of a specified range...... if a range is specified that is.

 

My thought was to create a csv parameter file that could be imported in, specifying the names of any variables, and a start and end date if needed.

 

So, i  imported the file:

data test;

infile date dsd dlm='2c'x;

informat name $32. start ddmmyy10. stop ddmmyy10.;

input name start stop;

format start stop ddmmyy10.;

run;

(eg of file contents:

one_date          01/01/1990     12/12/1999 

another_date           .             .       

)

 

then created macro variables for each column:

proc sql;

select name into: date_vars separated by " " from test;

select start into: start separated by " " from test;

select stop into: stop separated by " " from test;

quit;

from here, my first thought was a simple if then statement:

 

data want;

set have;

if &start ne . then if date_vars le &start then &date_vars = .;

run;

 

Which falls over with

NOTE: Line generated by the macro variable "START".

1 01/01/1990 .

-

22

 

as well as syntax errors.

 

I also tried arrays, which ended up with similar problems as well as "too many subscripts....." messages.:

array datevars{*} &date_vars ;

array startdates{*} &start ;

do i = 1 to dim(datevars);

do j = 1 to dim2(startdates);

if startdates{j} ne '.' then

if (datevars{i} le startdates{j}) then datevars{i} = ' ';

end;

run;

 

I realise I must have made some very rookie errors, but any pointers as to where I am going wrong (probably all of it...  🐵 )or any other help to accomplish this would be greatly appreciated!  🐵

 

 

Many Thanks in advance

 

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Yes, sorry, the "..."d only takes date9. format e.g. 01JAN2014.  Change the step to:

 

data _null_;
  set test;
  call execute('data <your_dataset>; set <your_dataset>; if "'||put(input(start,ddmmyy10.),date9.)||'"d <= date <= "'||put(input(end,ddmmyy10.),date9.)||'"d then output;');
run;

Or you could actually just use the numeric value:

data _null_;
  set test;
  call execute('data <your_dataset>; set <your_dataset>; if '||input(start,ddmmyy10.)||' <= date <= '||input(end,ddmmyy10.)||' then output;');
run;

 

View solution in original post

7 REPLIES 7
LinusH
Tourmaline | Level 20

There are many ways to do this. I suspect that you will be served a variety of solutions.

How many variables (approx) can there be?

I would probably read the csv and generate a macro variable, containing a complete where-statement. So the most logic will in that step, and then just refer to the macro variable in a PROC SQl delete from, or a data step.

Data never sleeps
AntonyBlunt
Calcite | Level 5

Many thanks for the reply..

 

There could be almost any amount of variables - 20ish would be average..

 

How would you generate a macro variable, containing a complete where-statement? My marcro variable knowledge is confined to creating simple lists..  🙂

 

Thanks again

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

 

I am thinking you will need to be more specific.  You say there can be X number of date columns in a dataset, what happens if those date ranges are overlapping, i.e. some fit the criteria, others don't?  Post some test data, in the form of a datastep, and what you want the ouput to look like.  At a guess, I would say from this step, you have the means to generate the code directly:

data test;
  infile date dsd dlm='2c'x;
  informat name $32. start ddmmyy10. stop ddmmyy10.;
  input name start stop;
  format start stop ddmmyy10.;
run;

So you could then say something like (and not tested as no test data):

data _null_;
  set test;
  call execute('data <your_dataset>; set <your_dataset>; if "'||put(start,ddmmyy10.)||'"d <= date <= "'||put(end,ddmmyy10.)||'"d then output;');
run;

This will create a SAS datastep code for each variable (you could do it all in one if you like, just kept it like this for brevity) which does an if statement and only outputs data if date is between that.  This is iterative, so each variable top to bottom drops records from the selection before it.

AntonyBlunt
Calcite | Level 5

Thankyou the replies..

 


 


@RW9 wrote:

Hi,

 

I am thinking you will need to be more specific.  You say there can be X number of date columns in a dataset, what happens if those date ranges are overlapping, i.e. some fit the criteria, others don't?

 

 


 


This is the crux of the problem..The code is required to be generic, it cant refence any range or variable directly (as they may not exist at the time the code is run). So created the parameter file as above, which could contain any changing information.

 

For example,  if I had..

 

Have:

Name    one_date             Antother_date

AB          08/06/2007         12/05/1977

BB           12/05/1996         15/05/1990

CB           17/08/1985         20/08/1985

DB          11/11/1991         14/11/1991

 

 

Then with the date ranges supplied in the parameter file,  hopefully I would get:
Want:

Name    one_date             Antother_date

AB          .                              12/05/1977

BB           12/05/1996         15/05/1990

CB           .                             20/08/1985

DB          11/11/1991         14/11/1991

 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

So did you try the call execute code I placed above?  What that does is takes the dataset imported from your parameter file, and generates a datastep with one if statement for each row of the parameter file.  It should give you what you want.

AntonyBlunt
Calcite | Level 5

Hi,

I apologise if I've made some really simple errors here, but I am still relatively speaking quite new to SAS, and am currently out of my comfort zone (which Is good though...)

 

 I presume its down to the way that my data is set up, but when I run it I get:

ERROR 77-185: Invalid number conversion on "01/01/1990"d. and

 

ERROR: Invalid date/time/datetime constant "01/01/1990"d.

ERROR: Invalid date/time/datetime constant "31/12/1995"d.

ERROR 77-185: Invalid number conversion on "31/12/1995"d.

 

If I remove the "d"'s from the code, then it comes up with "variable date not initialised". If I change date in the code, to a variable in the dataset, then I get  a lot of "invalid numeric data" messages. All my test data are formatted to ddmmyy10.

 

I think I've overlooked something, just not sure what.. 🙂

 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Yes, sorry, the "..."d only takes date9. format e.g. 01JAN2014.  Change the step to:

 

data _null_;
  set test;
  call execute('data <your_dataset>; set <your_dataset>; if "'||put(input(start,ddmmyy10.),date9.)||'"d <= date <= "'||put(input(end,ddmmyy10.),date9.)||'"d then output;');
run;

Or you could actually just use the numeric value:

data _null_;
  set test;
  call execute('data <your_dataset>; set <your_dataset>; if '||input(start,ddmmyy10.)||' <= date <= '||input(end,ddmmyy10.)||' then output;');
run;

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 4234 views
  • 0 likes
  • 3 in conversation