BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Yves_Boonen
Quartz | Level 8

I have a table in which you'll find the document_handler, kind_of_document, r_creation_date and document_state. The table has been prepared in advance based on kind_of_document "AVE", so this column can be ignored. Right now I need to make a script that counts how many documents each handler has on hold and on to do. The script below manages to do that, however, I still need to find the oldest r_creation_date for each document state for each handler. That is something else...

 

The table:

document_handlerkind_of_documentr_creation_datedocument_state
Name1AVE3/03/2016HOLD
Name1AVE29/03/2016HOLD
Name1AVE23/03/2016HOLD
Name1AVE23/02/2016HOLD
Name1AVE9/03/2016HOLD
Name2AVE1/03/2016HOLD
Name2AVE25/08/2015HOLD
Name2AVE27/02/2016HOLD
Name2AVE1/03/2016HOLD
Name2AVE22/02/2016HOLD
Name2AVE22/03/2016HOLD
Name2AVE29/01/2016HOLD
Name2AVE22/03/2016HOLD
Name2AVE22/03/2016HOLD
Name2AVE20/02/2016HOLD
Name2AVE7/03/2016HOLD
Name2AVE14/03/2016HOLD
Name2AVE2/03/2016HOLD
Name2AVE26/01/2016HOLD
Name2AVE1/03/2016HOLD
Name2AVE19/02/2016HOLD
Name2AVE29/03/2016TODO
Name2AVE29/03/2016TODO
Name2AVE31/03/2016TODO
Name2AVE31/03/2016TODO
Name2AVE30/03/2016TODO
Name3AVE14/01/2016HOLD
Name3AVE1/03/2016HOLD
Name3AVE8/02/2016HOLD
Name3AVE17/02/2016HOLD
Name3AVE29/01/2016HOLD
Name3AVE2/02/2016HOLD
Name3AVE24/02/2016HOLD
Name3AVE23/02/2016HOLD
Name3AVE29/02/2016TODO
Name3AVE15/03/2016TODO

 

The code:

data want;
set have;
by document_handler;
retain
	hold
	todo
;

if first.document_handler
then do;
  	hold=0;
	todo=0;
end;

select (document_state);

  	when ('HOLD') hold+1;
    
	when ('TODO') todo+1;

  	otherwise ignore+1;

end;

if last.document_handler then output;
keep document_handler hold todo;
run;

 

How can I tell SAS that it has to take the "min(r_creation_date)" for each document_state for each document_handler? Perhaps this doesn't require a datastep, but just a proc sql?

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

As noted, there is no test data, so the code is untested.  To get tested code, post test data in the form of a datastep. 

data want (keep=document_handler hold todo);
  set have;
  by document_handler;
  retain hold todo first_date1 first_date2;
  if first.document_handler then do;
    hold=0;
    todo=0;
    first_date1=.;
    first_date2=.;
  end;
  select (document_state);
  	when ('HOLD') do;
      hold+1;
      first_date1=ifn(first_date1=. or r_creation_date < first_date1,r_creation_date,first_date1);
      end;
    when ('TODO') do;
      todo+1;
      first_date2=ifn(first_date2=. or r_creation_date < first_date2,r_creation_date,first_date2);
      end;
    otherwise ignore+1;  
  end;
  if last.document_handler then output;
run;

View solution in original post

13 REPLIES 13
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Just add another retained value, on first obs set to date, then if any subsequent is less then set them:

data want (keep document_handler hold todo);
  set have;
  by document_handler;
  retain hold todo first_date;
  if first.document_handler then do;
    hold=0;
    todo=0;
    first_date=r_creation_date;
  end;
  select (document_state);
  	when ('HOLD') hold+1;
    when ('TODO') todo+1;
  	otherwise ignore+1;  
  end;
  if r_creation_date < first_date then first_date=r_creation_date;
  if last.document_handler then output;
run;
Yves_Boonen
Quartz | Level 8

The result I get, is the same I got before.

 

document_handlerholdtodo
Name153
Name2116
Name329

 

The result I should get, requires two dates (one for each state). For example:

 

document_handlerholddate1tododate2
Name1531/03/2016327/03/2016
Name21128/03/2016627/03/2016
Name325/03/2016915/03/2016

 

Date1 should be the oldest document in HOLD for each handler (Name1, Name2, Name3,...).

Date2 should be the oldest document in TODO for each handler (Name1, Name2, Name3,...).

 

Suggestions?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Have two variables, and do the calculation in the select block then, sorry, I ddin't see that requirement first time.  Can't test this as no test data (datastep with example data):

data want (keep document_handler hold todo);
  set have;
  by document_handler;
  retain hold todo first_date1 first_date2;
  if first.document_handler then do;
    hold=0;
    todo=0;
    first_date1=.;
    first_date2=.;
  end;
  select (document_state);
  	when ('HOLD') do;
      hold+1;
      ifn(first_date1=. or r_creation_date < first_date1,r_creation_date,first_date1);
      end;
    when ('TODO') do;
      todo+1;
      ifn(first_date2=. or r_creation_date < first_date2,r_creation_date,first_date2);
      end;
    otherwise ignore+1;  
  end;
  if last.document_handler then output;
run;
Yves_Boonen
Quartz | Level 8

I used your code and I received a few errors. The first one was the absense of the "=" in the keep statement. After correcting that, I still receive some "ifn" errors, which I am not familiar with.

 

I am using the exact same code as posted above.

 

Log:

1254  data test1 (keep = document_handler hold todo);
1255    set ave_1;
1256    by document_handler;
1257    retain hold todo first_date1 first_date2;
1258    if first.document_handler then do;
1259      hold=0;
1260      todo=0;
1261      first_date1=.;
1262      first_date2=.;
1263    end;
1264    select (document_state);
1265      when ('HOLD') do;
1266        hold+1;
1267        ifn(first_date1=. or r_creation_date < first_date1,r_creation_date,first_date1);
                                                                                           -
                                                                                           22
                                                                                           76
ERROR: Undeclared array referenced: ifn.
ERROR 22-322: Syntax error, expecting one of the following: +, =.

ERROR 76-322: Syntax error, statement will be ignored.

1268        end;
1269      when ('TODO') do;
1270        todo+1;
1271        ifn(first_date2=. or r_creation_date < first_date2,r_creation_date,first_date2);
                                                                                           -
                                                                                           22
                                                                                           76
ERROR: Undeclared array referenced: ifn.
ERROR 22-322: Syntax error, expecting one of the following: +, =.

ERROR 76-322: Syntax error, statement will be ignored.

1272        end;
1273      otherwise ignore+1;
1274    end;
1275    if last.document_handler then output;
1276  run;

NOTE: Character values have been converted to numeric values at the places given by:
      (Line):(Column).
      1267:28   1271:28
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.TEST1 may be incomplete.  When this step was stopped there were 0
         observations and 3 variables.
WARNING: Data set WORK.TEST1 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.04 seconds


RW9
Diamond | Level 26 RW9
Diamond | Level 26

As noted, there is no test data, so the code is untested.  To get tested code, post test data in the form of a datastep. 

data want (keep=document_handler hold todo);
  set have;
  by document_handler;
  retain hold todo first_date1 first_date2;
  if first.document_handler then do;
    hold=0;
    todo=0;
    first_date1=.;
    first_date2=.;
  end;
  select (document_state);
  	when ('HOLD') do;
      hold+1;
      first_date1=ifn(first_date1=. or r_creation_date < first_date1,r_creation_date,first_date1);
      end;
    when ('TODO') do;
      todo+1;
      first_date2=ifn(first_date2=. or r_creation_date < first_date2,r_creation_date,first_date2);
      end;
    otherwise ignore+1;  
  end;
  if last.document_handler then output;
run;
Yves_Boonen
Quartz | Level 8

This script doesn't give me errors, however, it is still missing the dates and that's due to a format issue.

I have a CSV-file containing this information and much more.

In the CSV-file you will find two columns with dates, namely date_treated and r_creation_date. Both dates are in the format of "dd/mm/yyyy hh:mm". I used "format datetime16." and "informat "anydtdtm16." on both columns, but the result in both columns differs. The date_treated is imported correctly according to the format "dd/mm/yyyy hh:mm" and the r_creation_date suddenly used the format "ddmmmyyyy". Since I am using the Belgian Period region and language settings, that becomes 01mrt2016 for the first of March, 2016. If the import were to do what actually supposed to do, this problem wouldn't be happening at all, I figure.

 

What could be a cause to this import issue?

Oligolas
Barite | Level 11
data test;
input document_handler $ kind_of_document $ r_creation_date ddmmyy10. document_state $;
datalines;
Name1 AVE 3/03/2016 HOLD
Name1 AVE 29/03/2016 HOLD
Name1 AVE 23/03/2016 HOLD
Name1 AVE 23/02/2016 HOLD
Name1 AVE 9/03/2016 HOLD
Name2 AVE 1/03/2016 HOLD
Name2 AVE 25/08/2015 HOLD
Name2 AVE 27/02/2016 HOLD
Name2 AVE 1/03/2016 HOLD
Name2 AVE 22/02/2016 HOLD
Name2 AVE 22/03/2016 HOLD
Name2 AVE 29/01/2016 HOLD
Name2 AVE 22/03/2016 HOLD
Name2 AVE 22/03/2016 HOLD
Name2 AVE 20/02/2016 HOLD
Name2 AVE 7/03/2016 HOLD
Name2 AVE 14/03/2016 HOLD
Name2 AVE 2/03/2016 HOLD
Name2 AVE 26/01/2016 HOLD
Name2 AVE 1/03/2016 HOLD
Name2 AVE 19/02/2016 HOLD
Name2 AVE 29/03/2016 TODO
Name2 AVE 29/03/2016 TODO
Name2 AVE 31/03/2016 TODO
Name2 AVE 31/03/2016 TODO
Name2 AVE 30/03/2016 TODO
Name3 AVE 14/01/2016 HOLD
Name3 AVE 1/03/2016 HOLD
Name3 AVE 8/02/2016 HOLD
Name3 AVE 17/02/2016 HOLD
Name3 AVE 29/01/2016 HOLD
Name3 AVE 2/02/2016 HOLD
Name3 AVE 24/02/2016 HOLD
Name3 AVE 23/02/2016 HOLD
Name3 AVE 29/02/2016 TODO
Name3 AVE 15/03/2016 TODO
;
run;

data test;
   set test;
   format r_creation_date ddmmyy10.;
run;

PROC SQL;
   CREATE TABLE test1 AS
   select document_handler, kind_of_document, document_state, count(*) as count_documents, min(r_creation_date) as oldest_r_creation_date format ddmmyy10.
   from test a
   group by document_handler,kind_of_document, document_state
   ;
QUIT;
________________________

- Cheers -

Yves_Boonen
Quartz | Level 8

I managed to fix it by deleting the table and importing it again. Now it r_creation_date has the format datetime16. and it is in t he region language US. (Example: 20OCT2015 16:54)

 

The only problem now is that your original code keeps outputting only three variables (document_handler hold todo). Even if I add "keep document_handler hold first_date1 do first_date2;" in after output and before run, it still only outputs only three variables?

Yves_Boonen
Quartz | Level 8

Soilved by adding them to your keep stateming at the start of the code. Now I need to maintain my format in the datastep, cause it changed it to a numerical date. I guess that would be first_date1 format=datetime16. in de set statement of the code?

Yves_Boonen
Quartz | Level 8

Any idea as to where to give the date variables the format datetime16.? I can't seem to manage that?

Reeza
Super User

In the data step, I usually place at last place - before run or first, right after Set. 

 

You our have a date time variable

 

Format variable_name datetime16.;

Or if you want it to display as a date

 

Format variable_name dtdate9.;
Yves_Boonen
Quartz | Level 8

I have a huge CSV-file which I import as a whole to my unload library in SAS. In the import script I use format datetime16. and I use informat anydtdtm16. for every column containing dates.

 

If i check the table in the unload library, the CSV-file is imported correctly and the dates are in the format assigned above.

 

Next up I will use proc sql to filter out the data I don't need and to create a clean table only containing the information needed for this assignement. During this proc sql I also assign a format datetime16. to r_creation_date and when I check the clean table, the dates are in the formats required.

 

As soon as I use your part of the script to do the calculations, the dates end up in a numerical format.

 

Below you will find the entire code. The only thing I need to manage, is to maintain dates and not a numerical variable.

 

/* Collect all AVE documents in state HOLD and TODO */

proc sql;
create table ave_1 as
select document_handler, kind_of_document, r_creation_date format=datetime16., document_state
from unload.document
where kind_of_document="AVE"
and document_state in ("HOLD","TODO")
;
quit;

/* Sort on document_state */

proc sort in=ave_1 out=ave_1;
by document_state;
run;

/* Sort on document_handler. */

proc sort in=ave_1 out=ave_1;
by document_handler;
run;

/* Count all documents per state per handler and register oldest date per state per handler */

data ave_2 (keep=document_handler hold first_date1 todo first_date2);
  set ave_1;
  by document_handler;
  retain hold first_date1 todo first_date2;
  if first.document_handler then do;
    hold=0;
    first_date1=.;
	todo=0;
    first_date2=.;
  end;
  select (document_state);
  	when ('HOLD') do;
      hold+1;
      first_date1=ifn(first_date1=. or r_creation_date < first_date1,r_creation_date,first_date1);
      end;
    when ('TODO') do;
      todo+1;
      first_date2=ifn(first_date2=. or r_creation_date < first_date2,r_creation_date,first_date2);
      end;
    otherwise ignore+1;  
  end;
  if last.document_handler then output;
  keep document_handler hold first_date1 todo first_date2;
run;
Yves_Boonen
Quartz | Level 8

That did the trick. Thanks again RW9 and Reeza!

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 4224 views
  • 5 likes
  • 4 in conversation