BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
KP12
Fluorite | Level 6

Hi,

I have following datasets in my library:

libname.Assistance_20150101

libname.Assistance_20150102

libname.Assistance_20150103

libname.Assistance_20150104

libname.Assistance_20150105

.

.

.

libname.Assistance_20150222

libname.Assistance_20150223


I need to write a macro which should return a macro variable whose value should be based on the last dataset name.


(Example: if A is a macro variable & the last dataset available in the library (based on creation date) is libname.Assistance_20150223, then value of A should be 20150223)


Apart from macro, if the above result can be achived with some other method(s), then please do share your thoughts.

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

This query will look for the last name that starts with ASSISTANCE_ in the library specified by the macro variable LIBREF.

If your library is not clutter with other datasets whose name start with 'ASSISTANCE_' then it should work.

%let libref=libname;

%let lastdt=none;

proc sql noprint ;

select scan(memname,-1,'_')

into :lastdt trimmed

  from dictionary.tables

  where libname=%upcase("&libref")

      and memname like 'ASSISTANCE^_%' ESCAPE '^'

  order by memname desc

;

quit;

View solution in original post

22 REPLIES 22
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Firstly, and most importantly, I wouldn't recommend this approach.  Macro variables are text strings and to work with dates (which are numeric representations of an ad-hoc strcuture) means complicated code with lots of ampersands, and functions.  It gets messy and is dreadfull to try to use.  There are many other posts on this type of thing on this site. 

Secondly, if you need to keep an audit history of your data, I wouldn't recommend having datasets with dates in the name.  This may seem like a workable method but ultimately it will get unweildy and will not encompass changes which occur later on.  A simple example is what will you do when (and it most likely is when even if not forseen) you have two datasets on the same date?  I would recommend looking into either a) software version control, e.g. TortoiseSVN - this will keep history behind the scenes and you only have to deal with one dataset with a fixed name, or b) journalling, - such as is done in OC where data is recieved and changes from base copy are stored in a journal (i.e. audit history).  With either of these routes you would have a datetime stamp which would address the multiples on same date, and because there is only one dataset or journal, it makes further updating far easier than having to change several datasets and easier to QC.

Now if you ignore all the above and continue, you can look at the dataset names in sashelp.vtable:

data test;

     set sashelp.vtable (where=(libname="LIBNAME"));  /* Replace LIBNAME with the libname */

     sort_date=input(substr(name,12,8),yymmdd8.);

run;

You can now sort the dataset descending on sort_date and take row 1 as that will be the maximum date (you could do the same in one step in SQL with max()).

jakarman
Barite | Level 11

@RW9 having a software version based on source text-files does not make much sense for datasets does it?

As the dataset will normally arrive on read-only location his question can makes sense. I agree with you that the datasetname date should have a real functional meaning not one as of ease of implementation.
Your supplied code is of course a good one. I would review the vtable usage as it can have problems when many RDBMS being active. The old way of using proc contents can be a by-pass in that case.

For the comment on dataset-naming conventions, there could be another approach with gdg's  SAS(R) 9.4 Language Reference: Concepts, Fourth Edition  it is supporting historical versions of datasets.

---->-- ja karman --<-----
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Yes, I would agree with you on the SVN not being the best tool for it.  However any software implementation of version control would, to my mind, be better that a manual one.  SVN can deal with binary files, instead of keeping changes between base and future copies though, it will take a copy of the whole file at each timepoint, hence space for the Apache serer can be an issue.  However I see so often manual copying of files to dated directories or files and manual updating of things that I never get suprised when things go wrong.  Let the software take care of things, human error is less likely then.

Thanks for providing that link, it looks interesting.  Will have to look further into that.

jakarman
Barite | Level 11

Ok we agree on the goal to achieve. manual operations (data processing)  too much manual intervention in a change process (code release management) should be avoided.

The release process should be triggered and approved by hat there are manual interventions needed.

When the release/change is having many manual steps as being executed do not be surprised when something (more very often) is going to be wrong.

---->-- ja karman --<-----
KP12
Fluorite | Level 6

Hi RW9,

Thank you very much for your response Smiley Happy . As the company is following this nomenclature for the dataset names, so | can't make the changes on my own Smiley Sad

rcwright
Calcite | Level 5

The following uses two things:

1. the colon(:) operater to read in datasets that begin with 'xyz',

2. the INDSNAME option of the set statement, which assigns the name of the dataset being read to a variable

Then simply use input and scan to pick off the date from the end of the data set name

DATA x;

     LENGTH _dsn $ 70;

     SET libname.assist: INDSNAME=_dsn;

     dt = Input(Scan(_dsn,-1,'_'),MMDDYY8.);

     Call SymPut('lastdt', Put(dt, DATE.)); * end up with last date ;

     FORMAT dt DATE.;

RUN;

rcwright
Calcite | Level 5

One thing I forgot to mention, and why you might want to create a data set of dates, is that, you risk the 'last' data set not being the latest data set.

This might happen by virtue of the the collating sequeence of 'yyyymmdd' not being in sync with the date. So you could create the data set as above, then sort it, and then pick off the last date:

PROC SORT DATA=x;

     by dt;

DATA _NULL_;

     SET x;

       call symput () ... as above

Hope this makes sense.

data_null__
Jade | Level 19

rcwright wrote:

The following uses two things:

1. the colon(:) operater to read in datasets that begin with 'xyz',

2. the INDSNAME option of the set statement, which assigns the name of the dataset being read to a variable

Then simply use input and scan to pick off the date from the end of the data set name

DATA x;

     LENGTH _dsn $ 70;

     SET libname.assist: INDSNAME=_dsn;

     dt = Input(Scan(_dsn,-1,'_'),MMDDYY8.);

     Call SymPut('lastdt', Put(dt, DATE.)); * end up with last date ;

     FORMAT dt DATE.;

RUN;

Oh my!  You will read every obs from every data set just to get one value that is actually stored in the data set name.  That seems rather inefficient to me.

Plus can you guarantee that the data set with the last(largest) date will be last in the "NAME RANGE".  Likely would but if the character date part actually has the data as MMDDYY I would think no.

rcwright
Calcite | Level 5

I originally noted you can modify the above by using dataset OBS=0 (SET libname.assist:(OBS=0)) and then deleted it not wanting to get too wordy.Guess I should have included it.

Not sure what you're asking about the name range - I think that was what I was trying to address in the followup - I think here you've changed the format of the data set name from what appears to be YYMMDD to MMDDYY - I was trying to follow what was given.

And there'd be nothing preventing you from manually putting in the 'last' if you have that.

There are other approaches - You could use a sashelp.vtable, sashelp.vcolumn, sashelp.vmember, and etc approach as well. I just chose this because it illustrated a couple of things not normally seen

data_null__
Jade | Level 19

Yes you should have included the OBS bit and tested it too.  OBS=0 doesn't work because INDSNAME is not populated until an observation is read.  OBS=1 does.  This might have efficiently better than DICTIONARY access for some situations.

data a_20150101 a_20150204 a_20150903;
   set sashelp.class;
   run;
data dates;
   set a:(obs=1 drop=_all_) indsname=indsname;
   put indsname=;
   length name $32 date 8;
   name = scan(indsname,-
1,'.');
   date = input(scan(name,-1,'_'),yymmdd8.);
   format date date9.;
  
run;
proc print;
  
run;
Tom
Super User Tom
Super User

So do you want the one with the latest date in the table name? Or the one that was last created? Or last modified?

I see that your example date strings are in YYYYMMDD format, which is good because then they will sort properly when treated as character strings.

data prefix_20150302 prefix_20150301 ;

x=1;

run;

proc sql noprint ;

select memname into :last_name trimmed

  from dictionary.tables

  where libname='WORK' and memname like 'PREFIX%'

  order by memname desc

;

%put &=last_name ;

select memname into :last_cr trimmed

  from dictionary.tables

  where libname='WORK' and memname like 'PREFIX%'

  order by crdate desc

;

%put &=last_cr ;

select memname into :last_mo trimmed

  from dictionary.tables

  where libname='WORK' and memname like 'PREFIX%'

  order by modate desc

;

%put &=last_mo ;

quit;

KP12
Fluorite | Level 6

Hi all,

Apologies for not being able to get involved in the discussion, was busy in some emegency.

@tom: You got it right, I just want to have a macro variable which should return the YYYYMMDD part of the dataset name based on the latest date in the table name.

Tom
Super User Tom
Super User

This query will look for the last name that starts with ASSISTANCE_ in the library specified by the macro variable LIBREF.

If your library is not clutter with other datasets whose name start with 'ASSISTANCE_' then it should work.

%let libref=libname;

%let lastdt=none;

proc sql noprint ;

select scan(memname,-1,'_')

into :lastdt trimmed

  from dictionary.tables

  where libname=%upcase("&libref")

      and memname like 'ASSISTANCE^_%' ESCAPE '^'

  order by memname desc

;

quit;

Ksharp
Super User

dictionary.members would be faster.

data a_20150101 a_20150204 a_20150903;
   set sashelp.class;
   run; 

proc sql;
 select input(strip(scan(memname,-1,'_')),anydtdte.) as d format=yymmddn. into : d
  from dictionary.members
   having libname='WORK' and calculated d=max(calculated d);
quit;
%put &d  ;


Xia Keshan

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 22 replies
  • 2660 views
  • 6 likes
  • 7 in conversation