Hi @AngusLooney
I agree, mining job code can provide very useful information. In most cases I prefer to use metadata to get job information, because we don't have deployed code in our development environment, where we use the data for quality control. We do, however, use the deployed code to find warnings about "columns left empty", because that information is rather tricky to get from metadata.
I have read your SAS Forum paper "Advanced ETL Scheduling Techniques", and I think the way we use metadata in our scheduling might be of interest to you. Please forgive my clumsy language - English is not my native tongue...
Background
We are a Danish municipal administration, and we have used SAS for more than 30 years: first on mainframe, later on Windows servers, and now on a Linux grid. In 2007 we moved our SAS data warehouse to DI Studio/LSF. It is constantly evolving, with new jobs added and existing jobs changed, often with changes in input tables, on a daily basis. At present we have about 4,000 jobs in 600 LSF flows using 6,500 permanent tables, 900 external files and 25,000 work tables in our daily batch.
The structure is complex, with chains up to 30 flows long, where jobs in one flow depend on data from jobs in other flows, and many tables are used as input to many jobs; in some cases up to 100 jobs in 40 flows use the same table. Many flows have special run dates, but most run the night before each workday.
We have ETL developers in 6 different departments at different locations. All developers work in a development environment, and the production environment is centrally maintained. Besides naming conventions, we have two rules for jobs: an LSF flow corresponds to one job folder in DI Studio, and a given flow is limited to updating one library only.
Developers request promotion by exporting the relevant items and sending an email with a link to the spk package, a screenshot from DI Studio with the included items marked, and notes about special run frequency etc. We then import the package and deploy the jobs. Maintaining LSF flows and scheduling is part of the promotion process too.
Problem
As the number of jobs and flows grew, we ran into problems. Building and maintaining LSF flows became very time-consuming, as we had to open all jobs to determine the internal dependencies between the jobs in a flow before it could be defined with correct job dependencies and control nodes.
It also became more and more difficult to control flow triggering, because each new or changed job might change the dependencies on previous flows, and we had to open all jobs to find out which flows wrote the input tables, so we could ensure that a flow didn't start before all upstream flows were finished. We used file-event triggering by adding a last job to each flow. This job wrote a semaphore file upon completion, and all dependent flows were set up to use these files as triggering events.
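To illustrate the idea (this is not our actual code, and the path and flow name are made up), the last job in each flow was essentially nothing more than the sketch below, and the downstream flows had an LSF file-event trigger pointing at the file it writes:

/* Hypothetical "semaphore" job added as the last job of a flow.        */
/* Downstream flows were triggered by an LSF file event on this file.   */
filename sem "/sasdata/semaphores/FLOW_SALES_done.trg";

data _null_;
   file sem;
   now = datetime();
   put "FLOW_SALES finished at " now datetime19.;
run;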
In addition, we had a lot of trouble enforcing naming conventions and other rules, solving deadlocks caused by crossing dependencies, etc. The semaphore file method didn't fulfill our needs either: we never figured out how to make it work with different run schedules, and we never found a way to exclude a flow from running if a previous flow had failed (so output tables were left undamaged), while still letting further flows run in cases where yesterday's data in one contributing table was acceptable.
In 2010 we ran out of resources. We were 5 people responsible for running the servers, maintaining the development and production environments, getting new data sources into the warehouse, promoting jobs, running the batch, maintaining our SAS portal, finding and solving job errors, and guiding 20 ETL developers and more than 100 EG users. So something had to be done.
Solution
We became aware that when we used DI Studio to extract all the information needed to determine job and flow dependencies, we were really just using it as a metadata browser, so a program could extract the same information. That led to:
1) A "Virtual Auditor", a flow running in daily batch with jobs checking violation of naming conventions, missing libname descriptions, userwritten code in metadata (we want all userwritten code as sas files), reading from/writing to "illegal" levels in the data warehouse structure, passwords in code, unmapped columns etc. The information is presented as a SAS report and also mailed to the responsible developers (LastUpdated ResponsibleParty).
2) A Visual Studio application to assist with flow definitions. The application has a left pane with the metadata tree structure, similar to DI Studio, and clicking a job folder analyses all deployed jobs in the folder and draws all jobs and dependencies in the right pane, similar to Schedule Manager. With our application on one screen and the Schedule Manager edit pane on another, it is very easy and quick to create or change a flow. The application calls a SAS program to get the folder tree and a parent-child structure for the jobs in the folder, and thanks to Proc Metadata it is fast: less than 5 seconds to produce the drawing.
3) A "Virtual Operator", which is home-made scheduler with the main purpose of avoiding all definitions of triggering events in LSF. All flows are just set to "Run manually in the scheduling server" in SAS MC. It consists of 3 parts:
a) A metadata extract that runs before the daily batch starts. The result is a database containing actual tables of jobs, flows and flow dependencies. It also contains other tables: one is an updated flow attribute table with run dates, start-before and start-after times; another is an updated table with flow dependencies and file dependencies. A new flow gets default values from a parameter table, but the values can be edited, so that (for example) a flow is allowed to start even if a previous flow is excluded from the run, or an input file must not only exist but must have been updated within the last 24 hours.
b) A SAS job that starts at 6 PM every day and builds tables with flows (status "waiting" if included in the actual batch, otherwise "not-on-runlist") and the actual jobs in these flows. All dependencies are loaded into macro variables as logic expressions, and all flows are loaded into macro variables with their actual status. This is followed by a looping process, where
- an actual status is computed for all flows based on the status of the jobs in the database,
- all flow macro variables are loaded with the actual status taken from the database, and
- all expressions are evaluated, and based on that each flow is either marked "excluded from run" or submitted to LSF, and the flow macro variables are updated accordingly.
This is repeated every 30 seconds until the loop stops at 4 PM the following day, but normally the list is exhausted by 6 AM, with all jobs having a final status of Done or, in some cases, Failed or Excluded.
c) A Visual Studio application to control things. It has a main pane with all flows, ordered and colour-coded by actual status, with information on start and end times, average run times etc., and a right-click gives a menu to see the actual status of the jobs in the flow with run times and mean times, browse log files, show runtime graphs, edit dependencies and run dates, set comments on failed flows, and start/restart flows, including flows not in the actual batch.
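To give an idea of the kind of metadata extraction behind the Virtual Auditor, the flow-drawing application and the extract in 3a (our own programs are larger and not shown here), a minimal sketch using the DATA step metadata functions could look like the one below. The metadata functions and the "omsobj:Job" query are standard SAS; the naming rule and the output dataset name are just illustrative assumptions, and the session is assumed to already have the metadata server connection options (METASERVER=, METAPORT=, METAUSER=, METAPASS=) set.

/* Sketch: list all DI Studio jobs registered in metadata and flag names  */
/* that break an illustrative naming convention.                          */
data job_names;
   length uri name $ 256;
   call missing(uri, name);
   n = 1;
   rc = metadata_getnobj("omsobj:Job?@Id contains '.'", n, uri);
   do while (rc > 0);
      rc2 = metadata_getattr(uri, "Name", name);           /* job name     */
      violates_convention = not (upcase(name) =: "JOB_");  /* made-up rule */
      output;
      n + 1;
      rc = metadata_getnobj("omsobj:Job?@Id contains '.'", n, uri);
   end;
   keep name uri violates_convention;
run;

The same family of functions (metadata_getnasn etc.) can walk from a job to its source and target tables, which is exactly the information needed to derive job and flow dependencies.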
Most day-to-day operation is handled from this application. If a flow has failed, the normal process is to open the job list to see which job failed, then browse the log file and, depending on the error, either rerun it or correct it in DI Studio, redeploy the code and rerun the flow. When the flow is done, the SAS loop will detect it, so flows formerly excluded are taken up again, and the whole excluded branch runs to the end.
All of this depends on one thing only: the database must be updated whenever a job starts or ends. For that purpose an initstmt and a termstmt parameter are added to the command line parameters of the deployed jobs, and this is the only change to the standard SAS installation. This way, a job starts by calling a macro that logs the start time and log file name in the database, and ends by calling a macro that logs the end time and return code. Because of this implementation, flows can be run and rerun not only from the application, but also from SAS MC or Flow Manager, and still interact with the scheduler.
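Our macros are not shown here, but the mechanism can be sketched like this (macro names, libref and table are hypothetical, and the macros are assumed to be available to the batch session, e.g. through SASAUTOS):

/* Hypothetical additions to the deployed-job command line (one line in practice):
      -initstmt '%log_start(job=JOB_SALES_LOAD);'
      -termstmt '%log_end(job=JOB_SALES_LOAD, rc=&syscc);'                       */

%macro log_start(job=);
   /* register the start time and log file for this run */
   proc sql;
      insert into scheddb.job_runs (job_name, start_dt, log_file)
      values ("&job", %sysfunc(datetime()), "%sysfunc(getoption(log))");
   quit;
%mend log_start;

%macro log_end(job=, rc=);
   /* close the run with end time and return code */
   proc sql;
      update scheddb.job_runs
         set end_dt = %sysfunc(datetime()), return_code = &rc
         where job_name = "&job" and end_dt is missing;
   quit;
%mend log_end;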
Conclusion
It changed our work totally. We had fewer than 1,000 jobs in 2010, and it took about 100 hours weekly to control job quality, promote jobs and run the batch. Now we have at least four times as many jobs and a more complicated setup, yet deployment/promotion takes maybe 5 hours weekly, and the batch can run for weeks without any intervention except an hour or two to rerun failed jobs. So now we have time to keep everything nice and tidy, support the developers and do some development ourselves.
We wrote the SAS code ourselves (it took about a month), and we were lucky to have a colleague experienced in Visual Studio to write the applications. It has proved very robust: it has almost never failed in 8 years, the only problems being a few cases of network trouble. And it has survived migration through all SAS generations, from 9.3.1 on Windows to 9.4M5 on Linux, with only a few minor adjustments.