Hi @PaigeMiller
Thanks for the answer. I know that code can be deployed from SAS Studio, and that there are possibilities for scheduling jobs and defining time-events and file-events as triggers, though I haven't done any experiments yet. But there are still a lot of unanswered questions. I apologize for bothering the community with a full novel, but I have done my best to focus on some major problems and omit a lot of minor details and problem areas.
Organizing objects
In our DI Studio Folder Tree, objects are organized in data areas. Each data area is a folder with subfolders for data produced in the area (tables and libraries), jobs producing these tables (with deployed jobs and the corresponding flow) and external files read/written by the jobs. The data areas are organized in hierarchies with permissions set at top level, and physical storage is organized in the same structure with inherited permissions.
This structure contains 2,629 folders in 785 data areas, and they hold about 21,000 objects (jobs, deployed jobs, libraries, tables and external files – today's count). We don't know how to maintain a similarly manageable structure in SAS Studio, where each user has a limited view and access to physical data based on permissions granted through about 100 different AD groups. We don't even know if it is possible to control logical access in SAS Studio based on AD groups as we do today.
Jobs and flows
For maintenance reasons, we try to keep jobs small, meaning no more than about 50 DI Studio Transformations in a single job, preferably fewer, and only one or a few tables as output. Then we build flows in SAS Management Console, where we use a flow as a "running unit": a group of interconnected jobs with internal dependencies. A flow corresponds to one data area in the DI Studio Folder Tree, so the Folder Tree is the skeleton on which everything hangs.
We use LSF as a convenient way to get flows executed in a SAS Grid server cluster with load balancing, but we don't use Process Manager/LSF as a scheduler. The reason is that Process Manager cannot handle triggering based on previous flow events, only time events and file events. Of the 785 flows counted today, 114 are "initial" flows, meaning they are not dependent on results from previous flows and can be started based on time events alone. The remaining 671 flows depend on results from previous flows (usually several) in a complicated hierarchy, where a chain can be more than 15 flows long and a given chain can contain scores of upstream flows.
This cannot be maintained manually, especially with a change rate of 220 new/changed jobs and 35 new/changed flows as a weekly average. We have built a scheduler that initiates each daily batch by building a virtual "super-flow" out of all flows that have the current day set as their running date, and then proceeds by releasing or excluding flows based on previous results. It is fully automatic, so no manual charting of dependencies between flows or defining of triggers is involved. The mechanism is based on table lineage extracted from SAS Metadata at batch start time.
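To make the mechanism concrete, here is a minimal sketch in Python of the release/exclude logic. It assumes the lineage has already been extracted into two mappings (inputs_of: which tables a flow reads, producer_of: which flow writes a table). The names and the Python rendering are purely illustrative, not our actual implementation:

from collections import defaultdict

def build_super_flow(flows_today, inputs_of, producer_of):
    # Derive flow-to-flow dependencies for today's batch from table lineage:
    # flow B depends on flow A if B reads a table that A writes.
    predecessors = defaultdict(set)
    for flow in flows_today:
        predecessors[flow]  # ensure every flow has an entry
        for table in inputs_of.get(flow, ()):
            producer = producer_of.get(table)
            if producer and producer != flow and producer in flows_today:
                predecessors[flow].add(producer)
    return predecessors

def run_batch(flows_today, predecessors, run_flow):
    # Release a flow when all its predecessors succeeded; exclude it as
    # soon as any predecessor failed or was itself excluded.
    status = {}                      # flow -> "ok" | "failed" | "excluded"
    pending = set(flows_today)
    while pending:
        progressed = False
        for flow in sorted(pending):
            preds = predecessors[flow]
            if any(status.get(p) in ("failed", "excluded") for p in preds):
                status[flow] = "excluded"
            elif all(p in status for p in preds):
                status[flow] = "ok" if run_flow(flow) else "failed"
            else:
                continue             # predecessors still running/waiting
            pending.discard(flow)
            progressed = True
        if not progressed:           # only a dependency cycle can cause this
            break
    return status

# Toy example: load_sales writes table SALES, which agg_sales reads.
inputs_of = {"load_sales": ["SRC_SALES"], "agg_sales": ["SALES"]}
producer_of = {"SALES": "load_sales"}
flows = {"load_sales", "agg_sales"}
preds = build_super_flow(flows, inputs_of, producer_of)
print(run_batch(flows, preds, run_flow=lambda f: True))
# -> {'load_sales': 'ok', 'agg_sales': 'ok'}

The point of the sketch is that the dependency graph is never drawn by hand: it falls out of the lineage at batch start time, which is why the weekly change rate costs us nothing in scheduling maintenance.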
The structure has been allowed to grow into such complexity over the years because there is no manual work involved, so it has been going on without anybody realizing that it might be difficult to maintain in other environments. We spend about 20 hours per week promoting jobs to production, scheduling jobs and defining flows, monitoring batch execution, identifying and correcting errors, rerunning failed flows etc. Thanks to automation of all processes from promotion to monitoring, this has been constant over 12 years while our Data Warehouse has grown by a factor of 10 or more.
And what now
This boils down to five technologies that we use today to run our Data Warehouse, and it seems that SAS Viya does not offer similar functionality for any of them, except (maybe) no. 2. I have underlined what I consider to be the primary outcome of each technology when it comes to building a similar batch environment in SAS Viya.
1. Deployment in DI Studio with automated generation of command strings to execute jobs.
2. Building Process Manager Flow Definitions in SAS MC with internal job dependencies.
3. Command-line execution of Flow Definitions by Process Manager (a hedged sketch of a possible Viya counterpart follows this list).
4. Automated load balancing with LSF/Grid Manager in a server cluster with a shared file system.
5. Automatic charting of flow dependencies through lineage maintained in SAS Metadata.
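For no. 3, the nearest documented equivalent appears to be Viya's Job Execution REST service. Below is a minimal sketch of what script-driven execution might look like, assuming a job definition already exists and an OAuth access token has been obtained; host, token and definition id are placeholders, and the exact endpoints and state names should be verified against SAS's REST documentation for the Viya version in question:

import time
import requests

VIYA = "https://viya.example.com"                      # placeholder host
HEADERS = {"Authorization": "Bearer <access-token>"}   # placeholder token

def submit_job(definition_id):
    # Submit an execution request referencing an existing job definition.
    body = {"jobDefinitionUri": f"/jobDefinitions/definitions/{definition_id}"}
    r = requests.post(f"{VIYA}/jobExecution/jobs", json=body, headers=HEADERS)
    r.raise_for_status()
    return r.json()["id"]

def wait_for(job_id, poll_seconds=10):
    # Poll the job until it reaches a final state.
    while True:
        r = requests.get(f"{VIYA}/jobExecution/jobs/{job_id}", headers=HEADERS)
        r.raise_for_status()
        state = r.json()["state"]
        if state in ("completed", "failed", "canceled"):
            return state
        time.sleep(poll_seconds)

job_id = submit_job("<definition-id>")
print(wait_for(job_id))

Even if this works as advertised, it only covers single-job execution; the flow-level dependency handling (nos. 2, 3 and 5) would still have to be rebuilt on top of it.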
Our current figure of about 20 hours per week to maintain and run the batch environment, with many daily changes, makes us anxious about what the same work will cost in Viya. It will be difficult to get senior management to accept that migration to a new and hyped platform has a price. It will be hard to obtain sufficient resources for migration while keeping the existing setup running smoothly in parallel, and even harder to get them to realize that the new and smart platform might be a setback to the old mainframe days, requiring 10 employees in a separate operations team.