About PhilC

Patrick · ‎02-01-2025

This "Output Data Step View" as shown by @FriedEgg looks interesting, especially if it also works for Viya in-CAS processing (i.e., a view for loading data into CAS). Reducing I/O is normally driven by performance requirements, but it looks like the current implementation of these output views won't help with that (see log below). As long as a table fits into memory, I'd use a hash object for such a requirement.

Code:

    options fullstimer;

    data work.cars(drop=_:);
      do _i=1 to _nobs;
        do _j=1 to 10000;
          set sashelp.cars nobs=_nobs point=_i;
          output;
        end;
      end;
      stop;
    run;

    /* option 1: output view */
    data cars_asia(where=(origin='Asia'))
         cars_europe(where=(origin='Europe'))
         cars_usa(where=(origin='USA'))
         / view=split_sort;
      if 0 then set work.cars;
      set split_sort;
    run;

    proc sort data=work.cars out=split_sort;
      by msrp;
    run;

    /* option 2A: Hash table */
    /* as run, all three output calls write to cars_asia (see log); */
    /* cars_europe and cars_usa were presumably intended            */
    data _null_;
      if 0 then set work.cars;
      dcl hash h1(dataset:'work.cars', multidata:'y', ordered:'y');
      h1.defineKey('origin');
      h1.defineData(all:'y');
      h1.defineDone();
      h1.output(dataset:"cars_asia(where=(origin='Asia'))");
      h1.output(dataset:"cars_asia(where=(origin='Europe'))");
      h1.output(dataset:"cars_asia(where=(origin='USA'))");
    run;

    /* option 2B: Hash table with dynamic output */
    data _null_;
      dcl hash h1(dataset:'work.cars(obs=0)', multidata:'y', ordered:'y');
      h1.defineKey('origin');
      h1.defineData(all:'y');
      h1.defineDone();
      dcl hash h2(ordered:'y', multidata:'n');
      h2.defineKey('origin');
      h2.defineDone();
      dcl hiter hh2('h2');
      do _i=1 to _nobs;
        set work.cars nobs=_nobs point=_i;
        _rc=h1.add();
        _rc=h2.ref();
      end;

      _rc=hh2.first();
      do while(_rc=0);
        _rc=h1.output(dataset: cats('cars_',origin,'(where=(origin="',origin,'"))') );
        _rc=hh2.next();
      end;
      stop;
    run;

Log:

    1    OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
    68
    69   options fullstimer;
    70   data work.cars(drop=_:);
    71     do _i=1 to _nobs;
    72       do _j=1 to 10000;
    73         set sashelp.cars nobs=_nobs point=_i;
    74         output;
    75       end;
    76     end;
    77     stop;
    78   run;

    NOTE: The data set WORK.CARS has 4280000 observations and 15 variables.
    NOTE: DATA statement used (Total process time):
          real time                      0.86 seconds
          user cpu time                  0.55 seconds
          system cpu time                0.31 seconds
          memory                         1962.25k
          OS Memory                      22948.00k
          Timestamp                      02/02/2025 01:41:13 AM
          Step Count                     158  Switch Count  5
          Page Faults                    0
          Page Reclaims                  234
          Page Swaps                     0
          Voluntary Context Switches     24
          Involuntary Context Switches   9
          Block Input Operations         0
          Block Output Operations        1272584

    79
    80   /* option 1: output view */
    81   data cars_asia(where=(origin='Asia'))
    82     cars_europe(where=(origin='Europe'))
    83     cars_usa(where=(origin='USA'))
    84     /
    85     view=split_sort;
    86
    87     if 0 then set work.cars;
    88     set split_sort;
    89   run;

    NOTE: DATA STEP view saved on file WORK.SPLIT_SORT.
    NOTE: A stored DATA STEP view cannot run under a different operating system.
    WARNING: The definition of an output DATA step view is an experimental feature in this release and is not intended for use in the development of production applications.
    NOTE: DATA statement used (Total process time):
          real time                      0.00 seconds
          user cpu time                  0.01 seconds
          system cpu time                0.00 seconds
          memory                         2415.43k
          OS Memory                      23208.00k
          Timestamp                      02/02/2025 01:41:13 AM
          Step Count                     159  Switch Count  2
          Page Faults                    0
          Page Reclaims                  305
          Page Swaps                     0
          Voluntary Context Switches     11
          Involuntary Context Switches   0
          Block Input Operations         0
          Block Output Operations        264

    90
    91   proc sort data=work.cars out=split_sort;
    92     by msrp;
    93   run;

    NOTE: There were 4280000 observations read from the data set WORK.CARS.
    NOTE: The data set WORK.SPLIT_SORT has 4280000 observations and 15 variables.
    NOTE: View WORK.SPLIT_SORT.VIEW used (Total process time):
          real time                      1:11.07
          user cpu time                  27.64 seconds
          system cpu time                36.84 seconds
          memory                         821505.54k
          OS Memory                      845336.00k
          Timestamp                      02/02/2025 01:42:24 AM
          Step Count                     160  Switch Count  4280007
          Page Faults                    0
          Page Reclaims                  54606
          Page Swaps                     0
          Voluntary Context Switches     8560583
          Involuntary Context Switches   318
          Block Input Operations         0
          Block Output Operations        1267232

    NOTE: The data set WORK.CARS_ASIA has 1580000 observations and 15 variables.
    NOTE: The data set WORK.CARS_EUROPE has 1230000 observations and 15 variables.
    NOTE: The data set WORK.CARS_USA has 1470000 observations and 15 variables.
    NOTE: PROCEDURE SORT used (Total process time):
          real time                      1:11.13
          user cpu time                  27.64 seconds
          system cpu time                36.89 seconds
          memory                         821505.54k
          OS Memory                      845336.00k
          Timestamp                      02/02/2025 01:42:24 AM
          Step Count                     160  Switch Count  4280010
          Page Faults                    0
          Page Reclaims                  54696
          Page Swaps                     0
          Voluntary Context Switches     8560635
          Involuntary Context Switches   328
          Block Input Operations         0
          Block Output Operations        1273920

    94
    95   /* option 2A: Hash table */
    96   data _null_;
    97     if 0 then set work.cars;
    98     dcl hash h1(dataset:'work.cars', multidata:'y', ordered:'y');
    99     h1.defineKey('origin');
    100    h1.defineData(all:'y');
    101    h1.defineDone();
    102    h1.output(dataset:"cars_asia(where=(origin='Asia'))");
    103    h1.output(dataset:"cars_asia(where=(origin='Europe'))");
    104    h1.output(dataset:"cars_asia(where=(origin='USA'))");
    105  run;

    NOTE: There were 4280000 observations read from the data set WORK.CARS.
    NOTE: The data set WORK.CARS_ASIA has 1580000 observations and 15 variables.
    NOTE: The data set WORK.CARS_ASIA has 1230000 observations and 15 variables.
    NOTE: The data set WORK.CARS_ASIA has 1470000 observations and 15 variables.
    NOTE: DATA STEP stopped due to looping.
    NOTE: DATA statement used (Total process time):
          real time                      1.77 seconds
          user cpu time                  1.18 seconds
          system cpu time                0.58 seconds
          memory                         1000799.62k
          OS Memory                      1023376.00k
          Timestamp                      02/02/2025 01:42:26 AM
          Step Count                     161  Switch Count  9
          Page Faults                    0
          Page Reclaims                  7220
          Page Swaps                     0
          Voluntary Context Switches     46
          Involuntary Context Switches   21
          Block Input Operations         0
          Block Output Operations        1273120

    106
    107  /* option 2B: Hash table with dynamic output */
    108  data _null_;
    109    dcl hash h1(dataset:'work.cars(obs=0)', multidata:'y', ordered:'y');
    110    h1.defineKey('origin');
    111    h1.defineData(all:'y');
    112    h1.defineDone();
    113    dcl hash h2(ordered:'y', multidata:'n');
    114    h2.defineKey('origin');
    115    h2.defineDone();
    116    dcl hiter hh2('h2');
    117    do _i=1 to _nobs;
    118      set work.cars nobs=_nobs point=_i;
    119      _rc=h1.add();
    120      _rc=h2.ref();
    121    end;
    122
    123    _rc=hh2.first();
    124    do while(_rc=0);
    125      _rc=h1.output(dataset: cats('cars_',origin,'(where=(origin="',origin,'"))') );
    126      _rc=hh2.next();
    127    end;
    128    stop;
    129  run;

    NOTE: There were 0 observations read from the data set WORK.CARS.
    NOTE: The data set WORK.CARS_ASIA has 1580000 observations and 15 variables.
    NOTE: The data set WORK.CARS_EUROPE has 1230000 observations and 15 variables.
    NOTE: The data set WORK.CARS_USA has 1470000 observations and 15 variables.
    NOTE: DATA statement used (Total process time):
          real time                      2.89 seconds
          user cpu time                  2.26 seconds
          system cpu time                0.62 seconds
          memory                         769862.90k
          OS Memory                      792204.00k
          Timestamp                      02/02/2025 01:42:29 AM
          Step Count                     162  Switch Count  17
          Page Faults                    0
          Page Reclaims                  7212
          Page Swaps                     0
          Voluntary Context Switches     71
          Involuntary Context Switches   16
          Block Input Operations         0
          Block Output Operations        1273112

    130
    131
    132  OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
    142

There aren't many use cases where splitting a table into multiple tables is the optimal design, and for that reason I won't upvote the idea.
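Option 2A as posted writes all three subsets to cars_asia, which is why the log reports WORK.CARS_ASIA three times, and it ends with "DATA STEP stopped due to looping". A minimal sketch of what was presumably intended, assuming one target table per origin and an explicit stop; (both are assumptions about intent, not a rerun of the timing test):

```sas
/* Sketch: option 2A with one target table per origin (assumed intent) */
data _null_;
  if 0 then set work.cars;   /* defines the host variables at compile time only */
  dcl hash h1(dataset:'work.cars', multidata:'y', ordered:'y');
  h1.defineKey('origin');
  h1.defineData(all:'y');
  h1.defineDone();
  h1.output(dataset:"cars_asia(where=(origin='Asia'))");
  h1.output(dataset:"cars_europe(where=(origin='Europe'))");
  h1.output(dataset:"cars_usa(where=(origin='USA'))");
  stop;                      /* no SET statement executes, so end the step explicitly */
run;
```

The timings in the log above belong to the code as originally run, not to this variant.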

cgates · ‎12-31-2024

Unfortunately nothing worked for me. I just gave up on automating the process for the protected information.

BillSawyer · ‎12-19-2023

I reported this as a problem, but unfortunately no fix/change was made. Several folks on this thread mainly wanted a way of knowing that the job had finished and did not necessarily need the last dataset/result to open. In those cases, using View > Submission Status is an easy way to confirm your job finished.

Alternatively, you could add a program at the end of your process flow that generates a dataset, then right-click the dataset and select Share > Email as a step in project, so that an email is sent to you when the process flow finishes. The only setup (other than adding the program to your process flow) is in Tools > Options > Administration. See the sample code to generate a dummy dataset, and images below.

Regards, Bill

    data last_step;
      status="EG Job xyz is Finished";
    run;

FrankPoppe · ‎10-17-2022

I had the same problem, and this fixed it. Great. My XML had the following entry <CompletionTime Index="1" Width="144000000000001" /> So column2 appeared somewhere far out of reach. Two questions remain: What causes nonsensical entries like this? Will my change to the XML be persistent?

xxformat_com · ‎11-17-2021

Hi Phil,

Is the following issue similar to yours? The title is added after the proc print output and before the proc odstext output. Changing the title statement, ods noproctitle, etc. doesn't help. If so, just add sheet_interval='none'. You then only have to add sheet_interval='now' to add a title later on in the worksheet.

    ods excel file="&xxtest./reporting/test.xlsx" options(embedded_titles='yes' /*sheet_interval='none'*/);
    title 'Example';
    proc print data=sashelp.class;
    run;
    proc odstext;
      p 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    run;
    ods excel close;

sheet_interval='none'

On the second page, you'll have to add an empty line at the top of the worksheet (unless someone has a better solution).

    ods excel options(sheet_name='Sheet 2' sheet_interval='now');
    proc odstext; *breakpage=yes is implicit in this case;
      p '';
    run;
    *ods excel options(sheet_interval='none'); *implicit;

xxformat_com · ‎11-09-2021

Hi,

If you test this example, you will notice that 128 characters are used (width=100%). Adding one extra character actually creates a line break character, so 130 characters in total. Add the len() Excel function to see it. So to double the width, we can use the width=200% style attribute. Having said that, I'm not sure where the 128 limit is defined. Could it be in the style template? I haven't seen it there.

    ods excel file="&xxtest./reporting/ods_excel_test.xlsx";
    proc odstext;
      p 'abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-123456';
      p 'abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-1234567';
      p 'abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-12345678';
      p 'abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-123456789';
      p 'abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-123456789-';
    run;
    ods excel close;

PS: ods excel options(flow='text') does not help in this example.
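The post mentions the width=200% style attribute without showing where it goes. A minimal sketch, assuming it is applied through the STYLE= option of the P statement in PROC ODSTEXT; the output file name is a placeholder and the result has not been checked against a particular SAS release:

```sas
/* Sketch: request a wider text cell so the 130-character line is not wrapped. */
/* The file name is hypothetical; width=200% is the attribute named in the post. */
ods excel file="ods_excel_width_test.xlsx";
proc odstext;
  p 'abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-abcdefghi-123456789-'
    / style=[width=200%];
run;
ods excel close;
```

Using len() on the resulting cell in Excel, as suggested above, is one way to check whether a line break character was still inserted.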

Tom · ‎09-15-2021

I doubt that TRANTAB is the solution. That is still there only for backwards compatibility with older versions of SAS.

Step one is to open a ticket with SAS support and have them explain to you why your SAS installation is using a different definition of the $EBCDIC informat than what I am seeing. They might have a simple solution. It might be something that SAS has done and they can explain. It might be something about how your installation of SAS is configured, and they can help figure that out also.

If there is no simple configuration or option change solution, then it might be easiest to add a step to your existing code to post-process the resulting text strings. So if the code is just reading in binary text file AA and making some number of character variables using the $EBCDIC informat to create SAS dataset XX, you could then just add a step that updates that dataset by looping over all character variables and running whatever code is needed to correct the format issue. That might be a simple KCVT() call. It might be two KCVT() calls. It might be a simple TRANSLATE() call.

    data xx;
      set xx;
      array _c _character_;
      do over _c;
        _c = translate(_c ,'@','E0'x);
      end;
    run;

You might create a macro to do that which takes the dataset name as input so you can quickly add that step in as many places as you need.
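The macro suggested in the last sentence could look roughly like this. It is only a sketch: the macro name fix_chars and its parameter ds are made up for illustration, and the TRANSLATE() call is simply the one from the step above, which may need to be replaced by KCVT() or another fix depending on what SAS support finds.

```sas
/* Sketch: wrap the post-processing step in a macro.               */
/* fix_chars and ds are hypothetical names, not from the thread.   */
%macro fix_chars(ds);
  data &ds;
    set &ds;
    array _c _character_;                 /* all character variables in the dataset */
    do over _c;
      _c = translate(_c, '@', 'E0'x);     /* replace byte E0 with an at sign */
    end;
  run;
%mend fix_chars;

/* Usage: rewrite dataset XX in place */
%fix_chars(xx)
```

Because the step overwrites the dataset it reads, it is worth testing on a copy first.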

ncorbett · ‎08-30-2021

Hi PhilC, Thanks I will let you know how it works!

aguilar_john · ‎08-30-2021

Thank you for your approach, but it does not always work for my scenarios, so I solved it differently. I first created a dataset with all IDs and months going forward, then joined only the desired entries to my "base table".

jec150 · ‎08-30-2021

Thanks for the response! I agree that modifying the data structure would work. I was hoping to be able to modify the macro so that I would not have to transpose from long to wide, run the macro, and then transpose back to long (as I need the data in long format for my next steps); however, this may be the easiest fix.

ab81 · ‎08-29-2021

Yes, each of the other types of variables in that gigantic list is coded 0/1 too.

ChrisNZ · ‎08-27-2021

The approach you chose should work.

    data DATA;
      A=1.5; UNIT='StdDev'; output;
      A=0.5; UNIT='PCT'   ; output;
    run;
    data _F;
      set DATA(rename=(A=START));
      retain FMTNAME 'report' HLO 'F';
      LABEL=ifc( UNIT='StdDev', '5.2', 'percent.2') ;
      output;
    run;
    proc format cntlin=_F; run;
    proc report data=DATA;
      columns A ;
      define A /display format=report.;
    run;

    A
    1.50
    50%

    proc format ;
      value report 0-1=[percent.2]
                   other =[5.2];
    run;
    data REPORT;
      A=0.5; output;
      A=1.5; output;
    run;
    proc report data=REPORT;
      columns A ;
      define A /display format=report.;
    run;

    A
    50%
    1.50

PaigeMiller · ‎08-26-2021

@bknitch wrote:

So for that I'm only concerned where if Missing is higher in the hierarchy. Meaning, if an A3 was populated in 2019 and not populated in 2020 but an A2 was populated in 2020 and not in 2019 I would exclude this/ or not look at this record. Hope that makes sense...

Sorry, no I still don't understand. What does "exclude" mean in the context of your earlier description of how to assign the "Missing" "Found" "NoMatch" to each cell?

But I think we have gone far enough down this path. If the code provided so far meets your need, fine, problem solved. If not, I am asking you to re-write the requirements from scratch, to address all of these issues, so that the explanation is clear and in one description of the logic, so we don't have to scroll up and down and re-read earlier comments to put it all together and understand.

Tom · ‎08-25-2021

Try reading some of the papers published on PDC calculation. @sas.com PDC calculations

PhilC · ‎08-25-2021

The "Colon Modifier" for string comparisons in SAS is also known as "Bounded String Compare". I mention both names just to articulate the concept so it's easier to search for on this forum.
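For anyone searching for the term, a minimal sketch of the colon modifier in a DATA step, using sashelp.class as an assumed example table (the output dataset name is made up):

```sas
/* Sketch: =: compares only as many characters as the shorter operand has. */
data starts_with_al;
  set sashelp.class;
  if name =: 'Al';   /* keep rows whose NAME begins with 'Al' */
run;
```

The same modifier can be attached to other comparison operators (for example >: or in:), which is why it helps to know both names when searching.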

Online Status	Offline
Date Last Visited	‎04-12-2022 11:06 PM

Re: Creating a categorical variable conditioned on a series of if-then...

Re: Creating a categorical variable conditioned on a series of if-then...

Re: Creating a categorical variable conditioned on a series of if-then...

Re: Add Excel File Name as Variable When Importing Multiple Excel File...

Re: How do I format a field using two different formats in proc report

Re: Allow PROC SORT to output multiple datasets

Re: Allow PROC SORT to output multiple datasets

Re: Looking for Like Values in Array

Re: Find Outliers and Matches in dataset

Re: Need help with code, calculating proportion of days covered by med...

Re: How do I format a field using two different formats in proc report

SAS EG 8 scroll bars

Support WITH clauses in SQL

Re: Support WITH clauses in SQL

EG 8 - Bring back the RUN command on the context menu

Re: After upgrading to 8.2, EG could not "find" local host

Re: Option to save the LOG Summary to a dataset

Re: ODS EXCEL: option for the inserting of line feeds when wrapping li...

Re: XML Mapper exporting multiple tables

Re: Looking for Like Values in Array

Re: Allow PROC SORT to output multiple datasets

Re: How To Download a Google Sheet .csv through SAS

Re: Automatically open output when running query in project in SAS EG ...

Re: Submission status displayed wrong

Re: ODS, ODSText, Titles

Re: PROC ODSTEXT, line wrapping and 139

Re: how to avoid an at sign in an email address to be converted to a f...

Re: Add Excel File Name as Variable When Importing Multiple Excel File...

Re: monthly slices

Re: Jackboot Macro Question

Re: Creating a categorical variable conditioned on a series of if-then...

Re: How do I format a field using two different formats in proc report

Re: Find Outliers and Matches in dataset

Re: Need help with code, calculating proportion of days covered by med...

Re: Looking for Like Values in Array