Introduction
One question that comes back often, both on this forum and around me, is about the method that a SAS programmer should use when joining tables.
There is of course no definitive answer to this question, but since I am working on the third edition of my book about optimising the performance of SAS programs, I thought I would take a closer look at this common theme.
No silver bullet
It is impossible to make a comprehensive study about table joins.
The goal of a join might be to add one or more columns, to add none but validate values against some baseline, or perhaps to add or delete rows.
The input tables could be sorted or indexed or without any order. Some tables may be huge while others are tiny, some may be wide (many columns) and others narrow (few columns). There could be many tables in a join or just a couple.
The number of matches derived from a join could vary from a few rows using unique keys to billions of rows using Cartesian products.
The output tables may need to be sorted or not.
The ways to combine data are countless.
Furthermore, SAS has many syntaxes to read and write tables. Data steps use the SET, MERGE, UPDATE and MODIFY statements, together with options like KEY= or POINT=. Data steps can also use hash tables. PROC SQL offers the CREATE, UPDATE, INSERT and DELETE statements, with operators such as JOIN, UNION, INTERSECT and EXCEPT and their many variants, as well as nested queries and subqueries.
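To make this variety concrete, here are two of the many equivalent ways to fetch a column from one table into another. This is only a sketch: TAB1, TAB2, KEY and VAR1 are illustrative names, not part of the benchmark below.

```sas
* Data step approach: sort both tables, then MERGE BY. ;
proc sort data=TAB1; by KEY; run;
proc sort data=TAB2; by KEY; run;
data OUT;
  merge TAB1(in=A) TAB2(keep=KEY VAR1);
  by KEY;
  if A;  * keep only the rows present in TAB1 ;
run;

* PROC SQL approach: a left join needs no prior sort. ;
proc sql;
  create table OUT as
  select a.*, b.VAR1
  from TAB1 a left join TAB2 b
    on a.KEY = b.KEY;
quit;
```

Both steps produce the same matches; the data step requires sorted inputs while PROC SQL chooses its own access plan.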
It is even possible to use other SAS features such as formats.
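As a quick illustration of the format approach (a sketch with made-up names: SMALL, BIG, KEY and LABEL are not from the benchmark, and LABEL must be character), a small lookup table can be turned into a format via PROC FORMAT's CNTLIN= option, after which the "join" is a simple PUT() call during a sequential read:

```sas
* Build a numeric format LOOKUPF from lookup table SMALL (KEY -> LABEL). ;
data CNTLIN;
  set SMALL(rename=(KEY=START));
  retain FMTNAME 'LOOKUPF' TYPE 'N';
run;
proc format cntlin=CNTLIN; run;

* The join is now a format lookup while reading BIG sequentially. ;
data OUT;
  set BIG;
  LABEL = put(KEY, LOOKUPF.);
run;
```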
The SAS language offers a wide array of methods to join a given set of data.
The variations are infinite, each join differs from the other and has to be examined on its own merit.
Benchmark
Yet these considerations are not much help, and one must start somewhere if one is to study the most efficient way to join tables. Here is a benchmark that I found interesting enough to publish; I hope you will find it interesting too.
For this test, we consider a common task such as combining two tables, and adding data from the second table to the first.
This example uses two tables with 10 million rows, and we gradually retrieve more rows and more variables from the second table.
Four methods are used:
- Sort and MERGE
- Index + SET with option KEY=
- Index + load in memory with SASFILE + SET with option KEY=
- Hash table
The benchmark code is provided at the bottom of this page.
Listed below are the results of these four tests in my environment, with the method yielding the lowest times highlighted in color.
The results are measured in elapsed (i.e. wall clock) seconds. The lower the better.
Join method: Sort + Merge by

| # columns retrieved \ % of rows retrieved | 1% | 5% | 10% | 20% | 50% | 100% |
|---|---|---|---|---|---|---|
| 1 | 79 | 80 | 81 | 84 | 79 | 78 |
| 5 | 81 | 79 | 79 | 79 | 79 | 79 |
| 10 | 89 | 88 | 89 | 95 | 93 | 90 |
| 20 | 93 | 94 | 92 | 95 | 90 | 90 |
| 30 | 95 | 95 | 95 | 96 | 95 | 94 |
| 40 | 99 | 98 | 102 | 99 | 100 | 99 |
| 50 | 104 | 105 | 105 | 105 | 105 | 110 |
Join method: Hash table

| # columns retrieved \ % of rows retrieved | 1% | 5% | 10% | 20% | 50% | 100% |
|---|---|---|---|---|---|---|
| 1 | 57 | 59 | 58 | 60 | 65 | 73 |
| 5 | 56 | 57 | 59 | 61 | 66 | 77 |
| 10 | 61 | 63 | 64 | 69 | 69 | 80 |
| 20 | 66 | 68 | 64 | 69 | 71 | 82 |
| 30 | 61 | 62 | 63 | 66 | 72 | 82 |
| 40 | 62 | 63 | 64 | 67 | 75 | 89 |
| 50 | - | - | - | - | - | - |
Join method: Index + set key=

| # columns retrieved \ % of rows retrieved | 1% | 5% | 10% | 20% | 50% | 100% |
|---|---|---|---|---|---|---|
| 1 | 47 | 61 | 69 | 94 | 161 | 268 |
| 5 | 46 | 56 | 68 | 92 | 153 | 270 |
| 10 | 48 | 59 | 71 | 94 | 187 | 297 |
| 20 | 49 | 59 | 71 | 97 | 159 | 270 |
| 30 | 48 | 57 | 68 | 90 | 156 | 267 |
| 40 | 47 | 56 | 68 | 88 | 156 | 262 |
| 50 | 47 | 56 | 68 | 92 | 158 | 263 |
Join method: Index + sasfile + set key=

| # columns retrieved \ % of rows retrieved | 1% | 5% | 10% | 20% | 50% | 100% |
|---|---|---|---|---|---|---|
| 1 | 49 | 53 | 53 | 58 | 72 | 93 |
| 5 | 50 | 52 | 55 | 59 | 71 | 96 |
| 10 | 55 | 58 | 60 | 66 | 77 | 99 |
| 20 | 61 | 63 | 67 | 72 | 82 | 105 |
| 30 | 63 | 66 | 68 | 73 | 87 | 110 |
| 40 | 68 | 71 | 72 | 78 | 94 | 118 |
| 50 | - | - | - | - | - | - |
Summarising the results yields the table below, showing the lowest elapsed times.
| # columns retrieved \ % of rows retrieved | 1% | 5% | 10% | 20% | 50% | 100% |
|---|---|---|---|---|---|---|
| 1 | 47 | 53 | 53 | 58 | 65 | 73 |
| 5 | 46 | 52 | 55 | 59 | 66 | 77 |
| 10 | 48 | 58 | 60 | 66 | 69 | 80 |
| 20 | 49 | 59 | 64 | 69 | 71 | 82 |
| 30 | 48 | 57 | 63 | 66 | 72 | 82 |
| 40 | 47 | 56 | 64 | 67 | 75 | 89 |
| 50 | 47 | 56 | 68 | 92 | 105 | 110 |

The methods yielding these lowest times (shown as colour highlights in the original tables) are: Hash table, Index+set key=, Index+sasfile+key=, and Sort+merge (sorted output).
Based on the results listed above, one can see that some methods are clearly better than others depending on the task at hand.
Index look-ups are the fastest when few rows are needed (area highlighted in green).
When more rows are fetched, loading data in memory accelerates the random reads: Using SASFILE has an edge for fewer columns (area highlighted in blue) while a hash table provides better results when more columns are retrieved (area highlighted in purple).
Note that for the bottom row of the table (50 columns retrieved), the only available methods are indexing and sorting since the lookup table is too large to fit in memory. In this case, sorting becomes better than indexing when more than 20% of the rows are fetched, which is a common rule-of-thumb value.
Also note that when both input tables are already sorted, using MERGE BY is always the best solution by far: it takes only 32 seconds to merge all rows and all columns. That's why permanent tables should be sorted when possible (with the VALIDATED flag set!). Sorting is a one-off cost that yields ongoing savings. Furthermore, using MERGE BY creates a sorted output table. (SAS, when will we have a way to set the SORT VALIDATED flag? When?)
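For reference, the 32-second case above corresponds to nothing more than the plain MERGE BY step from the benchmark, run on inputs that are already sorted:

```sas
* SORTED1 and SORTED2 are already sorted by KEY,  ;
* so the merge incurs no sort cost at all.        ;
data OUT;
  merge SORTED1 SORTED2;
  by KEY;
run;
```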
Conclusion
I hope you found these results as interesting as I did.
There is no “good” or “bad” join method and as usual, one must master the different tools available in order to choose the right one for a given task.
Did you expect these results? Do you have comments about the results? Do you have other experiences or insights to share?
Feel free to comment.
Here is the code for this simple benchmark:
%macro g;
  %let cols=1 5 10 20 30 40 50;
  %let rows=0.01 0.05 0.10 0.20 0.50 1;
  %do colno=1 %to %sysfunc(countw(&cols));
    %let colnb=%scan(&cols,&colno);
    %do rowno=1 %to %sysfunc(countw(&rows,%str( )));
      %* Use a blank delimiter: the default %scan delimiters include the decimal point;
      %let rowpct=%scan(&rows,&rowno,%str( ));
      %put Trace: &=rowpct &=colnb *start;
      data TAB1 TAB2;
        retain VAR1-VAR50 8;
        do I=1 to 1e7;
          KEY=int(ranuni(0)*1e15);
          output;
        end;
      run;
      %put Trace: &=rowpct &=colnb *sort merge;
      proc sort data=TAB1 out=SORTED1 sortsize=1g details; by KEY; run;
      proc sort data=TAB2(keep=KEY VAR1-VAR&colnb.) out=SORTED2 sortsize=1g details; by KEY; run;
      data OUT;
        merge SORTED1 SORTED2; by KEY;
        %* We always fetch all rows with a merge;
      run;
      %put Trace: &=rowpct &=colnb *index;
      proc sql; create index KEY on TAB2; quit;
      data OUT;
        set TAB1;
        if ranuni(0) < &rowpct. then set TAB2(keep=KEY VAR1-VAR&colnb.) key=KEY;
      run;
      %put Trace: &=rowpct &=colnb *index sasfile;
      data REDUCED(index=(KEY)); set TAB2(keep=KEY VAR1-VAR&colnb.); run;
      sasfile REDUCED open;
      data OUT;
        set TAB1;
        if ranuni(0) < &rowpct. then set REDUCED(keep=KEY VAR1-VAR&colnb.) key=KEY;
      run;
      sasfile REDUCED close;
      %put Trace: &=rowpct &=colnb *hash;
      data OUT;
        set TAB1;
        if _N_=1 then do;
          dcl hash TAB2(dataset: "TAB2(keep=KEY VAR1-VAR&colnb)");
          TAB2.definekey('KEY');
          TAB2.definedata('KEY' %do i=1 %to &colnb.; ,"VAR&i" %end;);
          TAB2.definedone();
        end;
        if ranuni(0) < &rowpct. then RC=TAB2.find();
      run;
    %end;
  %end;
%mend;
%g;