About JackHamilton

JackHamilton · ‎08-23-2019

Can you create a single view on the database side that does all of the processing there, with no joins or wheres on the SAS side? You might have to upload some of your selection tables from SAS to the database to make this work. SASTRACE can be your friend here.

JackHamilton · ‎08-23-2019

Could it be a bandwidth problem? If you don't need all the columns in the source tables, you could use SASTRACE to find out what's being brought back. Also, if you have any ORDER BY clauses, you could try removing them, and then doing a PROC SORT in SAS. I wouldn't have thought that would matter, but it does appear to.

JackHamilton · ‎08-23-2019

I agree with Tom's diagnosis and also with his statement that you didn't provide enough information. Two things you might try:

- Add "OPTIONS SASTRACE=',,,DB' SASTRACELOC=SASLOG NOSTSUFFIX;" to see what's getting passed to the database. Is an optimized query with joins being passed, or are you getting back every single record in every table?
- Run a PROC CONTENTS, or something else that reads only the structure without reading any records. Is that also slow?

JackHamilton · ‎08-19-2019

I think I would use SELECT instead of IF THEN ELSE to handle the record type. That would automatically signal an error if an unexpected code shows up, and will be easier to expand if the next homework assignment is more complicated.

JackHamilton · ‎08-14-2019

The documentation on how to write a configuration file is a confused jumble. It took me much longer to figure out that one configuration file than it did to download, install, and configure Anaconda and Jupyter combined.

I don't know if this is optimal, but this works for me running Jupyter under Anaconda in Windows, connecting to a grid server running on Solaris/Unix:

SAS_config_names = ['winiomsolaris']

SAS_config_options = {'lock_down': False, 'verbose': True}

SAS_output_options = {'output': 'html5'}

# build out a local classpath variable to use below for Windows clients
# CHANGE THE PATHS TO BE CORRECT FOR YOUR INSTALLATION
cpW = "C:\\Program Files\\SASHome\\SASDeploymentManager\\9.4\\products\\deploywiz__94494__prt__xx__sp0__1\\deploywiz\\sas.svc.connection.jar"
cpW += ";C:\\Program Files\\SASHome\\SASDeploymentManager\\9.4\\products\\deploywiz__94494__prt__xx__sp0__1\\deploywiz\\log4j.jar"
cpW += ";C:\\Program Files\\SASHome\\SASDeploymentManager\\9.4\\products\\deploywiz__94494__prt__xx__sp0__1\\deploywiz\\sas.security.sspi.jar"
cpW += ";C:\\Program Files\\SASHome\\SASDeploymentManager\\9.4\\products\\deploywiz__94494__prt__xx__sp0__1\\deploywiz\\sas.core.jar"
cpW += ";C:\\Users\\c449630\\AppData\\Local\\Continuum\\anaconda3\\lib\\site-packages\\saspy\\java\\saspyiom.jar"

# And, if you've configured IOM to use Encryption, you need these client side jars.
cpW += ";C:\\Program Files\\SASHome\\SASVersionedJarRepository\\eclipse\\plugins\\sas.rutil_904500.0.0.20170816190000_v940m5\\sas.rutil.jar"
cpW += ";C:\\Program Files\\SASHome\\SASVersionedJarRepository\\eclipse\\plugins\\sas.rutil.nls_904500.0.0.20170816190000_v940m5\\sas.rutil.nls.jar"
cpW += ";C:\\Program Files\\SASHome\\SASVersionedJarRepository\\eclipse\\plugins\\sastpj.rutil_6.1.0.0_SAS_20121211183517\\sastpj.rutil.jar"

winiomsolaris = {'java'      : 'java',
                 'iomhost'   : ['sasgrid01.kp.org', 'sasgrid02.kp.org', 'sasgrid03.kp.org'],
                 'iomport'   : 8591,
                 'classpath' : cpW,
                 'encoding'  : 'latin1'
                }

import os
os.environ["PATH"] += ";C:\\Program Files\\SASHome\\SASFoundation\\9.4\\core\\sasext\\sspiauth.dll"
os.environ["PATH"] += ";C:\\Program Files\\SASHome\\SASFoundation\\9.4\\core\\sasext\\"
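The long run of cpW += lines is easy to get wrong (one missing backslash or semicolon breaks the whole classpath). As a minimal sketch, not something taken from the saspy documentation, the same string could be built from a list of jar names. The helper name build_classpath is hypothetical, and the directory is copied from the configuration above; change it for your installation.

```python
import ntpath  # Windows path rules, usable on any platform


def build_classpath(base_dir, jar_names, extra_jars=()):
    """Join jar paths into a Windows-style classpath string (';' separated).

    Hypothetical helper for illustration; saspy only needs the final string.
    """
    jars = [ntpath.join(base_dir, name) for name in jar_names]
    jars.extend(extra_jars)  # jars that live outside base_dir, given as full paths
    return ";".join(jars)


# Directory copied from the configuration above; adjust for your installation.
deploywiz = (
    r"C:\Program Files\SASHome\SASDeploymentManager\9.4\products"
    r"\deploywiz__94494__prt__xx__sp0__1\deploywiz"
)

cpW = build_classpath(
    deploywiz,
    ["sas.svc.connection.jar", "log4j.jar", "sas.security.sspi.jar", "sas.core.jar"],
)
print(cpW)
```

Using a raw string for the directory avoids doubling every backslash, which is where typos tend to creep in.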

JackHamilton · ‎08-13-2019

A Jupyter notebook is another possible interface. I haven't tried it, but it might work. I don't know how good or standard the JavaScript implementation in iOS Safari is. The Jupyter server would have to be on a remote machine. You might also try SAS Studio just to see what happens. The fact that SAS Institute doesn't support it doesn't mean it won't work; it only means you won't get help if it doesn't. In the long run, it probably will work: Apple has been trying to integrate its code base across platforms, but they are not there yet.

JackHamilton · ‎08-09-2019

You could probably just FTP (or SFTP) the data set from one server to another. If your FTP/SFTP/FTPS/SSH method uses compression, it will probably be faster than using SAS. You might get better results if you gzip before sending (or maybe not - you might lose more time zipping and unzipping than you save by sending fewer data packets). It might be possible to do it entirely in SAS using the ZIP and sockets engines. That would make an interesting project, but I wouldn't want to have to support it for the next decade.

JackHamilton · ‎08-08-2019

There are lots of ways that SAS might do it. Another way, a bit clumsier but with other benefits, would be to allow FedSQL views to be used in base SAS; FedSQL seems to do the conversion automatically if you have declared data types correctly, but unfortunately FedSQL views are broken outside FedSQL. I was thinking that a list in a LIBNAME statement (SASDATEFMT or DBSASTYPE) would be the way to go, but the suggestion of having the translation list stored in a data set is a good one. It would allow you to specify different data types for the same variable name in different data sets. And if you think "No one would be so careless as to declare the same variable name in different ways in different data sets in the same library", haha, you haven't met our vendors.

JackHamilton · ‎08-08-2019

An automatic remapping feature would be very useful. Our use case: we have a SAS copy and a database copy of our data warehouse. We would like to be able to run the same code, unchanged except for LIBNAME statements, against either copy. When our database was Teradata we could do that, but we have been switched to Oracle. The problem is that SAS and Teradata have both date and datetime data types, but Oracle has only a datetime type. SAS/ACCESS will convert dates to datetimes when sending a query to Oracle, but does not convert datetimes to dates when returning results. That breaks programs written to expect date values. The SASDATEFMT data set option could at one time be set through an environment variable, which might solve the problem, but I think that's now deprecated. Your suggestion would be a more flexible way to do that (I would want to change both the format and the data type, which is what SASDATEFMT does, instead of only the format).

JackHamilton · ‎08-04-2019

Why develop a new tool? Why not use Git? It's not well incorporated into SAS at the moment, but support is increasing. Your requirements are not unique to SAS, and Git has been used for similar things in many projects (Linux being perhaps the best known example). There's no reason your package has to be written in SAS. But it doesn't sound like your fundamental problem is a lack of software. You can do most of what you want on a regular shared file system (or ftp server). Your organization is lacking a commitment to doing it, and new software won't fix that, it will only make the process easier. You can already set up shared macro and format libraries under central control. You can create folder structure templates. I think shared functions are harder (when I last looked, which was a few years ago, it was obvious that SAS hadn't thought through how to use shared functions, but that might be different now).

JackHamilton · ‎08-03-2019

Interesting article, and I think one of the things you did in the program is worthy of its own article someday. You used the DIVIDE function to eliminate "divide by zero" messages in the log. It also eliminates "missing values were generated" and "mathematical operations could not be performed" messages. DIVIDE is an underused function.

There is great benefit to eliminating those messages from the log in cases when you know problematic values might appear in your data. A large number of unneeded messages in the log makes it too easy to overlook real errors. It's more work, but better programming practice, to create only messages that actually need attention.

DIVIDE is a relatively new function. You can create a similar result with old-fashioned code like

if y = 0 or nmiss(x, y) then z = .;
else z = x / y;

The result is similar but not identical to the DIVIDE function, because DIVIDE might also return the special missing values .I, .M, and ._

In the olden days, we hardly ever encountered missing values other than . , and the test for missing values in a program was a simple

if x = . then /* do stuff */;

But with the introduction of the DIVIDE function, that no longer suffices. It's better to code

if missing(x) then /* do stuff */;

or

if nmiss(x) then /* do stuff */;

In PROC FORMAT, where those tests aren't available, you have to specify all the missing values explicitly:

proc format;
   value missb
      other = [best4.]
      ., .a-.z, ._ = 'missing';
run;

I wouldn't know without looking it up whether .A comes before or after . or ._, so it's easier for me to list all three missing-value ranges. Looking it up, I see that I could have specified

proc format;
   value missb
      other = [best4.]
      ._ - .Z = 'missing';
run;

That's not an obvious order. The SAS missing value sort order is

._ . .A-.Z

but even knowing the ASCII sorting sequence doesn't help; it's

<space> . A-Z _

and the EBCDIC sort sequence is

<space> . _ A-Z
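The three orderings described above are easy to mix up, so here is a small Python sketch that models them. It only compares the text labels of the missing values; it says nothing about how SAS stores them. The function sas_key is a hypothetical helper that hard-codes the SAS order quoted above, and cp037 is used as one representative EBCDIC code page.

```python
values = [".Z", ".", "._", ".A"]  # labels of four SAS missing values, shuffled

# Order by ASCII byte values: underscore (95) sorts after the letters (65-90).
ascii_order = sorted(values, key=lambda v: v.encode("ascii"))

# Order by EBCDIC byte values (code page 037): underscore sorts before the letters.
ebcdic_order = sorted(values, key=lambda v: v.encode("cp037"))


def sas_key(value):
    """Sort key for the SAS missing-value order: ._ first, then ., then .A-.Z."""
    suffix = value[1:]  # the character after the dot, or "" for plain "."
    if suffix == "_":
        return (0, "")
    if suffix == "":
        return (1, "")
    return (2, suffix)


sas_order = sorted(values, key=sas_key)

print("ASCII :", ascii_order)
print("EBCDIC:", ebcdic_order)
print("SAS   :", sas_order)
```

All three lists come out different, which is why a range written as ._ - .Z is not something you can derive from either collating sequence.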

JackHamilton · ‎07-29-2019

MERGENOBY=WARN is a better setting, I think, because it's rare to want a MERGE without BY. It does happen, though; emulating a LEAD function is one use case.
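For readers who haven't seen the LEAD trick: in SAS it is typically done by merging a data set with a copy of itself that starts one observation later, with no BY statement, so rows are paired purely by position. A rough Python model of that pairing (the function name add_lead and the sample values are made up for illustration):

```python
def add_lead(values):
    """Pair each value with the one that follows it; the last has no successor."""
    shifted = values[1:] + [None]  # the same list, starting at the second item
    return list(zip(values, shifted))


pairs = add_lead([10, 20, 30])
print(pairs)
```

The pairing depends only on row position, which is exactly the situation where a BY statement is deliberately left out.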

JackHamilton · ‎07-24-2019

For an approach that uses SQL like Tom's, but without the need for a separate run for each entry, see my ancient paper https://support.sas.com/resources/papers/proceedings/proceedings/sugi31/046-31.pdf

Basically, you would create a utility table with as many rows as you have potential entries (so 5 entries means you have rows in the new table with the values 1, 2, 3, 4, 5) and add that table to the join, using the number from the utility table in the SCAN function. You'll get a warning about a Cartesian join. If your "have" table (or a view created from it) has a column containing the number of output records you want, you can join on that number and make the process potentially faster.
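The idea is easier to see with a tiny model. The sketch below is Python rather than SAS SQL, and everything in it is assumed for illustration: the sample rows, the comma delimiter, the column names, and the scan helper, which only imitates a SCAN-style "n-th word" lookup.

```python
def scan(text, n, delimiter=","):
    """Return the n-th (1-based) delimited word, or '' when there is none.

    Hypothetical stand-in for a SCAN-style function; empty words are skipped.
    """
    words = [w.strip() for w in text.split(delimiter) if w.strip()]
    return words[n - 1] if 1 <= n <= len(words) else ""


# Made-up input: one row per id, several entries packed into one field.
have = [
    {"id": 1, "items": "apple,pear"},
    {"id": 2, "items": "fig"},
    {"id": 3, "items": "kiwi,plum,lime"},
]

# The utility table: one row per potential entry position (1..5).
numbers = range(1, 6)

# Cross join every row with every number, keeping only positions that exist.
want = [
    {"id": row["id"], "item": scan(row["items"], n)}
    for row in have
    for n in numbers
    if scan(row["items"], n) != ""
]

for row in want:
    print(row)
```

Every input row is paired with every number, which is the Cartesian join the warning refers to; the filter then discards the positions that have no entry.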

JackHamilton · ‎06-17-2019

@Reeza Putting an OPTIONS statement inside data step code may not be a best practice, but only because it might cause confusion about when the options go into effect. The documentation explicitly says that the only place the OPTIONS statement is not allowed is inside data lines, so inside a data step is valid. The behavior of global statements inside steps is well defined; the problem is that it's not immediately intuitive.

options ls=150;
%put NOTE1: LineSize=%sysfunc(getoption(linesize));

data _null_;

   %put NOTE2: LineSize=%sysfunc(getoption(linesize));

   ls = getoption('linesize');
   putlog 'NOTE3: ' ls=;

   options ls=100;

   %put NOTE4: LineSize=%sysfunc(getoption(linesize));

   ls = getoption('linesize');
   putlog 'NOTE5: ' ls=;

run;

%put NOTE6: LineSize=%sysfunc(getoption(linesize));

returns

69   options ls=150;
70   %put NOTE1: LineSize=%sysfunc(getoption(linesize));
NOTE1: LineSize=150
71
72   data _null_;
73
74   %put NOTE2: LineSize=%sysfunc(getoption(linesize));
NOTE2: LineSize=150
75
76   ls = getoption('linesize');
77   putlog 'NOTE3: ' ls=;
78
79   options ls=100;
80
81   %put NOTE4: LineSize=%sysfunc(getoption(linesize));
NOTE4: LineSize=100
82
83   ls = getoption('linesize');
84   putlog 'NOTE5: ' ls=;
85
86   run;

NOTE3: ls=100
NOTE5: ls=100
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

87
88   %put NOTE6: LineSize=%sysfunc(getoption(linesize));
NOTE6: LineSize=100

JackHamilton · ‎06-14-2019

Yes, that is very helpful. One thing it shows (and I've seen others make this comment) is that the Anaconda/Miniconda uninstallers don't clean up as well as they might.

Re: Using FINDW in a DO loop to flag keywords in a string

Re: Using FINDW in a DO loop to flag keywords in a string

Re: Using FINDW in a DO loop to flag keywords in a string

Re: SAS macros in Git

Re: How to control the number of lines when rolling the mouse wheel on...

Re: Automatically calculate yearly average

Re: How to anonymize the Data (not masking)

Re: SAS RTDM: Custom Details Length

Re: SAS 9.4 Base and STAT for Windows under Parallels on a Mac with an...

Re: How to get table size using proc sql

Re: Using FINDW in a DO loop to flag keywords in a string

Re: How to control the number of lines when rolling the mouse wheel on...

Re: CALL SYMPUT vs CALL SYMPUTX

Re: Import metadata from qlikview to SAS

Re: Personal Login Manager not updating metadata

Re: SAS Environment Manager - Verificar versão

Re: Make year,month and day functions work on datetime

Re: Using FINDW in a DO loop to flag keywords in a string

Re: Can I put the BY-group value into the ExcelXP sheet name?

Re: Using FINDW in a DO loop to flag keywords in a string

Re: Poor SAS View performance

Re: Poor SAS View performance

Re: Poor SAS View performance

Re: importing Hierarchical Files

Re: Configuring SASPY

Re: SAS on iPad

Re: Requesting faster methods to transfer a dataset from one server to...

Re: SAS/ACCESS: Remap default data types and SAS formats

Re: SAS/ACCESS: Remap default data types and SAS formats

Re: Seeking advice on developing SAS packaging system

Re: Investigating the Economics of Wiretapping with SAS University Edi...

Re: Change SYSTEM OPTIONS VARINITCHK and MERGENOBY (LINUX)

Re: How to create multiple rows from one field

Re: SAS Studio Stuck on "Running" and never completes

Re: No Kernel, WinError 2 for SAS Notebook in JupyterLab

SUGA

SAS Analytics Explorers