About AhmedAl_Attar

AhmedAl_Attar · ‎03-07-2024

@yabwon @Patrick Guys, thank you for the inspiring solutions. Just as an FYI, on my machine the single-dimension array approach was the fastest when compared with the 2-dimensional and 3-dimensional array approaches. I wonder if you had similar findings.

AhmedAl_Attar · ‎03-06-2024

Hi @Bernd_S

How about creating a DATA step view of the merge, followed by PROC SUMMARY?

PROC SORT DATA=work.table_a;
  BY id id_2 category;
RUN;

DATA WANT_OPTION_A_v (KEEP=i x) / VIEW=WANT_OPTION_A_v;
  MERGE table_a table_b;
  BY id id_2 category;
RUN;

PROC SUMMARY DATA=work.WANT_OPTION_A_v NWAY;
  CLASS i;
  VAR x;
  OUTPUT OUT=WANT_OPTION_A_sum(DROP=_:) SUM=sum_x;
RUN;

If i has very high cardinality, you may want to sort the view by i, then use a BY statement in PROC SUMMARY instead of CLASS:

PROC SORT DATA=work.table_a;
  BY id id_2 category;
RUN;

DATA WANT_OPTION_A_v (KEEP=i x) / VIEW=WANT_OPTION_A_v;
  MERGE table_a table_b;
  BY id id_2 category;
RUN;

PROC SORT DATA=WANT_OPTION_A_v OUT=WANT_OPTION_A_srt;
  BY i;
RUN;

PROC SUMMARY DATA=work.WANT_OPTION_A_srt NWAY;
  BY i;
  VAR x;
  OUTPUT OUT=WANT_OPTION_A_sum(DROP=_:) SUM=sum_x;
RUN;

Hope this helps,
Ahmed

AhmedAl_Attar · ‎03-06-2024

Hi @devi001

Could creating a symbolic link be a workaround? On Linux, this is typically how we get around such issues; maybe you can do the same on Windows.

https://www.howtogeek.com/16226/complete-guide-to-symbolic-links-symlinks-on-windows-or-linux/

Hope this helps,
Ahmed

AhmedAl_Attar · ‎03-06-2024

Hi @SatishR

Unfortunately, you are asking the wrong crowd! I/O throughput is an infrastructure issue (storage: disk/SAN configuration). Such issues affect not just SAS, but your server's operation and every other piece of software you deploy on it. The best people to ask would be your IT storage team. They should know these things; it's their job. The SAS Communities cover everything related to SAS software, not storage.

Typically, customers try to provision servers that match the minimum hardware requirements recommended by the software vendor, and if they don't, they lose the right to blame the vendor for poor performance, if you know what I mean.

Just my two cents,
Ahmed

AhmedAl_Attar · ‎03-05-2024

Hi @blueskyxyz

I'm not sure how technical your manager is, because his/her suggestion of separating the data files from the processing server goes against the architecture designs of the past two decades, and even future designs. Here is how:

- Back in the early 2000s, there was a rise of data warehouse appliances (Netezza, ExaData, GreenPlum, Teradata). These servers provided a box of disks, CPUs, and RAM all in one place to minimize data movement across the network and get the data closer to the processing.
- In the 2010s, Hadoop/Spark and HDFS were on the rise, promoting cheaper distributed computing and distributed data replication to ensure the compute nodes always have access to a portion of the data locally.
- AWS has introduced Amazon S3 Express One Zone, a high-performance, single-zone Amazon S3 storage class that is purpose-built to deliver consistent, single-digit millisecond data access for your most latency-sensitive applications.

All these design/architecture trends were, and are, trying to get the data closer to the compute, not away from it!

Just my two cents,
Ahmed

AhmedAl_Attar · ‎03-01-2024

Hi @RedUser77

I know you have multiple answers to go through, and here is yet another one:

data have;
  infile CARDS4;
  input date yymmdd8. Record_Number;
  format date yymmdd10.;
cards4;
20230101 10
20230102 245
20230121 39
20230201 11
20230202 202
20230301 41
20230321 59
20230302 92
;;;;
run;

/* Create a view to add the month variable */
DATA HAVE_V / VIEW=HAVE_V;
  set have;
  month = put(date, yymon8.);
RUN;

/* Use the DOW loop to calculate monthly totals and percent changes */
DATA WANT(KEEP=MONTH MON_TOT PCT_CHANGE);
  mon_tot = 0;
  prev_mon_tot = 0;

  /* calculate monthly totals */
  DO UNTIL (LAST.MONTH);
    SET HAVE_V;
    BY MONTH NOTSORTED;
    mon_tot + Record_Number;
  END;

  /* calculate monthly percent changes */
  prev_mon_tot = lag(mon_tot);
  pct_change = ifn((prev_mon_tot > 0), 100*((mon_tot - prev_mon_tot)/(prev_mon_tot)), 0);
  OUTPUT;
  FORMAT pct_change 8.2;
RUN;

Note: this solution assumes the original input table/data set is ordered by date.

Hope this helps,
Ahmed

AhmedAl_Attar · ‎02-28-2024

@lichee

So, your SAS session has access to a maximum of 8GB, as your -memsize value indicates. Why are you loading the 30 million records into the hash along with every variable in the data set?

dcl hash h(dataset : 'clmfile', multidata : 'Y');
h.definekey('Person_ID');
h.definedata(all : 'Y');
h.definedone();

Try to load the smaller data set into the hash and loop through the records of your large data set (clmfile).

If you want to load the large data set into the hash, then use the technique listed on page 4 of this paper: https://www.lexjansen.com/nesug/nesug11/ld/ld01.pdf

"Now imagine that a real-world file LOOKUP is so large that memory shortage would prevent the hash table from being loaded with the SAT variables alongside KEY, yet we still want to use the hash object for KEY look-up! The workaround, as noted above, is to leave the SAT variables in their original place on disk and instead, load a file record identifier variable RID into the data portion of the hash table H: "
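The record-identifier idea quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the paper: clmfile and Person_ID come from the thread, while work.small, rid, rc, and work.matched are hypothetical names. The hash holds only the key and the observation number, and the full record is read from disk with POINT= when a key matches.

```sas
/* Sketch only: keep just the key and a record identifier (rid) in the hash, */
/* then fetch the full row from disk with POINT=.                            */
/* work.small, rid, rc and work.matched are hypothetical names.              */
data work.matched;
  if _n_ = 1 then do;
    dcl hash h(multidata: 'Y');
    h.defineKey('Person_ID');
    h.defineData('rid');
    h.defineDone();
    /* Load key + observation number only; all other variables stay on disk */
    do rid = 1 by 1 until (eof);
      set clmfile(keep=Person_ID) end=eof;
      h.add();
    end;
  end;

  set work.small;            /* hypothetical driver table containing Person_ID */

  rc = h.find();             /* first matching record number, if any */
  do while (rc = 0);
    set clmfile point=rid;   /* direct access to the full record */
    output;
    rc = h.find_next();      /* next record number for the same key */
  end;
  drop rc;
run;
```

Even with only the key and rid stored, 30 million entries still need a fair amount of memory, so loading the smaller table into the hash remains the first thing to try.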

AhmedAl_Attar · ‎02-28-2024

Hi @lichee

A couple of changes could help you with the memory issue:

- Use an explicit -memsize xG (x: number) SAS invocation option to specify how much memory the SAS process has access to. On Linux/Windows the default is 2G. You can check your SAS session's setting by running the following:

PROC OPTIONS OPTION=memsize;
RUN;

- Explicitly specify the HashExp value in your hash object declaration. Default: 8, Max: 20.

Hope that helps,
Ahmed
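As a minimal sketch of the second point, a hash declaration with an explicit HASHEXP might look like the step below. The table name work.small_lookup, the variable n_items, and the value 16 are assumptions chosen for illustration; Person_ID is taken from the thread.

```sas
/* Sketch only: declare a hash with an explicit HASHEXP (2**16 buckets).  */
/* work.small_lookup and n_items are hypothetical names.                  */
data _null_;
  if 0 then set work.small_lookup;   /* define host variables at compile time */
  dcl hash h(dataset: 'work.small_lookup', hashexp: 16);
  h.defineKey('Person_ID');
  h.defineData(all: 'Y');
  h.defineDone();

  n_items = h.num_items;             /* number of items actually loaded */
  put 'Items loaded into the hash: ' n_items;
  stop;
run;
```

HASHEXP sets the number of buckets, not a cap on the number of items, so a larger value mainly helps lookup speed on big tables rather than reducing memory use.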

AhmedAl_Attar · ‎02-28-2024

Hi @Longimanus

Have a look at these two papers, along with their references, for alternative ways:

- https://www.lexjansen.com/sesug/2020/SESUG2020_Paper_150_Final_PDF.pdf
- https://support.sas.com/resources/papers/proceedings15/2219-2015.pdf

Hope this helps,
Ahmed

AhmedAl_Attar · ‎02-27-2024

Hi @alepage

Typically, such questions should be directed to your Oracle DBA, but here is what I was able to find on the page "Using Oracle Flashback Technology":

Using ORA_ROWSCN

ORA_ROWSCN is a pseudocolumn of any table that is not fixed or external. It represents the SCN of the most recent change to a given row; that is, the latest COMMIT operation for the row. For example:

SELECT ora_rowscn, last_name, salary
FROM employees
WHERE employee_id = 7788;

ORA_ROWSCN NAME SALARY
---------- ---- ------
    202553 Fudd   3000

The latest COMMIT operation for the row took place at approximately SCN 202553. To convert an SCN to the corresponding TIMESTAMP value, use the function SCN_TO_TIMESTAMP.

Hope this helps,
Ahmed
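From SAS, the same idea could be tried through explicit SQL pass-through. The step below is a sketch under assumptions: the macro variables ora_user, ora_pw, and ora_path and the alias last_commit_ts are hypothetical, and employees is simply the table from the Oracle example above. Oracle may be unable to convert older SCNs to a timestamp, so treat the result as approximate.

```sas
/* Sketch only: ask Oracle for the latest commit time seen in a table.      */
/* &ora_user, &ora_pw, &ora_path and last_commit_ts are hypothetical names. */
proc sql;
  connect to oracle (user=&ora_user password=&ora_pw path=&ora_path);

  select *
    from connection to oracle
      ( select scn_to_timestamp(max(ora_rowscn)) as last_commit_ts
          from employees );

  disconnect from oracle;
quit;
```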

AhmedAl_Attar · ‎02-27-2024

Hi @chandusaladi

Have a look at this 2017 article, "Doing More with SAS Enterprise Guide Automation" by @ChrisHemedinger. It may have something you can use.

AhmedAl_Attar · ‎02-27-2024

@Anshul2

Here is some example code:

data want;
  length Range_Start Range_End 8;
  Range_Start = '01Jan2023'd; Range_End = '31Jan2023'd; OUTPUT;
  Range_Start = '01Feb2023'd; Range_End = '15Feb2023'd; OUTPUT;
  Range_Start = '01Jun2023'd; Range_End = '30Jul2023'd; OUTPUT;
  FORMAT Range_Start Range_End date9.;
RUN;

data want_exp(KEEP=date);
  Set want;
  do i = Range_Start to Range_End;
    date = i;
    output;
  end;
  FORMAT date date9.;
run;

Use the "want_exp" table in your joins (SQL/Hash) with your 5 million records, using equi-joins (=) rather than BETWEEN (>= & <=).
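A hash-based equi-join against the expanded table could then look roughly like this. It is a sketch under assumptions: work.big (the 5-million-row table), its date variable txn_date, and the output name work.matched are hypothetical; want_exp and date come from the code above.

```sas
/* Sketch only: exact-match lookup against the expanded date table.   */
/* work.big, txn_date and work.matched are hypothetical names.        */
data work.matched(drop=date);
  if 0 then set want_exp;            /* define the key variable at compile time */
  if _n_ = 1 then do;
    dcl hash h(dataset: 'want_exp');
    h.defineKey('date');
    h.defineDone();
  end;

  set work.big;
  /* keep only rows whose date falls inside one of the expanded ranges */
  if h.check(key: txn_date) = 0;
run;
```

Each lookup is a single exact-key probe, so the range comparison is done once when want_exp is built instead of once per row of the large table.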

AhmedAl_Attar · ‎02-27-2024

@Anshul2

I would recommend expanding the 500-record data set into a slightly larger table by explicitly listing the values within each range. That is, instead of having only Range_Start and Range_End columns, add a third column holding the actual value, then do an exact match from your 5-million-row table to the new expanded lookup (want) table. This way, the range expansion happens once, rather than 5 million times as it does now in your left join and hash join.

Hope this helps,
Ahmed

AhmedAl_Attar · ‎02-23-2024

Hi @bhca60

Have a look at the attached PDF file; it has some visual illustrations of various SQL operations and statements. Note: some of the content (for example, window functions) may not be supported by SAS PROC SQL, but for the most part it should give you a good overview of how SQL works.

Hope this helps,
Ahmed

AhmedAl_Attar · ‎02-23-2024

If anyone is interested, here is the link to the DuckDB GitHub issue: https://github.com/duckdb/duckdb/issues/10805

