Hi Everyone,
My data is about 5 GB with 20 columns, and I run the code below.
At first, I don't create a NEW file and simply overwrite using: data return_cal_new2; set return_cal nobs=nobs;
It takes 30+ seconds, and in Task Manager I see my SSD usage at 100% (my SSD has 500+ GB free). It is so weird.
OK, I change the name to return_cal_NEW
(as in the code below) and run it for the first time. Now it takes only 3.4 seconds, with about 70% SSD usage. Great!
Looks like overwriting is the issue.
Without deleting the NEW file, I rerun the code a 2nd time, and again it takes 3.3 seconds with about 70% SSD usage.
Without deleting the NEW file, I rerun the code a 3rd time, and the weird behavior shows up again: 50 seconds and 100% SSD usage.
I am really curious what is going on here. Of course, I tried several times and the pattern is similar.
Is this because of my PC (spec: Intel 13700, 64 GB DDR5) or because of SAS (9.4)? Should I reinstall SAS?
Any explanation is very much appreciated.
HHC
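data return_cal2;
  set return_cal;
  /* Only rows with a non-missing entry_time are evaluated. */
  if entry_time ^= . then do;
    /* Gap in minutes between clock_time and entry_time:        */
    /* flag if <= 10 at or before 9:40, or <= 5 after 9:40.     */
    if clock_time <= '9:40't and (entry_time - clock_time)/60 <= 10 then fail = 1;
    else if clock_time > '9:40't and (entry_time - clock_time)/60 <= 5 then fail = 1;
    else fail = .;
  end;
run;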
This used to be common with SSDs: while the SSD does its internal space management, it can slow down significantly.
Read about wear leveling and TRIM to learn more.
Recent SSDs and operating systems address this well though, so seeing this nowadays is uncommon.
Another reason could be something to do with how the SSD manages its cache. Again, this issue is now much rarer than it used to be.
Note that writing 5 GB in 3 seconds is very fast, so I'd say that the most likely explanation is that the OS caches the data in memory. When the cache is full, actual disk operations take place and that's much slower. 5 GB should take 10 to 15 seconds to read and then the same to write on a SATA SSD, so 30 s seems about right.
SAS issues exactly the same code each time, so I doubt the issue is SAS-related.
If you want faster I/O, replace your SATA SSD (if that's what you have) with an NVMe (PCIe) SSD. Note that M.2 is only the form factor; an M.2 drive can still be SATA.
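To check whether the time really goes to disk I/O rather than to SAS processing, it may help to turn on the FULLSTIMER system option and compare the timings in the log across runs. A minimal sketch, assuming return_cal sits in the WORK library as in the original post:

```sas
/* Assumption: return_cal is in WORK, as in the original post. */
options fullstimer;    /* log real time, CPU time and memory per step */

data return_cal2;
  set return_cal;
run;

options nofullstimer;  /* switch back to the shorter log notes */
```

If real time is much larger than user plus system CPU time on the slow runs, the step is most likely waiting on the disk or the OS cache, not on SAS.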
Thank you for your response.
My SSD is a Samsung 990 PRO 2TB internal SSD, PCIe Gen 4x4 NVMe.
I really don't know what is going on.
Is there any memory setting in SAS that can alleviate the issue?
Thanks
When you tell SAS to overwrite an existing dataset it will
1) Make a NEW file using a temporary name.
2) Delete the old file.
3) Rename the new file to the old name.
So when the step creates a new dataset, the last two steps are not needed. The rename takes minimal I/O, but how long it takes to delete a file depends on how the disk manages the space freed up by the deletion.
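The three steps above can be reproduced by hand to see which one is slow. This is only a sketch that mirrors the described sequence, not what SAS runs internally; it assumes the dataset lives in WORK, and return_cal_tmp is a hypothetical temporary name:

```sas
options fullstimer;  /* time each step separately in the log */

/* 1) Write the new file under a temporary (hypothetical) name. */
data work.return_cal_tmp;
  set work.return_cal;
run;

/* 2) Delete the old file, then 3) rename the new file to the old name. */
proc datasets library=work nolist;
  delete return_cal;
  change return_cal_tmp=return_cal;
quit;
```

If the DATA step stays fast and the PROC DATASETS step is the one that takes tens of seconds, the delay comes from the delete rather than from writing the data.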