Hi all,
I have a 160 GB dataset that I am trying to partition by a column.
I tried doing it through VA, but it becomes unresponsive and does not yield any result.
I tried Enterprise Guide, but the partition step ran for a whole day. The work area (where the partitioned dataset was directed) gradually filled up, but the data step never finished.
Is there a workaround in SAS VA/EG, such as splitting the dataset -> partitioning -> then combining?
Please let me know if anyone has any input on this.
If you did it with EG, could you please show us the code?
Hi Kurt,
This is the code (details modified a bit).
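/* SASHDAT engine: tables stored in HDFS */
libname mylibhe sashdat path="/hps/expl/mylib" server="xyz560n.abc.xyz.net" install="/opt/sas/software/TKGrid";
/* SASIOLA engine: in-memory tables on the LASR Analytic Server */
libname mylible sasiola port=10404 tag="hps.expl.mylib" host="xyz560n.abc.xyz.net";

/* Copy the in-memory table into WORK */
data myds_part;
  set mylible.myds;
run;

/* Write the WORK copy to HDFS, partitioned by my_col */
data mylibhe.myds_part(partition=(my_col));
  set work.myds_part;
run;

/* Refresh metadata for the library: existing table definitions are not updated; print a report */
proc metalib;
  omr (library="VA HDFS MyLib Explore");
  update_rule=(noupdate);
  report;
run;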
So both mylible and mylibhe reside on the remote server?
In that case, think of 160 GB divided by the network bandwidth, and you have your answer.
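As a rough illustration of that division, the data step below estimates how long a single pass of 160 GB takes at a few link speeds. The speeds are assumed values, not measurements from this environment, and the estimate ignores protocol overhead and disk speed. The posted code moves the data twice (LASR to WORK, then WORK to HDFS), so if WORK sits on a different machine the real elapsed time would be at least two such passes.

```sas
/* Back-of-the-envelope transfer time for one pass over the network. */
/* The link speeds in the DO list are assumptions for illustration.  */
data _null_;
  size_gb = 160;                                  /* dataset size from the original post */
  do mbit_per_sec = 100, 1000, 10000;             /* hypothetical link speeds */
    seconds = size_gb * 8 * 1000 / mbit_per_sec;  /* GB -> megabits (decimal units) */
    hours   = seconds / 3600;
    put mbit_per_sec= hours= 8.2;                 /* results are written to the SAS log */
  end;
run;
```

On a slow or shared link, the single-threaded data step can easily turn this into many hours, which would be consistent with a step that runs all day.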
What is the purpose of splitting it in the first place? First, I would check how much free memory you have on the SAS VA server you are trying to load to. If you don't have at least 160 GB free, then partitioning it won't reduce the space needed.
Also consider why you need to load such a large table. Can you reduce space by reducing character column lengths?
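A minimal sketch of how the column-length check could look, assuming the copy in work.myds_part from the code above is available. The names my_char_col and work.myds_slim and the length 40 are hypothetical placeholders, not taken from the real table. The query scans the whole table, so it is not cheap at this size.

```sas
/* Find the longest value actually stored in a character column. */
/* my_char_col is a hypothetical column name.                     */
proc sql;
  select max(length(my_char_col)) as longest_value
  from work.myds_part;
quit;

/* Rebuild the table with a shorter declared length.              */
/* 40 is a placeholder: use the maximum reported by the query.    */
data work.myds_slim;
  length my_char_col $ 40;   /* must come before SET to take effect */
  set work.myds_part;
run;
```

Comparing longest_value with the declared length shown by PROC CONTENTS indicates how much space each row wastes on padding. SAS typically logs a warning about multiple lengths when a column is shortened this way, and values longer than the new length are truncated, so the measured maximum should be checked first.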