Hi all,
I have a 160 GB dataset that I am trying to partition on a column.
I tried doing it through VA, but it becomes unresponsive and does not yield any result.
I tried Enterprise Guide, but the partition step ran the whole day. The work area (where the partitioned dataset was directed) was gradually filling up, but the DATA step never finished.
Is there a workaround in SAS VA/EG, such as splitting the dataset -> partitioning -> then combining?
Please let me know if anyone has any input on this.
If you did it with EG, could you please show us the code?
Hi Kurt,
This is the code (details modified a bit):
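/* SASHDAT engine: tables stored in HDFS */
libname mylibhe sashdat path="/hps/expl/mylib" server="xyz560n.abc.xyz.net" install="/opt/sas/software/TKGrid";
/* SASIOLA engine: in-memory tables on the LASR Analytic Server */
libname mylible sasiola port=10404 tag="hps.expl.mylib" host="xyz560n.abc.xyz.net";

/* Step 1: copy the in-memory table into WORK */
data myds_part;
  set mylible.myds;
run;

/* Step 2: write the WORK copy to HDFS, partitioned by my_col */
data mylibhe.myds_part(partition=(my_col));
  set work.myds_part;
run;

/* Step 3: report metadata differences for the library without applying updates */
proc metalib;
  omr (library="VA HDFS MyLib Explore");
  update_rule=(noupdate);
  report;
run;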
So both mylible and mylibhe reside on the remote server?
In that case, think of 160 GB divided by network bandwidth, and you have your answer.
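To put a rough number on that, here is a back-of-the-envelope sketch in a DATA _NULL_ step. The 1 Gbit/s link speed and the two network passes (one read into WORK, one write back to HDFS) are assumptions for illustration, not measurements from this environment, so substitute your own values.

```sas
/* Rough transfer-time estimate; link speed and pass count are assumed values */
data _null_;
  size_gb   = 160;  /* dataset size from the original post */
  link_gbps = 1;    /* assumed network speed in gigabits per second */
  passes    = 2;    /* assumed: one read into WORK, one write to HDFS */
  /* 8 bits per byte; ignores protocol overhead and disk speed */
  seconds = size_gb * 8 / link_gbps * passes;
  hours   = seconds / 3600;
  put 'Estimated transfer time: ' hours 6.2 ' hours';
run;
```

With these assumed values the ideal-case figure is roughly 43 minutes. Real throughput is usually well below line speed, so a measured rate will give a more realistic estimate.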
What is the purpose of splitting it in the first place? First off, I would check how much free memory you have on the SAS VA server you are trying to load to. If you don't have at least 160 GB free, then partitioning it won't reduce the space needed.
Also consider why you need to load such a large table. Can you reduce space by reducing character column lengths?
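As a minimal sketch of the column-length idea, the steps below find the longest value actually stored in one character column and rebuild the table with that length. The column name big_char_col and the output table work.myds_small are hypothetical, and the sketch assumes the WORK copy from the posted code (work.myds_part) already exists.

```sas
/* Find the longest stored value; big_char_col is a hypothetical column name */
proc sql noprint;
  select max(length(big_char_col)) into :maxlen trimmed
  from work.myds_part;
quit;

%put NOTE: longest value in big_char_col is &maxlen characters;

/* Rebuild with the shorter length. LENGTH must come before SET
   so that it overrides the length stored in the input table. */
data work.myds_small;
  length big_char_col $ &maxlen;
  set work.myds_part;
run;
```

SAS will likely log a warning about multiple lengths for the variable. Since the new length matches the longest stored value, nothing should be truncated, but it is worth checking the log and spot-checking the data before dropping the original.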