Proc Append works when the base dataset does not exist, does Proc Sql insert into work in this situation?
If you want to code everything in Hive QL then just execute a Create Table IF NOT EXISTS before your Insert.
Btw: I believe if the source and target table are both in Hadoop (or similar) then Proc Append will generate code that executes in-database.
I'm less convinced that if the target (base) table doesn't exist SAS would create it with the exact same data types like the source table - but it's something worth testing as if it does then using Proc Append would be the easiest way to code what you're after - unless you need more control like loading into a specific partition or the like.
No, you'll get an error. SQL requires the table to pre-exist.
proc sql;
insert into class
select * from sashelp.class;
quit;
data class;
set sashelp.class (obs=0);
run;
proc sql;
insert into class
select * from sashelp.class;
quit;
Log:
68 69 proc sql; 70 insert into class 71 select * from sashelp.class; ERROR: File WORK.CLASS.DATA does not exist. NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements. 72 quit; NOTE: The SAS System stopped processing this step because of errors. NOTE: PROCEDURE SQL used (Total process time): real time 0.00 seconds user cpu time 0.00 seconds system cpu time 0.00 seconds memory 331.87k OS Memory 23712.00k Timestamp 2020-06-04 01:45:24 AM Step Count 30 Switch Count 0 Page Faults 0 Page Reclaims 16 Page Swaps 0 Voluntary Context Switches 0 Involuntary Context Switches 0 Block Input Operations 0 Block Output Operations 0 73 74 75 data class; 76 set sashelp.class (obs=0); 77 run; NOTE: There were 0 observations read from the data set SASHELP.CLASS. NOTE: The data set WORK.CLASS has 0 observations and 5 variables. NOTE: DATA statement used (Total process time): real time 0.00 seconds user cpu time 0.01 seconds system cpu time 0.00 seconds memory 830.50k OS Memory 24228.00k Timestamp 2020-06-04 01:45:24 AM Step Count 31 Switch Count 2 Page Faults 0 Page Reclaims 165 Page Swaps 0 Voluntary Context Switches 9 Involuntary Context Switches 0 Block Input Operations 0 Block Output Operations 264 78 79 80 proc sql; 81 insert into class 82 select * from sashelp.class; NOTE: 19 rows were inserted into WORK.CLASS. 83 quit; NOTE: PROCEDURE SQL used (Total process time): real time 0.00 seconds user cpu time 0.00 seconds system cpu time 0.00 seconds memory 5741.34k OS Memory 29352.00k Timestamp 2020-06-04 01:45:24 AM Step Count 32 Switch Count 0 Page Faults 0 Page Reclaims 98 Page Swaps 0 Voluntary Context Switches 0 Involuntary Context Switches 0 Block Input Operations 0 Block Output Operations 0
@ARC2 wrote:
Proc Append works when the base dataset does not exist, does Proc Sql insert into work in this situation?
If you need to only use SQL:
proc sql;
%sysfunc(ifc(%sysfunc(exist(CLASS))
,insert into CLASS
,create table CLASS as
))
select * FROM SASHELP.CLASS;
quit;
Note that proc append is probably much faster than SQL inserts.
That is the answer to the question I asked, but I need for it to work in hiveql and all the macro stuff won't work there. Still, thanks!
If you want to code everything in Hive QL then just execute a Create Table IF NOT EXISTS before your Insert.
Btw: I believe if the source and target table are both in Hadoop (or similar) then Proc Append will generate code that executes in-database.
I'm less convinced that if the target (base) table doesn't exist SAS would create it with the exact same data types like the source table - but it's something worth testing as if it does then using Proc Append would be the easiest way to code what you're after - unless you need more control like loading into a specific partition or the like.
I want to loop through and create the dataset in the first pass through the loop and then keep adding to it in subsequent passes. It will exist in the second pass so this won't work for me. Great idea though, thanks.
@ARC2 wrote:
I want to loop through and create the dataset in the first pass through the loop and then keep adding to it in subsequent passes. It will exist in the second pass so this won't work for me. Great idea though, thanks.
Not sure that I understand. Are we now dealing with a table in Hadoop or a SAS dataset? If it's a table in Hadoop then because of IF NOT EXISTS you can call the Create Table statement as many times as you wish. It will only ever actually execute and create the table if it doesn't exist already.
Not sure where your loop comes comes into the picture - are you executing a separate insert statement row by row? If so then I'd certainly would execute the Create Table before such a loop - but it should even work if executed within the loop.
Everything is in Hadoop.
Say, I create data set 1, then add it data set all (which doesn't exist at this point), then I create data set 2, then add it to all (which now exists), then create dataset 3 and add it to dataset all, and so on. I know how to do this with proc append, but not with sql. Will the code above do this?
On this very high level: Yes, it does.
You're now again using the term "dataset". This term is used for SAS tables but not for Hive tables.
It's may be worth if you explain us a bit more in detail what you have and what you need. Also tell us always where the data source and target is stored (SAS or Hadoop) and what SAS client you're using to implement ("loop" makes me think it could be DIS).
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.