- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Good afternoon,
I am trying to split a table with 800 million records and 48 variables into 4 tables so we can reduce the size and send compressed via FTP. I am currently doing the split individually and creating the 4 files by the codes below. Is there an easy way to create the 4 files by just reading from the MASTERFILE once . Or,is there a more efficient code to do this as we need to break up several big tables . I am a basic user of SAS and appreciate any assistance.
Thank you as always!
% let header_1 = variable1......variable12
% let header_2 = variable13.....variable24
% let header_3=variable25......variable36
% let header_4 = variable37....variable48
data step FILE1;
set MASTERFILE(keep=&header_1);
run;
data step FILE2;
set MASTERFILE(keep=&header_2);
run;
data step FILE3;
set MASTERFILE(keep=&header_3);
run;
data step FILE4;
set MASTERFILE(keep=&header_4);
run;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @lmtamina Why not keep it simple by-
/*This reads only once*/
data FILE1 (keep=&header_1)
FILE2(keep=&header_2)
FILE3(keep=&header_3)
FILE4(keep=&header_4);
set MASTERFILE;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Is it a requirement to split it by variables instead?
https://gist.github.com/statgeek/abc3c6ce1dbeedb84fe7f11da0603cda
Blog post with several other methods here - but still row splitting, not column wise.
https://blogs.sas.com/content/sasdummy/2015/01/26/how-to-split-one-data-set-into-many/
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@lmtamina wrote:
We are splitting by column as I understand so we do not have to run the queries to produce the master table as it takes a long time to run. We are adding a unique variable to all the 4 files so when our vendor receive them they can tie the records together using this variable. Does that make sense?
Sounds scary to me actually. I also am having a hard time understanding how this is supposed to save time.
Is your "unique variable" a unique row identifier from the source for each?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It is not an issue of the 6 items. You really need to post the actual code and log.
I would check your macro variables and see if there's something there causing the error/issue or if it works with just the 6th data set.
data demo1 (keep = make model origin)
demo2 (keep = mpg_highway mpg_city)
demo3 (keep = Cylinders Type Weight Wheelbase Drivetrain EngineSize HorsePower)
demo4 (keep = Invoice MSRP)
demo5 (keep = Length Weight) demo6 (keep = Cylinders DriveTrain);
set sashelp.cars;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@lmtamina Can you "guarantee" LOTSSSSS of Memory(RAM). Sorry not meaning to shout with UPCASE, but I am really asking.
Also, what are the datatypes. Are those a mix of char/num. Is there a pattern in the arrangement?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @lmtamina Why not keep it simple by-
/*This reads only once*/
data FILE1 (keep=&header_1)
FILE2(keep=&header_2)
FILE3(keep=&header_3)
FILE4(keep=&header_4);
set MASTERFILE;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, /, ;, _DATA_, _LAST_, _NULL_.
data FILE1 (keep=&header_1)
FILE2(keep=&header_2)
FILE3(keep=&header_3)
FILE4(keep=&header_4);
set MASTERFILE;
run;
but I am
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Please post the complete log (including the definition of the macro variables) into a box opened with the </> button.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR
24
25 GOPTIONS ACCESSIBLE;
26 data FILE1 (keep=&file_1)
27 FILE2(keep=&file_2)
28 FILE3(keep=&file_3)
29 FILE4(keep=&file_4)
30 FILE5(keep=&file_5)
31 FILE6(keep=&file_6)
_
22
200
ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, /, ;, _DATA_, _LAST_, _NULL_.
ERROR 200-322: The symbol is not recognized and will be ignored.
32 ;
33 set datacagb.mxa1861_Clarify_med_claims_2018;
34 run;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.FILE1 may be incomplete. When this step was stopped there were 0 observations and 13 variables.
WARNING: Data set WORK.FILE1 was not replaced because this step was stopped.
WARNING: The data set WORK.FILE2 may be incomplete. When this step was stopped there were 0 observations and 9 variables.
WARNING: Data set WORK.FILE2 was not replaced because this step was stopped.
WARNING: The data set WORK.FILE3 may be incomplete. When this step was stopped there were 0 observations and 6 variables.
WARNING: Data set WORK.FILE3 was not replaced because this step was stopped.
WARNING: The data set WORK.FILE4 may be incomplete. When this step was stopped there were 0 observations and 6 variables.
WARNING: Data set WORK.FILE4 was not replaced because this step was stopped.
WARNING: The data set WORK.FILE5 may be incomplete. When this step was stopped there were 0 observations and 6 variables.
WARNING: Data set WORK.FILE5 was not replaced because this step was stopped.
WARNING: The data set WORK.FILE6 may be incomplete. When this step was stopped there were 0 observations and 9 variables.
WARNING: Data set WORK.FILE6 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.23 seconds
2 The SAS System 15:30 Thursday, October 15, 2020
user cpu time 0.01 seconds
system cpu time 0.02 seconds
memory 4823.65k
OS Memory 33496.00k
Timestamp 10/16/2020 06:02:46 PM
Step Count 53 Switch Count 94
Page Faults 0
Page Reclaims 345
Page Swaps 0
Voluntary Context Switches 426
Involuntary Context Switches 199
Block Input Operations 0
Block Output Operations 0
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
In case I have not made myself clear:
USE THE </> BUTTON FOR POSTING LOGS!
PS including the definition of the macro variables
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content