02-24-2016 02:12 AM
I have a requirement to create multiple CSV files of 1 GB each from a SAS dataset. The records that make up the first 1 GB of data in my SAS dataset should go into the first file, the next 1 GB of data into the second file, and so on.
Kindly let me know if more details are required.
Any help on a solution to the requirement is highly appreciated.
02-24-2016 02:33 AM
You can use the FILEVAR= option in the FILE statement in your DATA _NULL_ step to define a variable that holds the name of the current file to be written.
Then RETAIN a counter variable that starts at 0 and is incremented in every iteration of the data step by the size of the current record. You need to calculate that size yourself: either add up the lengths of your output variables (plus the delimiters), or build the output line in a single variable and use the length() function on it to get the number of bytes that iteration will add. Don't forget the line separator.
Once the counter reaches 1G, change the value of the FILEVAR variable and reset the counter to zero.
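The steps above could be sketched roughly as follows. This is only an illustration under assumptions: the dataset WORK.HAVE, its variables id (numeric), name (character) and amount (numeric), and the output path /tmp/out are all made up, and "1 GB" is taken to mean 1024**3 bytes. The values are not quoted, so it also assumes no field contains a comma.

```sas
/* Sketch only: WORK.HAVE, the variables id, name and amount, and the
   path /tmp/out are hypothetical; replace them with your own. */
data _null_;
  set work.have;
  length fname $ 200 line $ 32767;
  retain fileno 1 bytes 0;

  /* Build the output line in one variable so it can be measured.
     Plain concatenation keeps empty fields (no quoting is done here). */
  line = strip(put(id, best12.)) || ',' || strip(name) || ',' ||
         strip(put(amount, best12.));
  len    = lengthn(line);   /* bytes in the line itself */
  reclen = len + 2;         /* +2 for the CRLF requested with TERMSTR= */

  /* Start a new file if this record would push the current one past 1 GB */
  if bytes > 0 and bytes + reclen > 1024**3 then do;
    fileno = fileno + 1;
    bytes  = 0;
  end;

  /* A new value in the FILEVAR= variable closes the current file
     and opens the next one; CSVOUT is only a placeholder fileref. */
  fname = cats('/tmp/out/part_', put(fileno, z3.), '.csv');
  file csvout filevar=fname lrecl=32767 termstr=crlf;

  put line $varying32767. len;   /* write exactly LEN bytes of LINE */
  bytes = bytes + reclen;
run;
```

If each file needs a header row, it would have to be written (and counted) whenever bytes is 0, right after the FILE statement.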
02-24-2016 04:11 AM
That's a fairly large amount of data you're dealing with, then, if it's multiples of 1 GB in a CSV. May I ask why you need to split it into multiple files? A couple of other ideas:
Generate one CSV and stream it via HTTP
Generate one CSV and ZIP it, using the built-in feature of WinZip/RAR to split the archive into chunks of 1 GB
Reassess whether you really need to send that amount of data. What is its purpose? If the recipients are just summarising it, why not summarise at your end and send the summarised data?
Can you compress the data in any way? For instance, code longer text strings into small numbers and provide a code list or formats catalog with the data. You would be amazed at the amount of storage space saved merely by changing a 20-character string into a 3-digit coded value.
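As an illustration of the last idea, here is a minimal sketch of coding a text column into numbers with PROC FORMAT. Everything named here is an assumption: WORK.HAVE, its character variable city (taken to be at most 40 characters), and the names CITYIN and CITYOUT are hypothetical.

```sas
/* Sketch only: WORK.HAVE and its character variable CITY are hypothetical. */
proc sort data=work.have(keep=city) out=work.codes nodupkey;
  by city;
run;

/* Two control data sets: an informat (text to code) and a format (code to text) */
data work.ctl_in work.ctl_out;
  set work.codes;
  length fmtname $ 32 type $ 1 start label $ 40;
  keep fmtname type start label;
  fmtname = 'CITYIN';  type = 'I'; start = city;      label = cats(_n_);
  output work.ctl_in;
  fmtname = 'CITYOUT'; type = 'N'; start = cats(_n_); label = city;
  output work.ctl_out;
run;

proc format cntlin=work.ctl_in;  run;
proc format cntlin=work.ctl_out; run;

/* Replace the long text column with its numeric code */
data work.coded;
  set work.have;
  city_code = input(city, cityin.);
  drop city;
run;
```

WORK.CTL_OUT (or the CITYOUT format itself) would then serve as the code list to send along with the data, so the recipient can map each code back to its text.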