dataset size efficiency?

Contributor
Posts: 46

Hello, I have a question.

I have created a matrix, but I actually use only half of it. Specifically, my output is a triangular matrix, so half of the cells hold nothing.

To include more observations in my calculation, I need more memory space in the dataset.

I was hoping that I could somehow set the other half of my matrix to "empty", so that the spare space could accommodate more observations.

I tried setting the unused cells to missing, but that actually increases the size of the dataset.

Can someone help me with this issue? Thank you.


Super User
Posts: 3,105

Re: dataset size efficiency?

Check out compression of your data. It should compact missing data, although it works better for large character variables.

data out (compress = yes);
  set in;
run;

Super User
Posts: 5,256

Re: dataset size efficiency?

Can you be a bit more specific about your matrix? Are you in SAS/IML, or are you using a data step, arrays...?

If you are using data step variables, it sounds strange that setting them to missing would enlarge the table (they are missing by default).

Using compress=binary on the output table might reduce disk storage (but not memory use).
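
A minimal sketch of the compress=binary suggestion, assuming the triangular matrix ends up in a table of numeric columns. The table names work.tri and work.tri_bin, the variable names x1-x200, and the fill values are made up for illustration. After the second step the SAS log should carry a note about how much the compression changed the size, and PROC CONTENTS lists the compression setting.

```sas
/* Hypothetical test table: row i holds values in x1..xi, the rest stay missing */
data work.tri;
  array x{200} x1-x200;
  do i = 1 to 200;
    call missing(of x{*});      /* reset the whole row to missing */
    do j = 1 to i;
      x{j} = i + j;             /* placeholder values for the lower triangle */
    end;
    output;                     /* write one row of the matrix */
  end;
  drop i j;
run;

/* Copy the table with binary compression switched on */
data work.tri_bin (compress=binary);
  set work.tri;
run;

/* Inspect the result: the "Compressed" attribute should show the setting */
proc contents data=work.tri_bin;
run;
```

This only affects the table on disk. When the table is read back, each observation is expanded to its full width again.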

Data never sleeps
Contributor
Posts: 46

Re: dataset size efficiency?

I used SAS/IML to create the matrix, and as part of the procedure each number gets filled into a cell of the matrix.

So I don't know how compress() can work here. Let me know how to do it.

In short, I am not trying to reduce the size afterwards. I need to keep the database size efficient while the matrix is being created. Thank you for your help.
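
One way this could be approached inside SAS/IML is to never allocate the square matrix at all and to keep only the lower triangle in a packed column vector of n(n+1)/2 cells. The sketch below assumes a lower-triangular layout filled row by row. The order n and the fill values are placeholders, not the actual calculation. Element (i, j) with i >= j is stored at position i(i-1)/2 + j, which is the row-wise order that the SAS/IML functions SYMSQR and SQRSYM use.

```sas
proc iml;
  n = 4;                           /* hypothetical matrix order */
  packed = j(n*(n+1)/2, 1, 0);     /* one cell per lower-triangle element */

  do i = 1 to n;
    do j = 1 to i;
      k = i*(i-1)/2 + j;           /* position of (i,j) in row-wise packed storage */
      packed[k] = 10*i + j;        /* placeholder for the real calculation */
    end;
  end;

  /* Expand only if a square matrix is really needed. Note that SQRSYM   */
  /* returns a symmetric matrix: the upper triangle mirrors the lower.   */
  full = sqrsym(packed);
  print packed, full;
quit;
```

The packed vector needs roughly half the cells of the n x n matrix, so for the same amount of memory a larger n should fit. Whether this helps depends on whether the later steps can work from the packed form without calling SQRSYM.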
