Pyrite | Level 9

## excessive increase in table sizing

I have a table sizing problem that I don't understand. doing a simple left join with a small table i added a variable to a table. The strange thing is that this new table with an extra variable (13 instead of 12) has gone from 8.54Gb to 21.17Gb! I can't figure it out, I think the physical memory should have increased only a little.

I report the complete results of the dictionary of the 2 tables in question (note the "filesize" variable)

1 ACCEPTED SOLUTION

Accepted Solutions
Super User

## Re: excessive increase in table sizing

Then yes, that is correct.

You have 67,298,471 observations X 200 characters/bytes = 13,459,694,200 bytes = 13.5 GB

10 REPLIES 10
Super User

## Re: excessive increase in table sizing

What is the length of the new character variable added?

Pyrite | Level 9

200

Super User

## Re: excessive increase in table sizing

Then yes, that is correct.

You have 67,298,471 observations X 200 characters/bytes = 13,459,694,200 bytes = 13.5 GB

Super User

## Re: excessive increase in table sizing

Show the code for your "simple join".  It is likely that you added records, at a guess about doubling the number of records, because of the particular join.

Consider this small example. A "simple join" of two data sets of 6 observations each but the result has 16 observations.

```data one;
input a b;
datalines;
1  1
1  2
1  3
2  11
2  22
3  33
;

data two;
input a z;
datalines;
1 111
1 222
1 333
2 1111
2 2222
2 3333
;

proc sql;
create table example as
select one.a, one.b, two.z
from one
left join
two
on one.a = two.a
;
quit;```

Table size for many discussions is more about the number of observations and variables than bytes.

Pyrite | Level 9

## Re: excessive increase in table sizing

this is the code. I also report the data from the dictionary columns of the 2 tables in question

``````proc sql;
create table RUOLO_DANNI2  as
select RU.*
,C.DES_EDWH as RUOLO_DES
from RUOLO_DANNI1 RU
left join  EDWH_ODS.CODICE_TIPO_RUOLO C
on
RU.COD_TIPO_RUOLO = C.COD_EDWH
and C.ts_inizio_validita  <= ru.ts_inizio_validita <=  C.ts_fine_validita
and compress(C.cod_source)='1';
quit;``````

Super User

## Re: excessive increase in table sizing

You added 200 bytes to your obslen, indicating that the new variable is \$200 in length. This usually happens when the new variable is assigned a value from a function without properly defining it beforehand with a LENGTH statement.

Pyrite | Level 9

## Re: excessive increase in table sizing

I simply brought the variable forward from another table without specifying the length, thus inheriting the original length

``````proc sql;
create table RUOLO_DANNI2  as
select RU.*
,C.DES_EDWH as RUOLO_DES
from RUOLO_DANNI1 RU
left join  EDWH_ODS.CODICE_TIPO_RUOLO C
on
RU.COD_TIPO_RUOLO = C.COD_EDWH
and C.ts_inizio_validita  <= ru.ts_inizio_validita <=  C.ts_fine_validita
and compress(C.cod_source)='1';
quit;``````
Super User

## Re: excessive increase in table sizing

,C.DES_EDWH as RUOLO_DES LENGTH=\$20

Or whatever makes sense for you to try and reduce the size.
Super User

## Re: excessive increase in table sizing

Well, an almost tripling of the observation size will also result in tripling the dataset size.

Use the COMPRESS=YES dataset option if the new variable is mostly empty.

Super User

## Re: excessive increase in table sizing

Use the COMPRESS=YES system option (or COMPRESS=YES dataset option if you only want it to apply to one dataset) and it will take a lot less space on the disk.

Why attach a spreadsheet to share 8 numbers?

 memname nobs obslen nvar compress filesize RUOLO_DANNI1 67,298,471 136 12 NO 9,169,469,440 RUOLO_DANNI2 67,298,471 336 13 NO 22,734,503,936
Discussion stats
• 10 replies
• 456 views
• 1 like
• 5 in conversation