Solved
Contributor
Posts: 57

# Reducing the size of the data via dummies

Dear All,

I have a dataset (attached) with observation ids (obs) and a set of codes for each id (comcode). Some obs may have only one comcode and some others may have as many as 10. My goal is to transform the data the following way:

FROM:                             obs         comcode

1              251

1              182

1              134

2              251

2              134

2              182

3              186

3              251

3              134

3              182

4              134

4              182

TO:

obs c102   c104   c106 ......c134......c182.......c186.......

1     0           0        0          1            1            0

2     0           0         0         1            1            0

3     0           0         0         1            1            1

.

.

.

In other words, I want to reduce the size of the dataset by creating dummies for each code. I tried transposing the data, but it created errors due to repeating codes. I would appreciate your help.

Thanks,

Accepted Solutions
Solution
‎01-28-2013 11:22 AM
Posts: 3,852

## Re: Reducing the size of the data via dummies

This should be about right.  Each cell will be the frequency.

libname me '~';
proc summary data=me.s7_comm completetypes nway;

class obs comcode;
output out=counts(drop=_type_);
run;

proc transpose data=counts prefix=CODE;
by obs;
id comcode;
var _freq_;
run;

All Replies
Super Contributor
Posts: 1,636

## Re: Reducing the size of the data via dummies

example:

data have;

input obs comcode;

cards;

1 2

1 6

1 9

2 5

2 8

2 10

3 4

3 8

3 9

3 20

;

data want(drop=i);

set have;

by obs;

retain c1-c20 0;

array _com(*) c1-c20;

do i=1 to dim(_com);

if comcode=i then _com(i)=1;

end;

if last.obs then do;

output;

do i=1 to dim(_com);

_com(i)=0;

end;

end;

proc print;run;

o

m

c

O  o   o                             c  c  c  c  c  c  c  c  c  c  c

b  b   d  c  c  c  c  c  c  c  c  c  1  1  1  1  1  1  1  1  1  1  2

s  s   e  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5  6  7  8  9  0

1  1   9  0  1  0  0  0  1  0  0  1  0  0  0  0  0  0  0  0  0  0  0

2  2  10  0  0  0  0  1  0  0  1  0  1  0  0  0  0  0  0  0  0  0  0

3  3  20  0  0  0  1  0  0  0  1  1  0  0  0  0  0  0  0  0  0  0  1

Super User
Posts: 6,761

## Re: Reducing the size of the data via dummies

You do have to realize, though, that the output you describe probably increases the size of your data set.  SAS will store all numerics using 8 bytes, so it takes just as much storage to hold a single digit as to hold 3 digits.  And you could have many more numbers to store than you started with.  You might affect that using compression (or alternatively, understanding how SAS stores numerics and applying the right length to your variables), or perhaps storing your data as character instead of numeric.  But transposing and converting to flags may backfire.

Contributor
Posts: 57

## Re: Reducing the size of the data via dummies

Thank you for your help and suggestions. I am going to try the code suggested by Linlin. Regarding Astounding's question, I realized that I was not clear in asking this question. By reducing the size, I actually meant reducing the number of rows. I have multiple observations for each id (obs), I want to put all of it in one single row. The reason why I am interested in this transformation is to change my dataset to a panel date setting that allows me to have dummies for each code (comcode).

Best,

Solution
‎01-28-2013 11:22 AM
Posts: 3,852

## Re: Reducing the size of the data via dummies

This should be about right.  Each cell will be the frequency.

libname me '~';
proc summary data=me.s7_comm completetypes nway;

class obs comcode;
output out=counts(drop=_type_);
run;

proc transpose data=counts prefix=CODE;
by obs;
id comcode;
var _freq_;
run;
Super Contributor
Posts: 1,636

## Re: Reducing the size of the data via dummies

Hi DN,

proc transpose data=counts prefix=CODE out=wanted_dataset;
by obs;
id comcode;
var _freq_;
run;

Contributor
Posts: 57