Solved: Re: How to count observations with the same value before each observat...

tediest · Posted 09-22-2012 03:58 PM

Sorry for the obscure title.

I have a data like this:

person	time	brand
1	100	A
1	200	A
1	300	B
1	400	C
2	100	A
2	200	B
2	300	A
2	400	B

I would like to create a new variable for each observation, representing the number of times the corresponding "brand" has occurred before for the given person. So the values of the new variable should be: 0, 1, 0, 0, 0, 0, 1, 1. As the data are pretty big, it is inefficient to do this via loops.

Would someone please help me with this? Thanks a lot in advance!

Linlin · Posted 09-22-2012 06:19 PM

Hi,

The code below works for the sample data you provided, but I am not sure it will work for all your data. :smileysilly:

data x;

input person brand $char20.;

datalines;

1 KELLOGG

1 POST

1 KASHI GO

2 KELLOGG

2 G MILLS

2 KELLOGG

;

run;

data y (drop=all);

length all $100;

do until(last.person);

set x;

by person;

howmany = count(all,compress(brand));

all = catx(', ',all,compress(brand));

output;

end;

run;

proc print;run;

obs person brand howmany

1 1 KELLOGG 0

2 1 POST 0

3 1 KASHI GO 0

4 1 KASHI GO 1

5 2 KELLOGG 0

6 2 G MILLS 0

7 2 G MILLS 1

8 2 G MILLS 2

9 2 G MILLS 3

10 2 G MILLS 4

11 2 G MILLS 5

12 2 KELLOGG 1

13 2 KELLOGG 2

14 2 KELLOGG 3

View solution in original post

MikeZdeb · Posted 09-22-2012 04:09 PM

hi ... one way ... don't worry about the LOOP since all it does is read the data set one person at a time ...

data x;

input person time brand :$1. @@;

datalines;

1 100 A 1 200 A 1 300 B 1 400 C

2 100 A 2 200 B 2 300 A 2 400 B

;

data y (drop=all);

length all $100;

do until(last.person);

set x;

by person;

howmany = countc(all,brand);

all = catt(all,brand);

output;

end;

run;

person time brand howmany

1 100 A 0

1 200 A 1

1 300 B 0

1 400 C 0

2 100 A 0

2 200 B 0

2 300 A 1

2 400 B 1

tediest · Posted 09-22-2012 04:25 PM

Thanks a lot Mike. It is pretty efficient

However there is one problem if there are multiple words contained in "brand". I changed "countc" to "count". But sometimes the "catt" function would connect two words as one, and "count" would not be able to identify the correct number of times the same brand has occurred before. Are there ways to avoid this problem? Thanks!

Linlin · Posted 09-22-2012 05:16 PM

try this:

data x;

input person time brand :$2. @@;

datalines;

1 100 AB 1 200 AB 1 300 B 1 400 C

2 100 A 2 200 BB 2 300 AB 2 400 BB

;

data y (drop=all) ;

length all $100;

do until(last.person);

set x;

by person;

howmany = count(all,brand);

all = catx(', ',all,brand,'s');

output;

end;

run;

proc print;run;

tediest · Posted 09-22-2012 05:22 PM

Thanks Linlin.

I tried this but there still appear to be problems with some observations. For instance I have one line with: all="KELLOGG, s, G MILLS, s, G MILLS, s, G MILLS, s, G MILLS, s" and brand="G MILLS". However, count(all,brand) returns 0. It is really odd. Can this be because of the specific string?

Linlin · Posted 09-22-2012 05:33 PM

post a sample of your real data.

tediest · Posted 09-22-2012 05:53 PM

I extracted a small sample with useful variables (person, brand) only.

data x;

input person brand $char20.;

datalines;

1 KELLOGG

1 POST

1 KASHI GO

2 KELLOGG

2 G MILLS

2 KELLOGG

;

run;

Thanks a lot for your help.

Linlin · Posted 09-22-2012 06:19 PM

Hi,

The code below works for the sample data you provided, but I am not sure it will work for all your data. :smileysilly:

data x;

input person brand $char20.;

datalines;

1 KELLOGG

1 POST

1 KASHI GO

2 KELLOGG

2 G MILLS

2 KELLOGG

;

run;

data y (drop=all);

length all $100;

do until(last.person);

set x;

by person;

howmany = count(all,compress(brand));

all = catx(', ',all,compress(brand));

output;

end;

run;

proc print;run;

obs person brand howmany

1 1 KELLOGG 0

2 1 POST 0

3 1 KASHI GO 0

4 1 KASHI GO 1

5 2 KELLOGG 0

6 2 G MILLS 0

7 2 G MILLS 1

8 2 G MILLS 2

9 2 G MILLS 3

10 2 G MILLS 4

11 2 G MILLS 5

12 2 KELLOGG 1

13 2 KELLOGG 2

14 2 KELLOGG 3

tediest · Posted 09-22-2012 06:24 PM

It works great! "compress" did the magical work.

Thanks Linlin

Haikuo · Posted 09-22-2012 10:31 PM

I really like the LinLin and Mike's classic data step approach. I guess you could also use array() basing on the similar logic. Here is to show Hash() can also have the job done:

data x;

input person brand $char20.;

datalines;

1 KELLOGG

1 POST

1 KASHI GO

2 KELLOGG

2 G MILLS

2 KELLOGG

;

run;

data want;

dcl hash h(multidata:'y');

h.definekey('brand');

h.definedone();

do until (last.person);

set x;

by person;

_f=h.find();

howmany=0;

if _f=0 then

do _f=0 by 0 while (_f=0);

howmany+1;

_f=h.find_next();

end;

_a=h.add();

output;

end;

drop _:;

run;

proc print;run;

Haikuo

Ksharp · Posted 09-24-2012 04:39 AM

Bian Haikuo,

You can do it better.

data x;
input person brand $char20.;
datalines;
1 KELLOGG
1 POST
1 KASHI GO
1 KASHI GO
2 KELLOGG
2 G MILLS
2 G MILLS
2 G MILLS
2 G MILLS
2 G MILLS
2 G MILLS
2 KELLOGG
2 KELLOGG
2 KELLOGG
;
run;
data want;
if _n_ eq 1 then do;
 if 0 then set x;
 declare hash ha();
  ha.definekey('brand');
  ha.definedata('count');
  ha.definedone();
end;
 set x;
 by person;
 if first.person then ha.clear();
 if ha.find(key:brand)=0 then do;count+1;ha.replace();end;
  else do;count=0;ha.replace();end;
run;

Ksharp

Haikuo · Posted 09-24-2012 09:01 AM

It is better . Thanks for sharing, Keshan.

How to count observations with the same value before each observation?

Re: How to count observations with the same value before each observation?

Re: How to count observations with the same value before each observation?

Re: How to count observations with the same value before each observation?

Re: How to count observations with the same value before each observation?

Re: How to count observations with the same value before each observation?

Re: How to count observations with the same value before each observation?

Re: How to count observations with the same value before each observation?

Re: How to count observations with the same value before each observation?

Re: How to count observations with the same value before each observation?

Re: How to count observations with the same value before each observation?

Re: How to count observations with the same value before each observation?

Re: How to count observations with the same value before each observation?

Registration is open

SAS Training: Just a Click Away