I have a dataset with brands that are ranked in the Interbrand list (top 100 most valuable brands). This list is presented each year, and I have the data from 2001 until 2011.
Coca Cola | 2001 | 68,945 | 003144 | KO |
Coca Cola | 2002 | 69,637 | 003144 | KO |
Coca Cola | 2003 | 70,453 | 003144 | KO |
Coca Cola | 2004 | 67,394 | 003144 | KO |
Coca Cola | 2005 | 67,525 | 003144 | KO |
Coca Cola | 2006 | 67,000 | 003144 | KO |
Coca Cola | 2007 | 65,324 | 003144 | KO |
Coca Cola | 2008 | 66,667 | 003144 | KO |
Coca Cola | 2009 | 68,734 | 003144 | KO |
Coca Cola | 2010 | 70,452 | 003144 | KO |
Coca Cola | 2011 | 71,861 | 003144 | KO |
This is an example of the data for Coca Cola. (Brand, Year, Interbrand Value (in bold), GVKEY and Ticker symbol)
I would like to create 4 portfolios with the following characteristics. 1: Interbrand value above 20 and increasing in value, 2: Interbrand value of 20 and decreasing in value, 3: Interbrand value below 10 and increasing in value, 4: Interbrand value below 10 and decreasing in value.
It should be possible that the portfolios change each year.
I hope somebody can help me to create this in SAS!
Thanks in advance!
Sorry again:smileysilly:. the previous one doesn't work . try this one:
data have;
brand='GE';
input Year Interbrand_Value;
cards;
2001 42
2002 41
2003 42
2004 44
2005 46
2006 18
2007 10
2008 8
2009 7
2010 9
;
proc sort data=have;
by brand year;
data want;;
set have;
by brand;
if first.brand then flag=1;
check=lag(Interbrand_Value);
run;
data p1 p2 p3 p4;
set want;
if flag=1 then delete;
if Interbrand_Value>20 then do;
if Interbrand_Value>check then output p1;
else output p2;
end;
else if Interbrand_Value<=10 then do;
if Interbrand_Value>check then output p3;
else output p4;
end;
run;
proc print data=p2;run;
Hi,
please check the code below for creating 4 datasets. also i created a macro for sorting the value in ascending and descending order. %sort is the call. Hope it helps.
data above20 below20 above10 below10;
input Brand &$10. Year Interbrand_Value :comma. GVKEY Ticker_symbol$;
if Interbrand_Value>20 then output above20;
if Interbrand_Value<=20 then output below20;
if Interbrand_Value>10 then output above10;
if Interbrand_Value<=10 then output below10;
cards;
Coca Cola 2001 68,945 003144 KO
Coca Cola 2002 69,637 003144 KO
Coca Cola 2003 70,453 003144 KO
Coca Cola 2004 67,394 003144 KO
Coca Cola 2005 67,525 003144 KO
Coca Cola 2006 67,000 003144 KO
Coca Cola 2007 65,324 003144 KO
Coca Cola 2008 66,667 003144 KO
Coca Cola 2009 68,734 003144 KO
Coca Cola 2010 70,452 003144 KO
Coca Cola 2011 71,861 003144 KO
;
run;
%macro sort(dsn=,descending=,var=,out=);
%if %lowcase(&descending)=descending %then %do;
proc sort data=&dsn out=&out;
by &descending &var;
run;
%end;
%else %do;
proc sort data=&dsn out=&out;
by &var;
run;
%end;
%mend;
%sort(dsn=,descending=,var=,out=)
I want to know how many members in each four portfolios you need ?
Ksharp
It is useful to see how you created the 4 dataset. In my case the portfolios however have two characteristics.
There are two portfolios where the brands have interbrand values above 20, in one situation the interbrand value increases (portfolio 1) and the other situation the interbrand value decreases (portfolio 2). The other two portfolios should have values below 10, portfolio 3 is increasing in value, and portfolio 4 decreasing. The first year that the portfolios can be created is in 2002, and each year the portfolio can be different. As an example: in 2002 Coca Cola should be in portfolio 1, and in 2004 in portfolio 2.
I think that there are about 150 brands in my list and it is desirable to have at least 10 brands in each portfolio. And it is not needed to have each brand in a portfolio.
I hope that somebody can give me some advice how to create these portfolios.
Thanks!
Is this what you want?
data have;
input Brand &$10. Year Interbrand_Value :comma. GVKEY Ticker_symbol$;
cards;
Coca Cola 2001 68,945 003144 KO
Coca Cola 2002 69,637 003144 KO
Coca Cola 2003 70,453 003144 KO
Coca Cola 2004 67,394 003144 KO
Coca Cola 2005 67,525 003144 KO
Coca Cola 2006 67,000 003144 KO
Coca Cola 2007 65,324 003144 KO
Coca Cola 2008 66,667 003144 KO
Coca Cola 2009 68,734 003144 KO
Coca Cola 2010 70,452 003144 KO
Coca Cola 2011 71,861 003144 KO
Coca bbbb 2001 18 003144 Ky
Coca bbbb 2002 20 003144 Ky
Coca bbbb 2003 15 003144 Ky
Coca bbbb 2004 21 003144 Ky
Coca bbbb 2005 16 003144 Ky
Coca bbbb 2006 17 003144 Ky
Coca bbbb 2007 14 003144 Ky
Coca bbbb 2008 20 003144 Ky
Coca bbbb 2009 21 003144 Ky
Coca bbbb 2010 16 003144 Ky
Coca bbbb 2011 18 003144 Ky
;
data p1 p2 p3 p4;
set have;
by Ticker_symbol;
if not first.Ticker_symbol;
if Interbrand_Value>20 then do;
if Interbrand_Value>lag(Interbrand_Value) then output p1;
else output p2; end;
else do;
if Interbrand_Value>lag(Interbrand_Value) then output p3;
else output p4; end;
run;
Linlin
Yes. It is clearer a little bit.
But I still can't understand what you mean totally. The following is based on my understanding.
data have; input Brand &$10. Year Interbrand_Value :commax. GVKEY Ticker_symbol$; cards; Coca Cola 2001 68,945 003144 KO Coca Cola 2002 69,637 003144 KO Coca Cola 2003 70,453 003144 KO Coca Cola 2004 67,394 003144 KO Coca Cola 2005 67,525 003144 KO Coca Cola 2006 67,000 003144 KO Coca Cola 2007 65,324 003144 KO Coca Cola 2008 66,667 003144 KO Coca Cola 2009 68,734 003144 KO Coca Cola 2010 70,452 003144 KO Coca Cola 2011 71,861 003144 KO Coca bbbb 2001 18 003144 Ky Coca bbbb 2002 2 003144 Ky Coca bbbb 2003 15 003144 Ky Coca bbbb 2004 21 003144 Ky Coca bbbb 2005 16 003144 Ky Coca bbbb 2006 17 003144 Ky Coca bbbb 2007 1 003144 Ky Coca bbbb 2008 20 003144 Ky Coca bbbb 2009 21 003144 Ky Coca bbbb 2010 16 003144 Ky Coca bbbb 2011 18 003144 Ky Kmart 2001 6,945 003144 KO Kmart 2002 6,637 003144 KO Kmart 2003 70,453 003144 KO Kmart 2004 7,394 003144 KO Kmart 2005 67,525 003144 KO Kmart 2006 67,000 003144 KO Kmart 2007 65,324 003144 KO Kmart 2008 6,667 003144 KO Kmart 2009 68,734 003144 KO Kmart 2010 70,452 003144 KO Kmart 2011 1,861 003144 KO Kappa 2001 18 003144 Ky Kappa 2002 2 003144 Ky Kappa 2003 15 003144 Ky Kappa 2004 2 003144 Ky Kappa 2005 16 003144 Ky Kappa 2006 17 003144 Ky Kappa 2007 14 003144 Ky Kappa 2008 2 003144 Ky Kappa 2009 21 003144 Ky Kappa 2010 1 003144 Ky Kappa 2011 18 003144 Ky ; run; proc sort data=have(where=(Interbrand_Value gt 20)) out=temp1;by Brand Year;run; data portfolio1; set temp1; by brand; if Interbrand_Value gt lag(Interbrand_Value) and not first.brand then output; run; proc sort data=have(where=(Interbrand_Value gt 20)) out=temp2;by Brand Year;run; data portfolio2; set temp2; by brand; if Interbrand_Value lt lag(Interbrand_Value) and not first.brand then output; run; proc sort data=have(where=(Interbrand_Value lt 10)) out=temp3;by Brand Year;run; data portfolio3; set temp3; by brand; if Interbrand_Value gt lag(Interbrand_Value) and not first.brand then output; run; proc sort data=have(where=(Interbrand_Value lt 10)) out=temp4;by Brand Year;run; data portfolio4; set temp4; by brand; if Interbrand_Value lt lag(Interbrand_Value) and not first.brand then output; run;
Ksharp
Thank you for your answers! I will try if I can make the portfolios with your help.
Maybe I can make my question more understandable.
The Interbrand value means how much the brand is worth in Millions. Here is a small overview of 3 brands, I did not include the GVKEY and the Ticker symbol as those numers have no influence on creating the portfolios. I need those values later on in my research.
Coca Cola | 2001 | 68,945 |
Coca Cola | 2002 | 69,637 |
Coca Cola | 2003 | 70,453 |
Coca Cola | 2004 | 67,394 |
Coca Cola | 2005 | 67,525 |
Coca Cola | 2006 | 67,000 |
Coca Cola | 2007 | 65,324 |
Coca Cola | 2008 | 66,667 |
Coca Cola | 2009 | 68,734 |
Coca Cola | 2010 | 70,452 |
Coca Cola | 2011 | 71,861 |
Microsoft | 2001 | 65,068 |
Microsoft | 2002 | 64,091 |
Microsoft | 2003 | 65,174 |
Microsoft | 2004 | 61,372 |
Microsoft | 2005 | 59,941 |
Microsoft | 2006 | 56,926 |
Microsoft | 2007 | 58,709 |
Microsoft | 2008 | 59,007 |
Microsoft | 2009 | 56,647 |
Microsoft | 2010 | 60,895 |
Microsoft | 2011 | 59,087 |
IBM | 2001 | 52,752 |
IBM | 2002 | 51,188 |
IBM | 2003 | 51,767 |
IBM | 2004 | 53,791 |
IBM | 2005 | 53,376 |
IBM | 2006 | 56,201 |
IBM | 2007 | 57,090 |
IBM | 2008 | 59,031 |
IBM | 2009 | 60,211 |
IBM | 2010 | 64,727 |
IBM | 2011 | 69,905 |
The first portfolio will be in 2002, because for creating the portfolios I need to know whether the interbrand value increased.
Portfolio 1: Interbrand value above 20 and increasing in value (interbrand value)
Portfolio 2: Interbrand value above 20 and decreasing in value
Portfolio 3: Interbrand value below 10 and increasing in value
Portfolio 4: Interbrand value below 10 and decreasing in value
I hope my question is clear now?
Hi,
Did you try the codes suggested? I believe my code works. for the sample data, would you please include some with interbrand value below 10 and post one sample output you are looking for?
I tried the codes Linlin, and yours did not work. Tomorrow I will figure out why it did not work. The code of Ksharp seems to work. I have to see whether it gives the results I want. I really want to thank you both for your help!
These are 3 brands with Interbrand values below 10.
Panasonic | 2001 | 3,490 | 007114 | PC |
Panasonic | 2002 | 3,141 | 007114 | PC |
Panasonic | 2003 | 3,257 | 007114 | PC |
Panasonic | 2004 | 3,480 | 007114 | PC |
Panasonic | 2005 | 3,714 | 007114 | PC |
Panasonic | 2006 | 3,977 | 007114 | PC |
Panasonic | 2007 | 4,135 | 007114 | PC |
Panasonic | 2008 | 4,281 | 007114 | PC |
Panasonic | 2009 | 4,225 | 007114 | PC |
Panasonic | 2010 | 4,351 | 007114 | PC |
Panasonic | 2011 | 5,047 | 007114 | PC |
Tiffany & Co | 2001 | 3,483 | 013646 | TIF |
Tiffany & Co | 2002 | 3,482 | 013646 | TIF |
Tiffany & Co | 2003 | 3,540 | 013646 | TIF |
Tiffany & Co | 2004 | 3,638 | 013646 | TIF |
Tiffany & Co | 2005 | 3,618 | 013646 | TIF |
Tiffany & Co | 2006 | 3,819 | 013646 | TIF |
Tiffany & Co | 2007 | 4,003 | 013646 | TIF |
Tiffany & Co | 2008 | 4,208 | 013646 | TIF |
Tiffany & Co | 2009 | 4,000 | 013646 | TIF |
Tiffany & Co | 2010 | 4,127 | 013646 | TIF |
Tiffany & Co | 2011 | 4,498 | 013646 | TIF |
BP | 2001 | 3,247 | 002410 | BP |
BP | 2002 | 3,390 | 002410 | BP |
BP | 2003 | 3,582 | 002410 | BP |
BP | 2004 | 3,662 | 002410 | BP |
BP | 2005 | 3,802 | 002410 | BP |
BP | 2006 | 4,010 | 002410 | BP |
BP | 2007 | 3,794 | 002410 | BP |
BP | 2008 | 3,911 | 002410 | BP |
BP | 2009 | 3,716 | 002410 | BP |
BP | 2010 | 002410 | BP | |
BP | 2011 | 002410 | BP |
Hi,
Using Ksharp's dataset, the output datasets from Ksharp's and my code are different. for example, Kmart 2003 is in my dataset p1, not in Ksharp's portfolio1. According to your rule (value over 20 and increased value) I believe it should be in p1:smileysilly:.
data have;
input Brand &$10. Year Interbrand_Value :commax. GVKEY Ticker_symbol$;
cards;
Coca Cola 2001 68,945 003144 KO
Coca Cola 2002 69,637 003144 KO
Coca Cola 2003 70,453 003144 KO
Coca Cola 2004 67,394 003144 KO
Coca Cola 2005 67,525 003144 KO
Coca Cola 2006 67,000 003144 KO
Coca Cola 2007 65,324 003144 KO
Coca Cola 2008 66,667 003144 KO
Coca Cola 2009 68,734 003144 KO
Coca Cola 2010 70,452 003144 KO
Coca Cola 2011 71,861 003144 KO
Coca bbbb 2001 18 003144 Ky
Coca bbbb 2002 2 003144 Ky
Coca bbbb 2003 15 003144 Ky
Coca bbbb 2004 21 003144 Ky
Coca bbbb 2005 16 003144 Ky
Coca bbbb 2006 17 003144 Ky
Coca bbbb 2007 1 003144 Ky
Coca bbbb 2008 20 003144 Ky
Coca bbbb 2009 21 003144 Ky
Coca bbbb 2010 16 003144 Ky
Coca bbbb 2011 18 003144 Ky
Kmart 2001 6,945 003144 KO
Kmart 2002 6,637 003144 KO
Kmart 2003 70,453 003144 KO
Kmart 2004 7,394 003144 KO
Kmart 2005 67,525 003144 KO
Kmart 2006 67,000 003144 KO
Kmart 2007 65,324 003144 KO
Kmart 2008 6,667 003144 KO
Kmart 2009 68,734 003144 KO
Kmart 2010 70,452 003144 KO
Kmart 2011 1,861 003144 KO
Kappa 2001 18 003144 Ky
Kappa 2002 2 003144 Ky
Kappa 2003 15 003144 Ky
Kappa 2004 2 003144 Ky
Kappa 2005 16 003144 Ky
Kappa 2006 17 003144 Ky
Kappa 2007 14 003144 Ky
Kappa 2008 2 003144 Ky
Kappa 2009 21 003144 Ky
Kappa 2010 1 003144 Ky
Kappa 2011 18 003144 Ky
;
run;
proc sort data=have;
by brand year;
data p1 p2 p3 p4;
set have;
by brand;
if not first.brand;
if Interbrand_Value>20 then do;
if Interbrand_Value>lag(Interbrand_Value) then output p1;
else output p2; end;
else do;
if Interbrand_Value>lag(Interbrand_Value) then output p3;
else output p4; end;
run;
You are right Linlin. The codes you gave me put the brands in the right portfolio. There is only a small problem left. The first year should not be in the portfolios, however for 2002 I have to know whether the interbrand value increased from 2001 to 2002. For 2002 there are some brands in the wrong portfolio...
Hi,
I did not see any records from 2001(first year) in my output datasets. Which brands are in the wrong output datasets(p1,p2,p3,p4)?
GE | 2001 | 42,396 | 005047 | GE |
GE | 2002 | 41,311 | 005047 | GE |
GE | 2003 | 42,340 | 005047 | GE |
GE | 2004 | 44,111 | 005047 | GE |
GE | 2005 | 46,996 | 005047 | GE |
GE | 2006 | 48,907 | 005047 | GE |
GE | 2007 | 51,569 | 005047 | GE |
GE | 2008 | 53,086 | 005047 | GE |
GE | 2009 | 47,777 | 005047 | GE |
GE | 2010 | 42,808 | 005047 | GE |
GE | 2011 | 42,808 | 005047 | GE |
This brand for example: GE is in 2002 in p1, but it should be in p2. I also do not have any records from 2001 in my output dataset, however this causes that there are some mistakes in 2002. Is it possible to use the value of 2001, but not include the records in the portfolios?
sorry. Try this one: This one does not work.
data have;
brand='GE';
input Year Interbrand_Value;
cards;
2001 42
2002 41
2003 42
2004 44
2005 46
2006 18
2007 10
2008 8
2009 7
2010 9
;
proc sort data=have;
by brand year;
data p1 p2 p3 p4;
set have;
by brand;
if first.brand then flag=1;
if Interbrand_Value>20 then do;
if Interbrand_Value>lag(Interbrand_Value) and flag ne 1 then output p1;
if Interbrand_Value<=lag(Interbrand_Value) and flag ne 1 then output p2;
end;
if Interbrand_Value<=10 then do;
if Interbrand_Value>lag(Interbrand_Value) and flag ne 1 then output p3;
if Interbrand_Value<=lag(Interbrand_Value) and flag ne 1 then output p4;
end;
run;
proc print data=p2;run;
Thank you for your help Linlin!! 🙂
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.