BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hi All,

I have data set like this,

Name Period Ret
A 20010131 0.1
..........................................
A 20011231 0.12
A 20020231 0.22
..........................................
A 20021231 -0.33
B 20010131 0.12
..........................................
B 20011231 0.15
B 20020231 0.44
..........................................
B 20021231 -0.06

As you can see,
they are grouped by name and within each name group, data is ordered by date.
I want the sumproduct of (1+ret) with respect to each year. For example, for A, there are two year-period data, 2001 and 2002 (the real dataset might contain more years). Here, I assume in each year, there are only two monthly data(the real dataset might contain at most 12 month periods).

I also need one cell denoting the number of periods with respect to each year.

the output will look like,

Name period ret NO.
A 2001 0.012 2
A 2002 -0.0726 2
B 2001 0.018 2
B 2002 -0.0264 2

0.012 comes from 0.1*0.12; -0.0726 comes from 0.22*(-0.33). the same principle applies to B. NO. means 2 periods/numbers are included in calculation.

Thanks a lot!
14 REPLIES 14
SPR
Quartz | Level 8 SPR
Quartz | Level 8
Hello Jie.su2134,

This is asolution:
[pre]data i;
input Name $ Period Ret;
datalines;
A 20010131 0.1
A 20011231 0.12
A 20020231 0.22
A 20021231 -0.33
B 20010131 0.12
B 20011231 0.15
B 20020231 0.44
B 20021231 -0.06
run;
proc SQL;
create table t as
select name, INT(period/10000) as Year, Ret from i
order by Name, Year
;quit;
data r (drop=ret rename=(r=Ret Year=Period));
retain Name Year r No;
set t;
if FIRST.Year then do; No=0; r=1; end;
No+1;
r=r*ret;
if LAST.year then output;
by Name Year;
run;
[/pre]
Sincerely,
SPR
deleted_user
Not applicable
Hi SPR,

I tried your codes in sas and they work, but my DATE varible is character type, which cannot be divided by 10000? as Sbb said. I might try to pick "year" part" and convert that into numberic and see whether that works.

Thanks,
Cynthia_sas
SAS Super FREQ
Hi:
You do not necessarily need to divide your date value to get the year. You could also turn your date value from a character string into a SAS date value using the INPUT function. Then, once you had a SAS date value, the YEAR function would extract the year from the date.

The code below reads a variable called CHARDATE into a character string. For the first row, the string '20010131' would get turned into the numeric value 15006 (the number of days from Jan 1, 1960 until Jan 31, 2001) and then the year 2001 (a numeric value) would be extracted from the date value with the YEAR function. Alternately, you could extract the first 4 characters of the CHARDATE field into a character string, with the substr function. It really depends on whether you need the year value to be character or numeric.

The downside of creating a numeric date value is that an invalid date, like 02/31/2002 would become a missing value, while the SUBSTR function could still extract the year portion of the invalid date string.

cynthia
[pre]
data charyr;
infile datalines;
input name $ chardate $ ret;
makeNumDt = input(chardate,anydtdte.);
year = year(makeNumDt);
charyr = substr(chardate,1,4);
put chardate= makeNumDt= yymmdd10. year= charyr=;
return;
datalines;
A 20010131 0.1
A 20011231 0.12
A 20020231 0.22
A 20021231 -0.33
B 20010131 0.12
B 20011231 0.15
B 20020231 0.44
B 20021231 -0.06
;
run;
[/pre]
deleted_user
Not applicable
Hi Cynthia,

Thanks! 02312001 is a typo. Stupid me!!! I'll try your codes.
SPR
Quartz | Level 8 SPR
Quartz | Level 8
Hello Jie.su2134,

It is very easy to convert character to numberic.

Instead of INT(period/10000)

Use INT(INPUT(period,8.)/10000

in my code.

Sincerely,
SPR
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Why would the OP want to manipulate a "date" string in character format, simply by dividing it, without any validation along the way?

The power of SAS provides the data-accuracy and data-quality validation, which can be achieved using the INPUT function and the appropriate SAS INFORMAT. And, so, an invalid date will not be allowed to pass into the system.

Then, also, additional SAS functions YEAR, INTNX, etc. can be exploited to derive additional useful data-variables, which can help the user improve their data analysis processing, as has been explained in this post-thread.

Scott Barry
SBBWorks, Inc.
SPR
Quartz | Level 8 SPR
Quartz | Level 8
Hello Scott,

I understood your concern and got your point. This is a corrected solution:
[pre]
data i;
input Name $ Period anydtdte. Ret;
format period date7.;
datalines;
A 20010131 0.1
A 20011231 0.12
A 20020228 0.22
A 20021231 -0.33
B 20010131 0.12
B 20011231 0.15
B 20020228 0.44
B 20021231 -0.06
run;
proc SQL;
create table t as
select name, YEAR(period) as Year, Ret from i order by Name, Year;
quit;
data r (drop=ret rename=(r=Ret Year=Period));
retain Name Year r No;
set t;
if FIRST.Year then
do;
No=0;
r=1;
end;
No+1;
r=r*ret;
if LAST.year then
output;
by Name Year;
run;
[/pre]
Sincerely,
SPR
Ksharp
Super User
Hi. Your code is wrong.
>No+1;

That mean no will add one for each observation in the same year.But there will be some duplicated months in the same year which op maybe not want.

Ksharp
SPR
Quartz | Level 8 SPR
Quartz | Level 8
Hello Ksharp,

It looks like you are wrong. Initial post says that "(the real dataset might contain at most 12 month periods)."

Sincerely,
SPR
darrylovia
Quartz | Level 8
I don't have the prettiest solution but it gets the job done without too much high level coding. One note your dates include impossiblities like 31-Feb which doesn't exist. See the below code:

By ordering the data and using SAS's first./last. and the retain statement you can build rolling statistis within groups.


proc sql;
create table one
(name char(1)
,period char(8)
,ret num);

insert into one
values('A', '20010131', 0.1)
values('A', '20011231', 0.12)
values('A', '20020228', 0.22)
values('A', '20021231', -0.33)
values('B', '20010131', 0.12)
values('B', '20011231', 0.15)
values('B', '20020228', 0.44)
values('B', '20021231', -0.06)
;

create table two as
select name, put(input(period,yymmdd10.),year4.) as year ,input(period,yymmdd10.) as period format=yymmdd10., ret
from one
order by 1,2;
quit;



data three;
set two;
by name year;
retain sum_product NO 0;

if first.year then do;
sum_product=1;
NO=0;
end;
NO+1;
sum_product=sum_product*ret;

if last.year;

run;
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Use INTNX function to derive a period-start date from which you can then count each period and roll-up observations to another new variable that has YEAR (using YEARPART or YEAR function). Of course, it's unclear if you are working with SAS numeric DATE type variables or numeric/character data in format "yyyymmdd". If not SAS DATE variables, I would encourage you to get those data-values converted to SAS DATE type variables, so you can manipulate/reference then to get month-start and year-start derivations.

Scott Barry
SBBWorks, Inc.
Ksharp
Super User
Hi.From your user name ,It looks like Chinese guy?
Assuming Ret has no missing value.


[pre]
data temp;
input name $ period : yymmdd10. ret ;
format period yymmdd8.;
datalines;
A 20010131 0.1
A 20011231 0.12
A 20020228 0.22
A 20021231 -0.33
B 20010131 0.12
B 20011231 0.15
B 20020228 0.44
B 20021231 -0.06
;
run;
proc sort data=temp;
by name period;
run;
data temp;
set temp;
_period=year(period);
_month=month(period);
run;
proc sort data=temp;
by name _period _month;
run;
data result(drop=period ret _month);
set temp;
by name _period _month;
retain _ret;
if first._period then do;
_ret=1;
no=0;
end;
_ret=_ret*ret;
if first._month then no+1;
if last._period then output;
run;

[/pre]


Ksharp
deleted_user
Not applicable
Hi Ksharp,

I will try your codes later. Thanks for your reply in my "fill age by condition" thread. The codes work!!! I just started to learn SAS. This Forum is great!

Really appreciate your help.
DF
Fluorite | Level 6 DF
Fluorite | Level 6
Some googling found me the below rather interesting method of doing this in SQL. I've adapted it a little to work in SAS (it was on EE for an SQL Server question):

data i;
format name $10.;
format Period date9.;
format Ret 8.;
input Name $ Period:yymmdd8. Ret;
datalines;
A 20010131 0.1
A 20011231 0.12
A 20020228 0.22
A 20021231 -0.33
B 20010131 0.12
B 20011231 0.15
B 20020228 0.44
B 20021231 -0.06
C 20010131 0
C 20010228 0.15
C 20010328 -0.44
C 20010430 -0.06
D 20010228 -0.5
D 20010328 -0.6
D 20010430 -0.7
E 20010228 0.5
E 20010328 -0.6
E 20010430 -0.7

;
run;

proc sql;
create table test as
select
name,
year(Period) as Year,
case
when (sum(sign(ret)=0) ~= 0) then 0 /*zero found*/
when (mod(sum(sign(ret)=-1),2) = 1) then -1 /*odd*/
else 1 /*otherwise pos*/
end * EXP(SUM(LOG(abs(Ret)))) as Ret,
count(*) as count
from i
group by
name,
calculated year;
quit;

It's a bit of an ugly solution, and I couldn't be sure of its efficiency, but it's interesting to see it in a single step.

The use of log requires positive numbers, hence the use of ABS(). This also meant a secondary method was required to determine the resulting signage of the multiplication and to deal with the circumstance of a 0 appearing in the source data.

I've tried to add a few additional examples to show these different scenarios, but it might be worth doing some further testing too.

EE source was http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/Q_10226918.html.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 2811 views
  • 0 likes
  • 7 in conversation