- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I hope you could help me
I have this type of data
date type value type 2 name
01-Jan-11 a1 10 1 ae
01-Feb-11 a1 10 1 ae
01-Mar-11 a1 10 2 ae
01-Apr-11 a1 10 2 ae
01-Jan-11 A2 5 1 at
01-Feb-11 A2 50 1 at
01-Mar-11 A2 5 2 at
01-Apr-11 A2 50 2 at
01-Jan-11 a3 20 1 au
01-Feb-11 a3 60 1 au
01-Mar-11 a3 20 2 au
01-Apr-11 a3 60 2 au
and I would like to have the cumulative values by type type2 by name.
What is the best method to do this?
Thank you for helping me out.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
What do you want your output to look like? If you just want the cumulative sum of value, as they accumulate, you can do that in a datastep, using retain or a sum in a form that automatically retains to keep the sum, and a by statement.
But, what you will need will depend upon what you want to end up with.
Does the following approximate what you want to do?:
data have;
informat date anydtdte9.;
format date date9.;
input date type $ value type2 name $;
cards;
01-Jan-11 a1 10 1 ae
01-Feb-11 a1 10 1 ae
01-Mar-11 a1 10 2 ae
01-Apr-11 a1 10 2 ae
01-Jan-11 A2 5 1 at
01-Feb-11 A2 50 1 at
01-Mar-11 A2 5 2 at
01-Apr-11 A2 50 2 at
01-Jan-11 a3 20 1 au
01-Feb-11 a3 60 1 au
01-Mar-11 a3 20 2 au
01-Apr-11 a3 60 2 au
;
proc sort data=have;
by type type2 name date;
run;
data want;
set have;
by type type2 name;
if first.name then cumulative=0;
cumulative+value;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
format date eurdfde9.;
input date eurdfde9. type $ value type2 name $;
cards;
01-Jan-11 a1 10 1 ae
01-Feb-11 a1 10 1 ae
01-Mar-11 a1 10 2 ae
01-Apr-11 a1 10 2 ae
01-Jan-11 A2 5 1 at
01-Feb-11 A2 50 1 at
01-Mar-11 A2 5 2 at
01-Apr-11 A2 50 2 at
01-Jan-11 a3 20 1 au
01-Feb-11 a3 60 1 au
01-Mar-11 a3 20 2 au
01-Apr-11 a3 60 2 au
;
run;
proc sql;
select type, type2, name, sum(value) as total_value
from have
group by type, type2, name;
quit;
type type2 name total_value
-----------------------------------------
A2 1 at 55
A2 2 at 55
a1 1 ae 20
a1 2 ae 20
a3 1 au 80
a3 2 au 80
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I tried this, but how do I manage when the data does not start in Jan but e.g. in Feb. I always would like to have Jan, Jan+Feb, Jan+Feb+Mar ,etc (even if there is a cero.
data work.cum;
set work.data;
by type2 code1 type date;
if first.type then cumwght=0;
cumval + value;
output;
run;
example data: data:
type 1 name1 value date type2
E 3 40 01-Jan-11 A1
E 3 10 01-Feb-11 A1
E 3 49 01-Mar-11 A1
E 3 8 01-Apr-11 A1
E 3 56 01-May-11 A1
E 3 33 01-Jun-11 A1
E 3 60 01-Jul-11 A1
E 3 700 01-Feb-11 a2
E 3 145 01-Jan-11 a2
E 3 190 01-Feb-11 a2
E 3 100 01-Mar-11 a2
E 3 80 01-Apr-11 a2
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Your data fields don't match your code and, since you didn't show the result that you want, we can only guess.
My guess is that you don't want to include date as a by variable but, rather, year, but with the data having been sorted by date.
That way, it would be irrelevant which month any of your data really begin with. And, why do you initialize cumwght, but then use cumval in your sum statement?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
proc format;
value myfmt (multilabel)
1 = 'Jan'
2 = 'Feb'
3 = 'Mar'
4 = 'Apr'
5 = 'May'
6 = 'Jun'
7 = 'Jul'
8 = 'Aug'
9 = 'Sep'
10 = 'Oct'
11 = 'Nov'
12 = 'Dec'
1 , 2 = 'Jan + Feb'
1 , 2 , 3 = 'Jan + Feb + Mar'
/* etc... */
;
run;
data have;
input type1 $ name1 value date eurdfde9. type2 $;
month=month(date);
year=year(date);
cards;
E 3 40 01-Jan-11 A1
E 3 10 01-Feb-11 A1
E 3 49 01-Mar-11 A1
E 3 8 01-Apr-11 A1
E 3 56 01-May-11 A1
E 3 33 01-Jun-11 A1
E 3 60 01-Jul-11 A1
E 3 700 01-Feb-11 a2
E 3 145 01-Jan-11 a2
E 3 190 01-Feb-11 a2
E 3 100 01-Mar-11 a2
E 3 80 01-Apr-11 a2
;
run;
proc means data=have sum nonobs nway noprint;
class year;
class month /mlf;
class type1 type2;
var value;
format month myfmt.;
output out=want sum=total_value;
run;
proc print data=want; run;
total_
Obs year month type1 type2 _TYPE_ _FREQ_ value
1 2011 Apr E A1 15 1 8
2 2011 Apr E a2 15 1 80
3 2011 Feb E A1 15 1 10
4 2011 Feb E a2 15 2 890
5 2011 Jan E A1 15 1 40
6 2011 Jan E a2 15 1 145
7 2011 Jan + Feb E A1 15 2 50
8 2011 Jan + Feb E a2 15 3 1035
9 2011 Jan + Feb + Mar E A1 15 3 99
10 2011 Jan + Feb + Mar E a2 15 4 1135
11 2011 Jul E A1 15 1 60
12 2011 Jun E A1 15 1 33
13 2011 Mar E A1 15 1 49
14 2011 Mar E a2 15 1 100
15 2011 May E A1 15 1 56
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
How about:
data have; input type1 $ name1 value date eurdfde9. type2 $; format date date9.; cards; E 3 40 01-Jan-11 A1 E 3 10 01-Feb-11 A1 E 3 49 01-Mar-11 A1 E 3 8 01-Apr-11 A1 E 3 56 01-May-11 A1 E 3 33 01-Jun-11 A1 E 3 60 01-Jul-11 A1 E 3 700 01-Feb-11 a2 E 3 145 01-Jan-11 a2 E 3 190 01-Feb-11 a2 E 3 100 01-Mar-11 a2 E 3 80 01-Apr-11 a2 ; run; proc sql noprint; create table temp as select * from (select distinct * from have(keep=type1 type2 name1) ) , (select distinct date from have) order by type1,type2,name1,date; quit; proc sort data=have; by type1 type2 name1 date; run; data want; merge have temp; by type1 type2 name1 date; output; call missing(value); run; data want(drop=value); set want; by type1 type2 name1 date; if first.name1 then sum_value=0; sum_value+value; if last.date then output; run;
Ksharp
Message was edited by: xia keshan
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you for your help. I tried this but get this response
NOTE: The execution of this query involves performing one or more Cartesian product joins
that can not be optimized.
NOTE: Table WORK.TEMP created, with 14 rows and 4 columns.
But in my real data, what I need is that if Jan11 there is data, and in Feb11 no data and again in Mar11 is data again.
Jan value is 10, feb no value, Mar value is 20.
What I would like to see is
Jan 10, Feb 10 and Mar 20
I know there is a solution, but I can't find it yet.
Thanks, I really appreciate this forum, it really helps people that are less skilled in SAS.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Did you look at the results of Ksharp's code? It appears to be doing what you have asked for.
The messageabout the cartesian product is NOT an error. He simply used the capability of proc sql to provide all possible combinations.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I looked at the ersults of Ksharp, but when I applied to my real data, the missing months were given a "0" instead of taking the previous month.
Let's say there was a value in Jan11, then in Mar11, nothing in Feb11. So what I would like to see is Feb11 taking the value of Jan11, instead of a cero.
Is this possible with SAS?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
That is precisely what his code does! I don't believe you ever answered my question from a few days ago. Your initial example code used different fields than your data actually had. Specifically, a field called code1.
If your codes match the ones in Ksharps program, I think that you'll find that the results are what you expect.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Can you post some more data, So I can check where the problem is.
Thanks Art.
Ksharp
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Please see an data example below:
and thank you for your time in helping me out. As you can see the first line is Jan11 and the second is Mar11. So I would like the data to appear:
Jan11 is 458, Feb11 is 458, Mar11 is 950 (458+492), etc. until the end of the code.
type 1 type 2 value date type 3
A 1 458 01-Jan-11 ba
A 1 492 01-Mar-11 ba
A 1 9 01-Apr-11 ba
A 1 1 01-May-11 ba
A 1 333 01-Jun-11 ba
A 1 1 01-Jul-11 ba
A 2 14 01-Jan-11 ba
A 2 20 01-Feb-11 ba
A 2 18 01-Mar-11 ba
A 2 13 01-Apr-11 ba
A 2 4605 01-May-11 ba
A 2 1599 01-Jun-11 ba
A 2 58 01-Jul-11 ba
A 3 7 01-Jan-11 ba
A 3 3 01-Feb-11 ba
A 3 4 01-Mar-11 ba
A 3 3 01-Apr-11 ba
A 3 4 01-May-11 ba
A 3 2 01-Jun-11 ba
B 1 5 01-Feb-11 ba
B 1 2 01-Mar-11 ba
B 1 7 01-Apr-11 ba
B 1 1 01-May-11 ba
B 1 4 01-Jul-11 ba
B 1 20 01-Mar-11 ba
B 2 11 01-Apr-11 ba
B 2 7 01-Jun-11 ba
B 2 70 01-May-11 ba
B 2 1 01-Jan-11 ba
B 2 1 01-Feb-11 ba
B 2 1 01-Mar-11 ba
B 2 1 01-Apr-11 ba
B 3 1 01-May-11 ba
B 3 1 01-Jun-11 ba
B 3 1 01-Jul-11 ba
B 3 1 01-Jan-11 ba
B 3 1 01-Apr-11 ba
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You change the variable name.
How about this:
data have; input type1 $ type2 value date date9. type3 $; format date date9.; cards; A 1 458 01-Jan-11 ba A 1 492 01-Mar-11 ba A 1 9 01-Apr-11 ba A 1 1 01-May-11 ba A 1 333 01-Jun-11 ba A 1 1 01-Jul-11 ba A 2 14 01-Jan-11 ba A 2 20 01-Feb-11 ba A 2 18 01-Mar-11 ba A 2 13 01-Apr-11 ba A 2 4605 01-May-11 ba A 2 1599 01-Jun-11 ba A 2 58 01-Jul-11 ba A 3 7 01-Jan-11 ba A 3 3 01-Feb-11 ba A 3 4 01-Mar-11 ba A 3 3 01-Apr-11 ba A 3 4 01-May-11 ba A 3 2 01-Jun-11 ba B 1 5 01-Feb-11 ba B 1 2 01-Mar-11 ba B 1 7 01-Apr-11 ba B 1 1 01-May-11 ba B 1 4 01-Jul-11 ba B 1 20 01-Mar-11 ba B 2 11 01-Apr-11 ba B 2 7 01-Jun-11 ba B 2 70 01-May-11 ba B 2 1 01-Jan-11 ba B 2 1 01-Feb-11 ba B 2 1 01-Mar-11 ba B 2 1 01-Apr-11 ba B 3 1 01-May-11 ba B 3 1 01-Jun-11 ba B 3 1 01-Jul-11 ba B 3 1 01-Jan-11 ba B 3 1 01-Apr-11 ba ; run; proc sql noprint; create table temp as select * from (select distinct * from have(keep=type1 type2 type3) ) , (select distinct date from have) order by type1,type2,type3,date; quit; proc sort data=have; by type1 type2 type3 date; run; data want; merge have temp; by type1 type2 type3 date; output; call missing(value); run; data want(drop=value); set want; by type1 type2 type3 date; if first.type3 then sum_value=0; sum_value+value; if last.date then output; run;
Ksharp
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Given that all of the type3s have the same value, can we assume that it is really either irrelevant or has the highest priority. The following revision of Ksharp's code produces the desired result:
proc sql noprint;
create table temp as
select *
from (select distinct * from have(keep=type3 type1 type2) ) ,
(select distinct date from have)
order by type3,type1,type2,date;
quit;
proc sort data=have;
by type3 type1 type2 date;
run;
data want;
merge have temp;
by type3 type1 type2 date;
output;
call missing(value);
run;
data want(drop=value);
set want;
by type3 type1 type2 date;
if first.type2 then sum_value=0;
sum_value+value;
if last.date then output;
run;