BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
HelpPlease
Calcite | Level 5

Good day.I am trying to create a summary of my data by using arrays. the objective is to sum my cost and budget variable for certain dates for every ID.My data looks like the following:

ID DATE              COST    BUDGET 
01   01JAN02      50           900
01   01JAN02      20           800 
01   01APR02     60           500
01   01JUL02      80           606
01   01JUL02      30           900 
01   01JUL02      20           652 
01   01OCT02    50            879 
02   01JAN02     30            997 
02   01JAN02     80           1000 ect 

So I want to use an array to get the sum of the cost for each date and the sum of the budget for each date as a variable. I want one entry per ID.

eg. ID COST_01JAN02 COST_01APR02 COST_01JUN02 COST_01OCT02 BUDGET_01JAN02.........

This is the code i tried but it doesn't seem to give the desired result, any ideas on where I'm going wrong? 

data one; 
set original; 
by ID ; 
array now COST_01JAN02 COST_01APR02 COST_01JUL02 
                 COST_01OCT02 BUDGET_01JAN02 BUDGET_01APR02 
                 BUDGET_01JUL02 BUDGET_01OCT02; 
if first.ID then DO OVER now;
now=.;
end; 
if DATE='01JAN02'd then do ; 
COST_01JAN02=sum(of COST); BUDGET_01JAN02=sum(of BUDGET); end;
if DATE='01APR02'd then do ; 
COST_01APR02=sum(of COST); BUDGET_01APR02=sum(of BUDGET); end; 
if DATE='01JUL02'd then do ; 
COST_01JUL02=sum(of COST); BUDGET_01JUL02=sum(of BUDGET); end;
if DATE='01OCT02'd then do ; 
COST_01OCT02=sum(of COST); BUDGET_01OCT02=sum(of BUDGET); end; 
if last.B_CODE then output; 
Keep ID COST_01JAN02 COST_01APR02 COST_01JUL02 
         COST_01OCT02 BUDGET_01JAN02 BUDGET_01APR02 
          BUDGET_01JUL02 BUDGET_01OCT02; 
run;

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

The question is asked to test your knowledge.  Its not a good way of working in any sense.  You could for instance;

proc means data=have;
  by id date;
  var cost;
  output out=cost sum=sum;
run;
proc means data=have;
  by id date;
  var budget;
  output out=budget sum=sum;
run;
proc transpose data=cost out=cost_t prefix=cost;
  by id;
  var sum;
  id date;
run;
proc transpose data=budget out=budget_t prefix=budget;
  by id;
  var sum;
  id date;
run;

data want;
  merge cost_t budget_t;
  by id;
run;

Assumes sorted by the way.  This utilises transposing.  You can do it via array with a bit more work:

http://support.sas.com/resources/papers/proceedings13/082-2013.pdf

Its called reshaping data.  Personally the only time I would ever keep data in a transposed format (i.e. data going across the page) is for an output file.  All programming work is many times simpler and more robust when data goes down the page (normalised).

View solution in original post

10 REPLIES 10
andreas_lds
Jade | Level 19

Your data is in a perfect format for using proc summary and i recommend keeping the format, because wide formats, especially when variable names include date information, will cause headaches sooner or later, sooner in 12 of 13 cases.

HelpPlease
Calcite | Level 5
so you suggest I do a proc summary? my assignment is to use arrays to get my desired results? any suggestions on how I can do this. Thank you for your reply
RW9
Diamond | Level 26 RW9
Diamond | Level 26

The question is asked to test your knowledge.  Its not a good way of working in any sense.  You could for instance;

proc means data=have;
  by id date;
  var cost;
  output out=cost sum=sum;
run;
proc means data=have;
  by id date;
  var budget;
  output out=budget sum=sum;
run;
proc transpose data=cost out=cost_t prefix=cost;
  by id;
  var sum;
  id date;
run;
proc transpose data=budget out=budget_t prefix=budget;
  by id;
  var sum;
  id date;
run;

data want;
  merge cost_t budget_t;
  by id;
run;

Assumes sorted by the way.  This utilises transposing.  You can do it via array with a bit more work:

http://support.sas.com/resources/papers/proceedings13/082-2013.pdf

Its called reshaping data.  Personally the only time I would ever keep data in a transposed format (i.e. data going across the page) is for an output file.  All programming work is many times simpler and more robust when data goes down the page (normalised).

Oligolas
Barite | Level 11

I think arrays should start a #metoo campaign if you plan to really use this for analysis.

 

For pure didactical purpose you could do this:

data have;
input ID DATE $ COST BUDGET;
datalines;
01 01JAN02 50 900
01 01JAN02 20 800 
01 01APR02 60 500
01 01JUL02 80 606
01 01JUL02 30 900 
01 01JUL02 20 652 
01 01OCT02 50 879 
02 01JAN02 30 997 
02 01JAN02 80 1000
;
run;

proc sort data=have (drop=id);
   by date;
run;
proc transpose data=have out=h1 prefix=cost;
by date;
id cost;
run;
proc transpose data=have out=h2 prefix=budget;
by date;
id BUDGET;
run;

data h3;
   length sum_cost sum_budget 8;
   merge h1 h2 end=last;
   by date;
   sum_cost=.;
   array costs cost:;
   array budgets budget:;
   dim_costs=dim(costs);
   dim_budgets=dim(budgets);
   if _N_ eq 1 then do;
      call execute('data want; set h3;'); 
      do i=1 to dim_costs;
         if i eq 1 then call execute('sum_cost=sum(');
         call execute(vname(costs(i)));
         if i ne dim_costs then call execute(',');
         else call execute(');');
      end;

      do j=1 to dim_budgets;
         if j eq 1 then call execute('sum_budget=sum(');
         call execute(vname(budgets(j)));
         if j ne dim_budgets then call execute(',');
         else call execute(');run;');
      end;
   end;  
run;
________________________

- Cheers -

Reeza
Super User

@Oligolas Not funny.

 

I think arrays should start a #metoo campaign if you plan to really use this for analysis.

HelpPlease
Calcite | Level 5
haha thanks anyway!!
HelpPlease
Calcite | Level 5
Thank you!
Reeza
Super User

@RW9 wrote:

  All programming work is many times simpler and more robust when data goes down the page (normalised).


@RW9 R has conceptualized this under the concept of 'tidy data'. He wrote a good paper on it, attached.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Nice paper.  Have seen a slight movement in this over the years.  Notably in the pharma industry there is the CDISC guidelines which structures data in a normalised way, paramter/response type setup.  Most databases also do it this.  Unfortunately the prevalence of the Excel virus (much like the other MS wares like IE) in all walks of computing means that way of thinking is inbuilt at a young age from schools and universities through into mainstream activities.  Take almost any major incident and likely there will be an Excel file somewhere in the background as the root of it all, sitting there all unvalidated, mashed together with pretty colors perfect for some suited management type to pivot.  

ballardw
Super User

@HelpPlease wrote:
so you suggest I do a proc summary? my assignment is to use arrays to get my desired results? any suggestions on how I can do this. Thank you for your reply

Then post ALL of the conditions and limitations imposed by instructions. We cannot tell from your initial post that you have something external, like "homework rules", that limit solutions.

 

And if you have an instructor that is forcing output to look like the junk in Excel, examine everything else in this class very carefully.

 

If I understand the output look you want you do not even need to do any manipulation of the data to get a person-readable report in that format.

proc tabulate data=have;
   class id;
   class date;
   format date date7.;
   var cost budget;
   tables id,
          date*(cost budget)*sum
   ;
run;

for instance.

 

Almost everything that creates the "wide" totals you are requested require additional work for most other steps.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 2210 views
  • 3 likes
  • 6 in conversation