DATA Step, Macro, Functions and more

array help to transpose and summarise data

Reply
Contributor
Posts: 44

array help to transpose and summarise data

Hello All,

I am back on this community since yesterday. Thought all my problems disappeared after yesterday - So wrong! 

Cutting to the chase. I have a vertical dataset (as every SAS table) and currently trying to transform this into a horizontal format for reporting purposes. I have created an array statement to do this and then to add the values up to provide a total.

I present below my have and want files:


data have;
input SEGMENT $2. RATING $4. snapshot 4. list 4. Nobs 2. Nobs_missing 2. Total_Invalid 2.;
datalines;
SE 6B 2008 11 5 0  0
SE 7  2008 12 3 0  0
SE 8  2008 13 3 0  0
SE R  2008 14 1 0  0
SE 8  2009 13 1 0  0
SE R  2009 14 1 0 0
SE R1 2009  . 0 0 10
SE R2 2009  . 0 0 10
SE V  2010  . 0 0 2
SE    2013  . . 10 .
SE    2014  . . 10 .
TL 1  2011  1 2 0  0
TL 2  2012  2 4 0  0
TL 3  2013  3 6 0  0
TL    2014  . . 10  0
;
run;


data want1;
infile datalines delimiter=",";
input SEGMENT $3. snapshot $4. RG_1 RG_2 RG_3 RG_4A RG_4B RG_4C RG_5A RG_5B RG_5C RG_6A RG_6B RG_7 RG_8 RG_N RG_R Total_Nobs Total_Nobs_missing Total_Invalid ;
datalines;
SE,2008,.,.,.,.,.,.,.,.,.,.,5,3,3,.,1,12,5,0
SE,2009,.,.,.,.,.,.,.,.,.,.,.,.,1,.,1,2,0,20
SE,2010,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,2
SE,2013,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,10,.,.
SE,2014,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,10,.,.
TL,2011,2,.,.,.,.,.,.,.,.,.,.,.,.,.,.,2,0,0
TL,2012,.,4,.,.,.,.,.,.,.,.,.,.,.,.,.,4,0,0
TL,2013,.,.,6,.,.,.,.,.,.,.,.,.,.,.,.,6,.,0
TL,2014,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,10,.
;
run;
data want2;
infile datalines delimiter=",";
input snapshot $4. RG_1 RG_2 RG_3 RG_4A RG_4B RG_4C RG_5A RG_5B RG_5C RG_6A RG_6B RG_7 RG_8 RG_N RG_R Total_Nobs Total_Nobs_missing Total_Invalid ;
datalines;
2008,.,.,.,.,.,.,.,.,.,.,5,3,3,.,1,12,5,0
2009,.,.,.,.,.,.,.,.,.,.,.,.,1,.,1,2,0,20
2010,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,2
2013,.,.,6,.,.,.,.,.,.,.,.,.,.,.,.,6,10,.,.
2014,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,20,.,.
2011,2,.,.,.,.,.,.,.,.,.,.,.,.,.,.,2,0,0
2012,.,4,.,.,.,.,.,.,.,.,.,.,.,.,.,4,0,0
;
run;

In input table "have" I would like that the data is transposed using the following code:

 

proc transpose data=have out=tmp  prefix=RG_ name=obs_cnt_var;
by segment snapshot;
id rating;
var Nobs Nobs_missing;
run;

I would like that dataset "want1" presents the "nobs" as variable "total_nobs" where variable "list" is not equal to Null. If variable "rating" eq "" then i should get variable "Total_Nobs_missing" grouped by segment and snapshot. If variable "rating" eq "" and variable "list" eq null then variable "total_invalid" should have all observations added up grouped by segment and snapshot. This is for table "want1".

The same concept applies to table "want2" with the exception that variable "segment" is missing from want2. If someone could post a solution for Want1, i could reverse engineer the solution and do it for want2. or vice versa.

I post below my sick array code:

data tmp_all;
set have end=last;
array allgrades[15] RG_1 RG_2 RG_3 RG_4A RG_4B RG_4C RG_5A RG_5B RG_5C RG_6A RG_6B RG_7 RG_8 RG_N RG_R;
by Segment snapshot;
 if list ne . then do;
  allgrades[list]=Nobs;
 end;
if list eq . then do;
invalid_values=Total_Invalid;
end;
retain RG_1 RG_2 RG_3 RG_4A RG_4B RG_4C RG_5A RG_5B RG_5C RG_6A RG_6B RG_7 RG_8 RG_N RG_R;
/*output;*/
drop Rating list nobs Nobs_missing Total_Invalid;
/*if last.segment_pillar_iii then output;*/
las1=last.segment;
las2=last.snapshot;
run;

I used array as it is more flexible.

 

please do let me know.

regards,

S

Super User
Posts: 10,466

Re: array help to transpose and summarise data

You might describe the "report" layout needed.

I suspect either Proc Tabulate or Proc Report will generate the desired report with much less fuss.

Contributor
Posts: 44

Re: array help to transpose and summarise data

I need to take this data merge it with another dataset, do some iterations and then I could use an ODS output feature to export it to html form. Hence, possibly proc tabulate may not address my issue. I will look at it though, as i am not familiar with this procedure.
Contributor
Posts: 44

Re: array help to transpose and summarise data

Hello Balladw,

THank you for your swift response.

So i tried proc report - seems really good.

Except 2 immediate issues;

1) i need the table to list out the variables for rating to list out like under "want1". How would i do that? The output table "tmp" has very difference variable naming. Ideally a rename function would work, but in this instance, it will not be possible as the list "RATING" is never a fixed length.

2) the number of observations mentioned under ("1","2","3","4a"), etc has a number 1 written. I would like that to display the correct figures for the relevant group. Would you know how to do that? I got no experience with proc report and just knocked something up after reading a few bits online.

proc report data=work.have out=tmp;
columns segment snapshot rating nobs,sum Nobs_missing,sum Total_Invalid,sum;
define segment/group;
define snapshot/group ;
define rating/across missing;
define nobs/sum;
define nobs_missing/sum;
define total_invalid/sum;
run;

 I still prefer the array method because although the output is unpolished and raw, I could do further transformations with other datasets and then use proc print to display output.

Super User
Posts: 10,466

Re: array help to transpose and summarise data

I think you need to explain what exactly all those RG values actually are in relation to the starting data.

It appears that you are summing across rows within year for some of the variables but then including the individual values for some variables within that year. How do you expect to get variable names like RG_1 and RG_4B or RG_N from your current data? If you are using the "rating" variable then what would the rules be for Missing or blank rating.

 

You say: ) i need the table to list out the variables for rating to list out like under "want1". How would i do that? The output table "tmp" has very difference variable naming. Ideally a rename function would work, but in this instance, it will not be possible as the list "RATING" is never a fixed length.

 

What is in table tmp? What is the relation to the "rating" variable? Are you thinking of using the rating variable as part of variable names? If it varies enough, such that the combined length of the base variable name plus the added text cannot exceed 32 characters.

Contributor
Posts: 44

Re: array help to transpose and summarise data

Hello Ballardw,
The RG values are picked up from variable "Rating" of table "want". There are sometimes 10 different "Rating" values and at other times 15 different "Rating" values. This "rating" list is set and the size (10/15) is something i dictate prior to execution of Array/ proc report. Hence, i defined the array with "RG_1 RG_2 RG_3 .... RG_R".
Table " tmp" is the output generated from the "proc report" statement, which you suggested i use instead of array. THe variables would not be greater than 32 characters. It would be really a pain to type "32 character" variables in the code Smiley Happy
Many thanks.
S
Super User
Posts: 5,254

Re: array help to transpose and summarise data

And what are those reporting purposes?
Data never sleeps
Contributor
Posts: 44

Re: array help to transpose and summarise data

I need to take this data merge it with another dataset, do some iterations and then I could use an ODS output feature to export it to html form.
Ask a Question
Discussion stats
  • 7 replies
  • 465 views
  • 0 likes
  • 3 in conversation