Hello All,
I am back on this community since yesterday. Thought all my problems disappeared after yesterday - So wrong!
Cutting to the chase. I have a vertical dataset (as every SAS table) and currently trying to transform this into a horizontal format for reporting purposes. I have created an array statement to do this and then to add the values up to provide a total.
I present below my have and want files:
data have;
input SEGMENT $2. RATING $4. snapshot 4. list 4. Nobs 2. Nobs_missing 2. Total_Invalid 2.;
datalines;
SE 6B 2008 11 5 0 0
SE 7 2008 12 3 0 0
SE 8 2008 13 3 0 0
SE R 2008 14 1 0 0
SE 8 2009 13 1 0 0
SE R 2009 14 1 0 0
SE R1 2009 . 0 0 10
SE R2 2009 . 0 0 10
SE V 2010 . 0 0 2
SE 2013 . . 10 .
SE 2014 . . 10 .
TL 1 2011 1 2 0 0
TL 2 2012 2 4 0 0
TL 3 2013 3 6 0 0
TL 2014 . . 10 0
;
run;
data want1;
infile datalines delimiter=",";
input SEGMENT $3. snapshot $4. RG_1 RG_2 RG_3 RG_4A RG_4B RG_4C RG_5A RG_5B RG_5C RG_6A RG_6B RG_7 RG_8 RG_N RG_R Total_Nobs Total_Nobs_missing Total_Invalid ;
datalines;
SE,2008,.,.,.,.,.,.,.,.,.,.,5,3,3,.,1,12,5,0
SE,2009,.,.,.,.,.,.,.,.,.,.,.,.,1,.,1,2,0,20
SE,2010,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,2
SE,2013,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,10,.,.
SE,2014,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,10,.,.
TL,2011,2,.,.,.,.,.,.,.,.,.,.,.,.,.,.,2,0,0
TL,2012,.,4,.,.,.,.,.,.,.,.,.,.,.,.,.,4,0,0
TL,2013,.,.,6,.,.,.,.,.,.,.,.,.,.,.,.,6,.,0
TL,2014,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,10,.
;
run;
data want2;
infile datalines delimiter=",";
input snapshot $4. RG_1 RG_2 RG_3 RG_4A RG_4B RG_4C RG_5A RG_5B RG_5C RG_6A RG_6B RG_7 RG_8 RG_N RG_R Total_Nobs Total_Nobs_missing Total_Invalid ;
datalines;
2008,.,.,.,.,.,.,.,.,.,.,5,3,3,.,1,12,5,0
2009,.,.,.,.,.,.,.,.,.,.,.,.,1,.,1,2,0,20
2010,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,2
2013,.,.,6,.,.,.,.,.,.,.,.,.,.,.,.,6,10,.,.
2014,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,20,.,.
2011,2,.,.,.,.,.,.,.,.,.,.,.,.,.,.,2,0,0
2012,.,4,.,.,.,.,.,.,.,.,.,.,.,.,.,4,0,0
;
run;
In input table "have" I would like that the data is transposed using the following code:
proc transpose data=have out=tmp prefix=RG_ name=obs_cnt_var;
by segment snapshot;
id rating;
var Nobs Nobs_missing;
run;
I would like that dataset "want1" presents the "nobs" as variable "total_nobs" where variable "list" is not equal to Null. If variable "rating" eq "" then i should get variable "Total_Nobs_missing" grouped by segment and snapshot. If variable "rating" eq "" and variable "list" eq null then variable "total_invalid" should have all observations added up grouped by segment and snapshot. This is for table "want1".
The same concept applies to table "want2" with the exception that variable "segment" is missing from want2. If someone could post a solution for Want1, i could reverse engineer the solution and do it for want2. or vice versa.
I post below my sick array code:
data tmp_all;
set have end=last;
array allgrades[15] RG_1 RG_2 RG_3 RG_4A RG_4B RG_4C RG_5A RG_5B RG_5C RG_6A RG_6B RG_7 RG_8 RG_N RG_R;
by Segment snapshot;
if list ne . then do;
allgrades[list]=Nobs;
end;
if list eq . then do;
invalid_values=Total_Invalid;
end;
retain RG_1 RG_2 RG_3 RG_4A RG_4B RG_4C RG_5A RG_5B RG_5C RG_6A RG_6B RG_7 RG_8 RG_N RG_R;
/*output;*/
drop Rating list nobs Nobs_missing Total_Invalid;
/*if last.segment_pillar_iii then output;*/
las1=last.segment;
las2=last.snapshot;
run;
I used array as it is more flexible.
please do let me know.
regards,
S
You might describe the "report" layout needed.
I suspect either Proc Tabulate or Proc Report will generate the desired report with much less fuss.
Hello Balladw,
THank you for your swift response.
So i tried proc report - seems really good.
Except 2 immediate issues;
1) i need the table to list out the variables for rating to list out like under "want1". How would i do that? The output table "tmp" has very difference variable naming. Ideally a rename function would work, but in this instance, it will not be possible as the list "RATING" is never a fixed length.
2) the number of observations mentioned under ("1","2","3","4a"), etc has a number 1 written. I would like that to display the correct figures for the relevant group. Would you know how to do that? I got no experience with proc report and just knocked something up after reading a few bits online.
proc report data=work.have out=tmp;
columns segment snapshot rating nobs,sum Nobs_missing,sum Total_Invalid,sum;
define segment/group;
define snapshot/group ;
define rating/across missing;
define nobs/sum;
define nobs_missing/sum;
define total_invalid/sum;
run;
I still prefer the array method because although the output is unpolished and raw, I could do further transformations with other datasets and then use proc print to display output.
I think you need to explain what exactly all those RG values actually are in relation to the starting data.
It appears that you are summing across rows within year for some of the variables but then including the individual values for some variables within that year. How do you expect to get variable names like RG_1 and RG_4B or RG_N from your current data? If you are using the "rating" variable then what would the rules be for Missing or blank rating.
You say: ) i need the table to list out the variables for rating to list out like under "want1". How would i do that? The output table "tmp" has very difference variable naming. Ideally a rename function would work, but in this instance, it will not be possible as the list "RATING" is never a fixed length.
What is in table tmp? What is the relation to the "rating" variable? Are you thinking of using the rating variable as part of variable names? If it varies enough, such that the combined length of the base variable name plus the added text cannot exceed 32 characters.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.