DATA Step, Macro, Functions and more

Collapse Dataset On Particular Character Variable

Accepted Solution Solved
Reply
Regular Contributor
Posts: 223
Accepted Solution

Collapse Dataset On Particular Character Variable

Please peruse following partial dataset:

 

i_X	N	Diff_2b	Ratio_2a	Slope_5050x_3	Slope_5050x_5
_20206_6	25	0.48	-1.41	1.82	0.21
_20206_6	75	0.82	2.40	-0.39	0.91
_20206_6	92	1.09	0.80	-1.45	0.19
_22101_0	14	0.28	0.23	1.09	-0.22
_22101_0	85	1.31	0.22	0.30	-0.51
_22101_0	49	-1.19	-0.40	0.23	-0.52
_22101_0	20	1.58	-0.03	-0.70	1.49
_22101_0	64	-0.73	-0.13	1.05	-0.55
_23002_6	97	1.22	1.64	0.40	1.28
_23002_6	24	-1.72	-0.12	-0.61	0.07
_23002_6	80	-0.09	0.25	2.65	-0.31
_20502_6	45	-0.02	-0.48	-0.35	0.20
_20502_6	71	-1.91	0.61	-0.86	0.31
_20502_6	13	1.43	-0.71	0.97	-0.43

I would like to collapse the dataset by the first column.

 

All the other numeric columns should be kept.  Values in those columns should be averages based on the character values being collapsed on.

 

Any suggestions greatly appreciated.

 

Nicholas Kormanik

 

 


Accepted Solutions
Solution
‎09-05-2016 07:54 PM
SAS Super FREQ
Posts: 8,866

Re: Collapse Dataset On Particular Character Variable

Hi:

  In addition to Reeza's suggestion for SQL, you could also do this with TABULATE or REPORT. I prefer REPORT in something like this because the structure of the output dataset will be in the structure specified in your COLUMN statement and depending on your usage for the i_X variable, you will either get collapsing (GROUP usage) or ordering (ORDER usage).

 

  Here's the PROC REPORT example. What you called N in your example, I called CNT because I wanted to use N in PROC REPORT to give me the number of rows.

 

cynthia

 

report_slope.png

View solution in original post


All Replies
Super Contributor
Posts: 266

Re: Collapse Dataset On Particular Character Variable

Posted in reply to NicholasKormanik

 

I am not sure if you looking for this. Otherwise please elaborate

proc sql;
create table want as
select  i_X	
,avg(N) 
,avg(	Diff_2b) 
,avg(Ratio_2a)
,avg(	Slope_5050x_3)
,avg(	Slope_5050x_5)
from want
group by i_X;
quit;


 

Super User
Posts: 19,815

Re: Collapse Dataset On Particular Character Variable

Posted in reply to NicholasKormanik
Solution
‎09-05-2016 07:54 PM
SAS Super FREQ
Posts: 8,866

Re: Collapse Dataset On Particular Character Variable

Hi:

  In addition to Reeza's suggestion for SQL, you could also do this with TABULATE or REPORT. I prefer REPORT in something like this because the structure of the output dataset will be in the structure specified in your COLUMN statement and depending on your usage for the i_X variable, you will either get collapsing (GROUP usage) or ordering (ORDER usage).

 

  Here's the PROC REPORT example. What you called N in your example, I called CNT because I wanted to use N in PROC REPORT to give me the number of rows.

 

cynthia

 

report_slope.png

Regular Contributor
Posts: 223

Re: Collapse Dataset On Particular Character Variable

Posted in reply to Cynthia_sas

Beautifully done and presented, Cynthia.  And further demonstration of the powers of Proc Report.

 

Thanks very much!

 

 

Regular Contributor
Posts: 223

Re: Collapse Dataset On Particular Character Variable

[ Edited ]
Posted in reply to Cynthia_sas

Cynthia, two follow-up questions, please:

 

1) Your code as provided produces two adjacent columns (columns 2 and 3), "numrows" and "cnt". I would like to see as column 4 the product of those two columns, c2*c3, called, perhaps, "rows_x_cols". What would be the additional code to produce that?

 

2) Most columns resulting contain way too many decimal places, making things a little hard to read. What would be the additional code to format various columns as having just three (3) decimal places?

 

[Edit:]  I see now, from other posts, that I could add the following toward the top:

 

format _numeric_ 9.3;

Adding this works to change the format of the report columns.  Super!  However, I would like to change the format of the exported dataset as well.  Should I do that as a subsequent data step?

 

Thanks very much!

 

 

Super User
Posts: 19,815

Re: Collapse Dataset On Particular Character Variable

Posted in reply to NicholasKormanik
SAS Super FREQ
Posts: 8,866

Re: Collapse Dataset On Particular Character Variable

Posted in reply to NicholasKormanik
HI:
Do I remember correctly that you might be sending your output from the REPORT step or the dataset to Excel? If so, I would not look for the SAS FORMAT statement to impact ODS TAGSETS.EXCELXP or ODS EXCEL or even ODS CSV -- Excel is very persnickety about whether it uses SAS formats -- for the 2 XML based Excel destinations you can use TAGATTR to control the Microsoft format to be used by Excel. Most other SAS destinations (HTML, RTF, PDF will respect SAS formats).

To change the report, you only need to use a format statement as you show OR, if you want different formats for different variables -- as, for example if you have dates and numbers, then put your format= or f= in the DEFINE statement.

If you want to "compute" a new report column based on the value of other columns on the report, such as you describe, then you would need to use a COMPUTE block. There are a lot of previous forum postings about the "left to right" rule of PROC REPORT and the special way you need refer to analysis variables. However, in my example, numrows is an alias and CNT is an analysis variable.

So if he COLUMN statement changed to something like this:
column grpvar n=numrows cnt newvar var3 var4 var5;

Then you would need:
define newvar / computed;

and then later in the PROC REPORT step:
compute newvar;
newvar = numrows * cnt.sum;
endcomp;

I suggest you look in the PROC REPORT documenation for more examples of calculating report columns using a COMPUTE block.

cynthia
Regular Contributor
Posts: 223

Re: Collapse Dataset On Particular Character Variable

Posted in reply to Cynthia_sas

Worked perfectly, Cynthia.

 

Thanks a million!

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 507 views
  • 6 likes
  • 4 in conversation