BookmarkSubscribeRSS Feed
Pabster
Obsidian | Level 7

Hi all

I'm currently reading about the proc tabulate. I need to be able to obtain a table as follow (see picture) 57710711_598732707285621_7190044470644047872_n.jpg

Fromt left to right column are :
the Ratios (X1 to X5)
Statistic (Mean , Standard Dev, Q1, Median, Q3, Min , MAX
Status of the company (1 for Bankrupt, 0 for non-bankrupt)

On the row (top to bottom):
Years prior the bankruptcy(1 to 5)
D (which actually is going to be called STATUS)

I hope it`s clear, if needed I can upload through We-transfer the data file where my data will be selected from

5 REPLIES 5
ballardw
Super User

For a number of reasons it is preferable to upload example data as data step. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

 

We likely don't need all of the records but enough to demonstrate the data and have enough of the variables to test code. The data step gives us the variable types and names as well as values while avoiding certain problems with cross operating system behaviors. 

 

One thing to be very clear of when using proc tabulate: if any of the class (categorical row/column header) variables has missing values the entire record is not used for the table unless you use the option missing for the variable(s). If you use missing then MISSING becomes a display category by default.

 

Without all of the actual variable names and such I am guessing that a start would look something like:

proc tabulate data=have;
   var x1 - x5;
   class years bankrupt d  ;
   table (years d) *(mean stddev Q1 median Q3 min max),
         (x1 - x5)*bankrupt 
         /misstext=' '
   ;
run;

 

Pabster
Obsidian | Level 7

Hi  @ballardw ,

 

Here is a sample of my data:

For the moment I have yet not used a proper code to be able to see the data as I'have drawn it. Or else I would have upload it.

The sample is only for one specific year ( H=1, so 1 year prior bankruptcy). I have to roll my program 5 times to obtain the 5 data sets and then I would merge them and use for the final set to obtain what I showed. But I first need to understand how I can arrange the data is the proper order and then I suppose the merge if not going to be to hard.

 

ballardw
Super User

What variable represents "years before bankruptcy"? or D (that you said would be called status just to confuse things with a variable already named Status in your data set).

 

To use Proc Tabulate, I would expect records to have :

a single variable that holds the values for "years before bankruptcy" having values of 1 to 5; a variable containing bankruptcy status (looks like either your current STATUS variable or COSTAT might hold that), your other D or Status or whatever variable; the 5 numeric variables x1 to x5 that you want summarized.

Your data is about half way there, using your example:

proc tabulate data=sample;
  class status fyear costat;
  var x1-x5;
  table (fyear costat)*(mean stddev q1 median q3 min max) ,
        (x1-x5)*status
        /misstext=' '
   ;
run;

Since merge in SAS terms usually means to combine making data wider I would NOT merge data. I would SET your 5 data sets together so that your H (not really clearly defined) has a value of 1, 2 , 3, 4 or 5.

Create the data set for the tabulate code using something like:

data want;
   set year1
         year2
         year3
         year4
         year5
   ;
run;

You can add the "year" variable in the data step that combines the data and use it instead of the FYEAR variable in the example code above.

Pabster
Obsidian | Level 7

Hi @ballardw ,

Sorry for the wait. I went back to properly arrange some things in my data here is how I arranged it:
-Variable H is the horizon (so how many years prior bankruptcy I am analyzing, example h=1 then one year prior h=2its two years...)
-Variabel Costat is the status of the company (if compustat classified it as Bankrupt =1 or Non-Bankrupt =0). The variable called Status if for further analyse and is not relevant here

-The D is useless , the system inputs :"Active/Inactive Status marker" and that' s enough

Here is my present code with proc tabulate (but I cant manage to arrange it as I showed in the 1st picture, although its going somewhere):

proc tabulate data=Analyse_All;
	class Costat H ;
	var x1 x2 x3 x4 x5;
	table (H Costat)*(Mean STD P25 P50 P75 MIN MAX),
	      (X1 - X5)*costat; 
	run; 

The new data is in the attachements

ballardw
Super User

First thing is to address bits like this that result from your code:

WARNING: A class variable is crossed with itself in the table statement at line 807.  This may
         cause unpredictable results.
WARNING: A class variable is crossed with itself in the table statement at line 807.  This may
         cause unpredictable results.
WARNING: A class variable is crossed with itself in the table statement at line 807.  This may
         cause unpredictable results.
WARNING: A class variable is crossed with itself in the table statement at line 807.  This may
         cause unpredictable results.
WARNING: A class variable is crossed with itself in the table statement at line 807.  This may
         cause unpredictable results.

Tabulate is not intended to summarize a variable with itself. So this

	table (H Costat)*(Mean STD P25 P50 P75 MIN MAX),
	      (X1 - X5)*costat; 

is really not right in any form. If you want to get a separate column for COSTAT under each value of X1 to X5 then better would be to add another variable to your data set an use that instead in the second position

 

You do need to describe what is not correct about the output. It has rows of data for the levels of the COSTAT variable. You did not SHOW any values in the body of the picture so it is not possible to tell what was intended.

 

HINT: pick 1 variable such as X1, maybe 2 statistics, and two levels of H. Create a small data set with just that where you can calculate the results by relatively easily, (MAX and MIN are usually easy) may 10 rows of data. The show the entire results for that example data. Exclude for now any variables that are not involved in the specific problem.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1573 views
  • 0 likes
  • 2 in conversation