odesh
Quartz | Level 8

Hello,

In the context of PROC IMSTAT, I am trying to understand the meaning of "SCHEMA".

 

SCHEMA stores the results in a temporary table that can be accessed by using &_TEMPLAST_.

SCHEMA can also be used to join a fact table to one or more dimension tables.

 

Questions:

1. Are both of the above statements correct?

2. Is there any relationship between the two statements, or is the use of the word "SCHEMA" purely coincidental?

 

No attachments.

 

Thanks.

Odesh.

 

 

 

 

 

Accepted Solutions
Cynthia_sas
Diamond | Level 26

Hi:
  SCHEMA is a PROC IMSTAT statement. SCHEMA is not just a "word" that is used in Chapter 3. When you see SCHEMA in capital letters, the reference is to the SCHEMA keyword that begins a statement designed to be used with PROC IMSTAT. You would not, for example, use the SCHEMA statement in PROC PRINT or in a DATA step program. The SCHEMA statement is designed to be used in PROC IMSTAT as described in this class.

 

  You find the SCHEMA statement in the documentation for PROC IMSTAT here: https://go.documentation.sas.com/?docsetId=inmsref&docsetTarget=p0qtu1ckl3ymi5n1eytqxs41z7w4.htm&doc...

In the documentation, the description of the SCHEMA statement is nearly the same as in the course, although they elaborate a bit more:  "The SCHEMA statement is used to define a simple star schema in the server from a single fact table and one or more dimension tables. The result of the SCHEMA statement is a temporary table that you can use as it is, or with the PROMOTE statement."

 

  On this documentation page: https://go.documentation.sas.com/?docsetId=inmsref&docsetTarget=p0qtu1ckl3ymi5n1eytqxs41z7w4.htm&doc... you see the required arguments for the SCHEMA statement, the options and the use of the macro variable &_TEMPLAST_.

 

  There are a lot of statements in PROC IMSTAT, and the course does not cover all of them in a single chapter. The discussion in the class is organized by function, and the PROC IMSTAT statements in the class are covered in Chapter 2 and Chapter 3.

 

  In Chapter 2, most of the statements are discussed in terms of Data Exploration functions. In the Chapter 2 lecture entitled PROC IMSTAT Selected Statements, you see a list of some, but not all, of the statements that are used by PROC IMSTAT. Since this chapter is focused on Data Exploration, the SCHEMA statement syntax for PROC IMSTAT is not discussed in Chapter 2.

 

  However, in Chapter 3, more PROC IMSTAT statements are discussed in the context of Data Manipulation functionality, which involves creating new columns and joining tables.

  In the lecture in Chapter 3, entitled Data Management: Common Statements, the lecture echoes the documentation when it says: "The SCHEMA statement defines a star schema in the server from a single fact table and one or more dimension tables."

Then, in Chapter 3, the lecture entitled Schema Joins and Temporary Tables explains that when you use the SCHEMA statement in PROC IMSTAT "The name of the most recently generated temporary table is also going to be stored as a macro variable. And the name of that macro variable is called _TEMPLAST_."


  The use of "SCHEMA" in Chapter 3 is not coincidental. The SCHEMA statement is part of PROC IMSTAT. Chapter 2 and Chapter 3 are discussing PROC IMSTAT statements and usage. If you use PROC IMSTAT to make a temporary table (using the SCHEMA techniques for data manipulation shown in Chapter 3), then you get the name of that temporary table in a macro variable, as described in the class and in the documentation.

 

  The Chapter 3 lecture on the SCHEMA statement, entitled Data Management: Common Statements, ends with a note that says "The star schema is really the only choice that we have when we're trying to join tables together inside of PROC IMSTAT." You need to define that star schema with the SCHEMA statement. So the references to the SCHEMA statement in Chapter 3 are not coincidental.

 

  A large portion of Chapter 3 is discussing the SCHEMA statement as part of PROC IMSTAT syntax.

 

  If all you need to do is data exploration on your data, then you would be more likely to use the PROC IMSTAT statements covered in Chapter 2.

 

  However, if you need to do data manipulation or to join tables, then you will need to understand the additional statements, like the SCHEMA statement covered in Chapter 3. And, if you are going to use the SCHEMA statement with PROC IMSTAT, then Chapter 3 explains what options you use with the SCHEMA statement and how it behaves.

 

  I've revised your assertions about the SCHEMA statement for clarification. You originally had this:

SCHEMA stores the results in a temporary table that can be accessed by using &_TEMPLAST_.

SCHEMA can also be used to join a fact table to one or more dimension tables.

 

and I have revised them for clarification purposes in this way:

The SCHEMA statement in PROC IMSTAT stores the statement results in a temporary table that can be accessed by using the macro variable &_TEMPLAST_. Every time that a temporary table is created, the contents of the macro variable are overwritten with the name of the most recent table.

       

The SCHEMA statement in PROC IMSTAT is what we use to define a simple star schema in the server to join a single fact table to one or more dimension tables. This star schema is the only choice that we have when we're trying to join tables together inside of PROC IMSTAT.
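
To show how the two revised statements fit together, here is a rough sketch rather than an example from the course. It assumes that LASR1 is a libref you have already assigned with the SASIOLA engine. The table names (MAILORDER, CATALOG, PRODUCTS), the key columns (CatCode, PCode), the macro variable STAR_TABLE, and the promoted name MAILORDER_STAR are all made up for illustration, so please check the syntax against the SCHEMA statement documentation for your release before relying on it.

```sas
/* Sketch only: LASR1 is assumed to be an already assigned SASIOLA    */
/* libref. MAILORDER, CATALOG, PRODUCTS and their key columns are     */
/* hypothetical in-memory tables, not tables from the course.         */
proc imstat;
   /* The active table plays the role of the fact table. */
   table lasr1.mailorder;

   /* Each dimension is written as: table (dimension-key = fact-key). */
   schema catalog  (CatCode = CatCode)
          products (PCode   = PCode);
run;

   /* &_TEMPLAST_ now holds the name of the temporary table that      */
   /* SCHEMA created. Save it, because the next temporary table       */
   /* overwrites the value.                                           */
   %let star_table = &_templast_;

   /* Make the temporary table the active table and look at it.       */
   table lasr1.&star_table;
   columninfo;
run;

   /* Optionally keep the result as a regular in-memory table.        */
   promote mailorder_star;
run;
quit;
```

The same SCHEMA statement does both things from your original two sentences: it performs the fact-to-dimension join, and the joined result is the temporary table whose name lands in &_TEMPLAST_.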

 

  I hope this helps explain that the purpose of the SCHEMA statement in PROC IMSTAT is to create a temporary table, and that one feature of creating your temporary table with the SCHEMA statement is that the name of the most recently created temporary table is stored in a macro variable for ease of reference.

 

Cynthia


odesh
Quartz | Level 8

Thanks Cynthia. This is a nice detailed discussion. I can see the relationship now.

 

Odesh.