BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Mikey03
Obsidian | Level 7

Hi,

 

I have a general question about the dataset that is read into a graphic procedure like SGPlot.

 

When building a visualisation across several datasets (for example AE, Dosing plus reflines) do you have to build one highly denormalised dataset that spans all the source domains or is their an ability to join the source datasets within the proc statement?

 

I come from a database build perspective and dislike the idea of building the highly denormalised dataset just to feed it into a proc statement. From my point-of-view it would be an advantage if the "data=" assignment could reference a SQL join statement. I understand that a work area temp table can be built for this purpose but for me this isn't a 'good practice' method.

 

If anyone has seen a white paper that provides background in this area of building visualisations please let me know.

 

Thanks for your assistance. (Using SAS 9.4)

1 ACCEPTED SOLUTION

Accepted Solutions
GraphGuy
Meteorite | Level 14

Here are two other techniques (with traditional SAS/Graph), which I often use in complex graphs and dashboards that need to pull data from multiple sources, that don't require combining the data all into 1 dataset ...

 

I often use an annotate dataset to overlay layers onto a graph (a simple example would be overlaying a line on a bar chart).

 

And I often create several individual charts from single datasets, and then use Proc Greplay to overlay them together (either on top of each other, or tiled as a dashboard).

 

View solution in original post

8 REPLIES 8
Jay54
Meteorite | Level 14

The SGPLOT procedure leaves the data manipulation to SAS.  SAS provides many procedures and the Data Step for you to combine your data in the way you want.  SGPLOT only plots the data.  For your specific issue, I presented a paper last year on how to combine the data to produce a combined AE+CM graph.  You could leverage the ideas to combine other data over a common time line.

Reeza
Super User

I come from a database build perspective and dislike the idea of building the highly denormalised dataset just to feed it into a proc statement.

You could create a view instead but to be honest standard star or snowflake schemas don't provide data in a form that can be used for statistical analysis  or machine learning, in general. So rather than a lot of views analysts generally prefer that highly denormalized data set for analysis. 

 

At least SAS uses standard table structure, the other languages, D3 and JavaScript libraries like JSON and XML and those make my head hurt. 

PGStats
Opal | Level 21

SAS graphing procedures read data from a single table. You can avoid creating a temporary table with a SQL view which is run only when the graphic proc is executed.

PG
Mikey03
Obsidian | Level 7

Do you have an example of a SG procedure driven by a view? Would the view be linked to the "data" statement?

 

I assume using GTL will also give you more control over the input dataset but for the moment I would like to understand the SG procedures.

Reeza
Super User

GTL still needs only one data set, you use SGRENDER to create the graph in the end and it also only handles one data set.

Useless example here:

 

data demo / view=demo;
Set sashelp.class;
Run;
Proc sgplot data=demo;
Scatter x=height y=weight;
Run;
PGStats
Opal | Level 21

Here is an example combining two SASHELP datasets into a SQL view

 

proc sql;
create view allBMI as
select 
    "Heart Study" as source,
    ageAtStart as age,
    (weight*0.45359237)/(height*0.0254)**2 as BMI
from sashelp.heart
union all
select
    "NCHS",
    age,
    BMI
from sashelp.bmimen;
quit;

proc sgplot data=allBMI;
scatter y=bmi x=age / group=source markerattrs=(size=3) jitter;
run;
PG
GraphGuy
Meteorite | Level 14

Here are two other techniques (with traditional SAS/Graph), which I often use in complex graphs and dashboards that need to pull data from multiple sources, that don't require combining the data all into 1 dataset ...

 

I often use an annotate dataset to overlay layers onto a graph (a simple example would be overlaying a line on a bar chart).

 

And I often create several individual charts from single datasets, and then use Proc Greplay to overlay them together (either on top of each other, or tiled as a dashboard).

 

Mikey03
Obsidian | Level 7

Thanks for the overlay idea. It sounds like it may get around the requirement to denormalise datasets.

 

In one of my graphs I have a dataset that is used to create reflines - the number of lines can be anything between 0 through 15 per plot. So currently i am joining this refline dataset with the "main" dataset and have 15 denormalised columns (one for each refline.) It sounds that using this overlay approach I dont need to join the datasets.

 

I will look into your 2 techniques - hope they arent too difficult to follow!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1073 views
  • 2 likes
  • 5 in conversation