BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sergie89
Quartz | Level 8

Hello guys,

 

We have a Greenplum db connection on our SAS Visual Analytics environment. We want to load some tables from this database into LASR and refresh the data frequently (5 - 10 minutes).

 

Does anyone has experience whit this kind of stuff ? What is the best opportunity to perform this action ?

 

Thanks in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
SASKiwi
PROC Star

You've chosen an interesting approach to automatically generate a load program. What does the program do? Do a complete reload of the table each time? If so I would be careful with this approach as it may not be very efficient as your data volumes grow. You may want to consider an incremental load process that updates the existing LASR with new data only. Alternatively you could start with a simple load program like you have generated and then see how it performs. It is a lot quicker to get started this way.

 

My personal preference would be to manually write a SAS program to do this so you get better control and can do things like processing loops and incremental loads. For incremental loads you can use the IMSTAT procedure:

https://documentation.sas.com/?docsetId=inmsref&docsetTarget=p10dosb1fybvpzn1hw38gxuotopk.htm&docset...

The downside of this approach is a greater skill level and effort is required to build it and get it working reliably.

View solution in original post

15 REPLIES 15
SASKiwi
PROC Star

Probably the best approach would be to schedule a batch job on VA that runs in a loop to load the data then sleeps for 5-10 mins then repeats the loop. Take a look at the SAS SLEEP function:

https://documentation.sas.com/?docsetId=lefunctionsref&docsetTarget=p0a6vn2td7bjr2n1viy8y4lgvq61.htm...

 

I would probably run one batch job per day, with code that loops for 24 hours then exits. That way logs wont get too large.

 

How much data is loaded in a typical load? Do you need to update, add or delete rows? What version of VA are you using? 

sergie89
Quartz | Level 8

Hi @SASKiwi,

 

Thank you for your quick reply.

 

We are running on SAS VA 7.3. The tables containts sensor data from ESP (Event Stream Process). So it could be grow very fast. I think so. But for now the tables are very small (2.5 MB), because we are expirementing now. And adding new rows is enough for this action.

 

Is there a standard sas program that we can use and schedule for these kind of purposes?

sergie89
Quartz | Level 8

Hi @SASKiwi,

 

I have found the following option in SAS:

 

Knipsel.PNG

 

You can record the actions in SAS Visual Analytics as SAS statements.

 

After enabling this option you need to load once manually the regarding tables.

 

Knipsel2.PNG

 

After that SAS creates a load.sas program in the tmp directory. I can schedule this program in crontab now.

 

But is this the right way to perform this action or are other efficient ways to do this ?

 

Thank you.

SASKiwi
PROC Star

You've chosen an interesting approach to automatically generate a load program. What does the program do? Do a complete reload of the table each time? If so I would be careful with this approach as it may not be very efficient as your data volumes grow. You may want to consider an incremental load process that updates the existing LASR with new data only. Alternatively you could start with a simple load program like you have generated and then see how it performs. It is a lot quicker to get started this way.

 

My personal preference would be to manually write a SAS program to do this so you get better control and can do things like processing loops and incremental loads. For incremental loads you can use the IMSTAT procedure:

https://documentation.sas.com/?docsetId=inmsref&docsetTarget=p10dosb1fybvpzn1hw38gxuotopk.htm&docset...

The downside of this approach is a greater skill level and effort is required to build it and get it working reliably.

ashok1213
Calcite | Level 5

i reload the class dataset in sas visual analytics(LASR Server) by using the libname method.
now problem is i adding the two columns in class dataset,again i reload the class dataset by using libname method.
i open the SAS VA Dashboard,these two columns are not reflecting in SAS VA
because change the structure of class dataset.So how to rectify this problem.

SASKiwi
PROC Star

You need to run PROC METALIB to update the LASR metadata to reflect the change:

 

 proc metalib;
   omr (library = "VALASR");
   folder       = "MyVAMetadataFolderName";
   select       = ("class");
 run;
JuanS_OCS
Amethyst | Level 16

Hello @sergie89,

 

very interesting question, I do like it. So, ESP storing data in GreenPlum and Visual Analytics.

 

ESP does indeed collect a bunch of data, and quickly. You probably are also looking for quick updates in SAS VA, to show real-time data.

 

The bigguest tick is to load an initial amount of data, as small as possible, and to focus on doing quick appends more or less frequently, to the data already loaded.

 

This normally requires a collective efort, within your IT, to make it work. 

 

I would recommend, first, to have VA only to load data and show reports, not ETL transformations. To bring data from a database to SAS is already considered a transformation.

 

Having this in mind:

 

1- If you can, have a separate SAS environment for ETL (DI, BI,...) that will connect to database and load stuff into VA.

2- Maintain a local SAS datamart (in disk), in the SAS ETL server, with the data you need to load in VA.

3- Make your DB to push the new data (as few records as possible) to the SAS ETL server.

4- The SAS ETL server will have all the logic, to load full stuff, or to only append, data, into SAS VA. For appending;

proc LASR, Append option: http://support.sas.com/documentation/cdl/en/inmsref/67306/HTML/default/viewer.htm#p0d2qm1tabupzkn19s...

proc IMSTAT, update options http://support.sas.com/documentation/cdl/en/inmsref/68736/HTML/default/viewer.htm#n1o5jdcmtr41jxn1ml...

5- (optional) When available for you, I highly recommend to move from VA 7.X to the new VA 8.x in SAS Viya. Viya has a great integration with ESP, regarding performance and many other features.

 

Hope it helps,

 

Kind regards,

Juan

sergie89
Quartz | Level 8

Hi guys @SASKiwi @JuanS_OCS,

 

Thank you for your suggestion.

 

@JuanS_OCSESP is running on a separte environment and writes data directly to Greenplum database. We are using SAS/ACCESS Interface GreenPlum in VA to connect the Greenplum database. We want to load some tables from this database to LASR and update the data frequently.

 

@kiwi

 

I have tried to write a SAS program to perform the load.
Proc LASR, Append option works fine, but proc IMSTAT, update option not. Is there something wrong with my code? 

 

 

/* Load into server */                                                                                                                                 
LIBNAME greenplum GREENPLM  DATABASE=  SERVER=""  SCHEMA=  USER=""  PASSWORD="" ;                                           
                                                                                                                                                                                                                                                
/* Assign the target library */                                                                                                                                                                                                        
LIBNAME lasr SASIOLA  TAG=lasr PORT=10051 SIGNER="https://va.example.com:443/SASLASRAuthorization"  HOST="va.example.com" ; 


/* Optimize Load With PROC LASR */                                                                                                                                                                                                     
 PROC LASR add data=lasr.dataset noclass port=10051;                                                                                                                                                                                                
           performance host="va.example.com";                                                                                                                                                                                                                                                                                                                                                                                                                      
 RUN;   

 /* Add new records to the the table */        

DATA lasr.dataset (append=yes);
    set greenplum.dataset;
RUN;

PROC IMSTAT data=lasr.dataset;
        update data=greenplum.dataset;
RUN;

Thanks in advance for your help

 

JuanS_OCS
Amethyst | Level 16

Hi @sergie89,

 

a question, is VA distributed or non-distributed. I am not sure if IMSTAT does work on non distributed environments because its main functionality is for partitioned tables.

 

Leaving that bit aside, I would like to insist just once more, that it is better to have first the relevant data in the server where you run code.

 

Explanation: if you only do a libname and then doing any operation with a table that is in a a source A (say, a database A) and with a table that is in a source/destination B (say, SAS LASR or SAS BAse, or another database), what SAS does is to bring all the tables involved, full tables, do the operation, and bring the final data to the destination. Which is VERY slow and resource consuming. 

 

And that's the reason to ensure only filtered data is present before any query/join/operation between 2 tables.

 

Only then, you can run an append operation with optimal performance. And that is what you want/need, because you do not want overlapping append operations, you need them to be as quick as possible.

 

Or am I misunderstanding something here?

sergie89
Quartz | Level 8

@JuanS_OCS,

 

We have distributed VA-environment.

 

I understand you, but I am not a high level programmer, only a good SAS administrator. Smiley LOL

 

So I am not very familiar with append option. Indeed, what we want to do is check if there new records are available in the source table and load only the new records to the target table.

 

What kind of modifications should we do if we want only append the new records to the target table ? Can I use a where clause with datetime to filter the records?

 

 

JuanS_OCS
Amethyst | Level 16

Hi there again @sergie89,

 

I would say I might have a similar profile. Not the best programmer either, just a geek-y that is into architecture of SAS.

 

About IMSTAT, it should work, then. What error do you get?

 

You can see below a small modification to the code, that I am sure many peers can improve it even more.

 

As you can see, first bringing the relevant data to local, then updating. Perhaps it would be interesting to keep record of from where to where do you get data, in a file or something else. Then you know what was the update. The date and time is a good idea.

 

 

/* Load into server */                                                                                                                                 
LIBNAME greenplum GREENPLM  DATABASE=  SERVER=""  SCHEMA=  USER=""  PASSWORD="" ;                                           
                                                                                                                                                                                                                                                
/* Assign the target library */                                                                                                                                                                                                        
LIBNAME lasr SASIOLA  TAG=lasr PORT=10051 SIGNER="https://va.example.com:443/SASLASRAuthorization"  HOST="va.example.com" ; 

/* Assing a local library */
LIBNAME local BASE "/..../my_library_folder";
/* Optimize Load With PROC LASR */ PROC LASR add data=lasr.dataset noclass port=10051; performance host="va.example.com"; RUN;

/* Get new records from Greemplum to local */
data local.mytable (keep= .... drop=...); /* interesting for improving even more the query */
set greenplum.mytable (where= );
run;

/* Add new records to the the table */ DATA lasr.dataset (append=yes); set local.dataset; RUN; PROC IMSTAT data=lasr.dataset; update data=local.dataset; RUN;

 

 

 

 

 

SASKiwi
PROC Star

@sergie89 - What does the SAS log report for your IMSTAT step?

sergie89
Quartz | Level 8

Hi @SASKiwi,

 

It's very strange. When I run the code bellow,  I get the error message:"data could not be located in the LASR". However the table is already loaded in LASR. 

 

41         PROC IMSTAT data=lasr.raw_sensor_data;
42                 update data=greenplum.raw_sensor_data;
43         RUN;

 

ERROR: The table greenplum.RAW_SENSOR_DATA could not be located in the LASR Analytic Server image.
WARNING: The LASR Analytic Server action request for the UPDATE statement did not complete successfully.

 

SASKiwi
PROC Star

How did you load the table in the first place? If you used just a DATA step then you also need to register the table in SAS metadata - this is a necessary requirement for all LASR data. You use PROC METALIB to register tables:

https://documentation.sas.com/?docsetId=lrmeta&docsetTarget=p0x3vb146tx931n1olxsy430zepl.htm&docsetV...

 

 proc metalib;
   omr (library = "LASR");
   folder       = "MyVAFolder";
   select       = ("raw_sensor_data");
 run;

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Tips for filtering data sources in SAS Visual Analytics

See how to use one filter for multiple data sources by mapping your data from SAS’ Alexandria McCall.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 15 replies
  • 5779 views
  • 11 likes
  • 4 in conversation