BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
bohonghong
Fluorite | Level 6

Hi, I tried to get just 10 observations sample from a large database server. but I had to wait 8 hours to get these 10 observations. Is there a quick way to get it?

 

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
Kalind_Patel
Lapis Lazuli | Level 10

Hi @bohonghong;

 

Please refer this document for speed up your processing;

Because too many parameters we have to check to speed up the processing;

this link will give you proper idea about PROC SURVEYSELECT and its options to speed up this procedure,

Link : http://www2.sas.com/proceedings/forum2007/183-2007.pdf

Thanks.

View solution in original post

16 REPLIES 16
mohamed_zaki
Barite | Level 11

How large is your data? What is you DB? What code you used for that?

bohonghong
Fluorite | Level 6

I couldn't find how many observation on my server database, the ROW showing Unknown. and it is created like 1960.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Would need to know further information to help, DB used, code etc.  However, if you just want 10 observastions from a database, why not just go onto the database, runs some SQL and drop the results into a CSV.  Probably quickest option unless you are planning to do it regularly?

bohonghong
Fluorite | Level 6
I actually do not know the DB meaning, the code is PROC SURVEYSELECT DATA=ECLIB000.JAN2016()
OUT=WORK.RANDRandomSampleJAN2016
METHOD=SRS
N=10;
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Sorry, you have totally lost me there.  You said you were having problems getting data from a database (DB) server.  So I have responded with, why not go back to the database server and extract the data you want.  You have now posted something about a proc surveyselect, which is a SAS procedure?  Please clarify what you have problems with.  I would also check, do you extract the data to a SAS dataset first, or are you doing this surveyselct on libnames to a database?  Post your full code if possible.

bohonghong
Fluorite | Level 6
You know under the data tab when you open the table in EG. There is a Random Sample choice, right, and then you click run. This code submitted on the server database. but it took really long just for one database to get 10 observations.
bohonghong
Fluorite | Level 6
It is a ramdon sample I need, because the database have duplicates. I cannot have 10 observation which came from one person from different year.
LinusH
Tourmaline | Level 20

This is about query performance, so EG is probably not the "problem". Are you using the Query Builder?

 

When you say "large database server", can you be a bit more specific? Vendor/version...

 

Share the code/log from your not so well performing query.

Chances are that your subset criteria is not evaluated in the source database, so that all records are moved to SAS.

Add 

options sastrace=',,,d' sastraceloc=saslog nostsuffix msglevel=i;

to your code, and your will get information in log on what is going on, and where.

 

It may also depend on how the data base table is defined (lack of indexes, no/old index statistics, no/bad partitioning).

Data never sleeps
bohonghong
Fluorite | Level 6
I see, I need to build a model from two small samples (merged later on) from 2 Master Server: SASApp data base. The observations from both data bases are Unknown.

so when I run random sampling, it run more than 8 hours. I just thought why not just copy the table row and put on excel.
LinusH
Tourmaline | Level 20

SASApp is the "name" of your SAS server where your code executes.

It's your library definition that tells you where your data is actually stored. Try:

libname _all_ list;

to get information about your source data base.

If it's non-SAS, you have two major options:

  • Move the data to SAS, then execute your sampling. A good option if you will do this frequently
  • Take your sampling to the database, using SAS in-database technology. Requires a separate license, and an investigation if your set of code can benefit from this.

If this is SAS data, you definitely need to look at the SAS server, physical data storage and CPU/memory capacity.

Add:

options fullstimer;

to better analyse your SAS code excution.

Data never sleeps
bohonghong
Fluorite | Level 6
So how do I know if it is non-SAS? I got Libref= ODS
Scope= IOM ROOT COMP ENV
Engine= ORACLE

Is it not sas?
bohonghong
Fluorite | Level 6

Well now. When I wanna check the log. It is blank and my SAS EG is now frozen, it won't minimize and I had to use task manager to close it. Bummer

LinusH
Tourmaline | Level 20

No, it's Oracle.

Data never sleeps
Kalind_Patel
Lapis Lazuli | Level 10

Hi @bohonghong;

 

Please refer this document for speed up your processing;

Because too many parameters we have to check to speed up the processing;

this link will give you proper idea about PROC SURVEYSELECT and its options to speed up this procedure,

Link : http://www2.sas.com/proceedings/forum2007/183-2007.pdf

Thanks.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 16 replies
  • 1515 views
  • 3 likes
  • 6 in conversation