Hi I've a two part question I currently have a access table that I need to import into sas. The table has over 100k records . The Access table is in a share directory which takes a long time to do an import (Proc Import- (currently it freezes my laptop) I tried Libname an its brings in everthing fast ( which is great) the issue is there are a lot tables in the the mdb that when i click on the libname it takes on avg 25-30 min to open. Is there a way I can use libname to refer to one table named "TEST" only ? and not everything in there. What I want to do is on table 'TEST" I need to pull the last 30 days worth of data. Is there a way to use proc import and do a "Date" -30 or in libname select table "TEST" and do a -30? Thanks for your assistance
Your query should read:
create table depcount as
select Terminal_ID, count(Terminal_ID) as Repeat
from MyData
where ON_SCHEDULE = "False"
group by Terminal_ID;
This counts the number of records for each Terminal_ID that have ON_SCHEDULE = "False".
Can you post the part of the SQL job that pulls the MS-Access data into a local table? I find it strange that it takes so long. Please include the table record count as well.
PG
Try ACCESS=READONLY and DEFER=YES in your libname statement. It might speed things up.
Another way would be to try the SQL pass-through facility.
Another way would be to create a small ACCESS database with only the tables that you want as linked tables.
PG
Is there a way in doing the libname I can filter for only last 30 days worth of data? Where would I insert ACCESS=READONLY and DEFER=YES what I hV E NOW IS " libname Access c:\ Complinance.mdb;"
libname myLib Access "c:\ Complinance.mdb" READONLY DEFER=YES;
Data selection cannot be done in the libname statement.
PG
The best method would be to use the SQL Pass-through facility because the subsetting would be done by Access. On my machine (Win-XP, Access 2007, SAS 9.3) the following works:
proc sql;
connect to Access as db (path="C:\Documents and Settings\******\myDatabase.accdb");
create view myData as
select * from connection to db (select * from myTable where myTable.date >= int(now())-30 );
quit;
proc print data=myData; run;
The query being stored in a view, it is run only when you run the proc print. Note that the SQL in parenthesis must be valid Microsoft SQL, not SAS SQL.
PG
Hi PG Stats
I plug in your code an did test run on a access db I 'v on my desktop an it worked great. When I tried it on the real Access db I rec this error.Does it matter type of access it was to a .mdb not a accdb? The "Settle" is in a date/ time format dd/mm/yy 00:00:00. The real Path is to a shared directory
ERROR: Prepare: Syntax error in query. Incomplete query clause.
SQL statement: select * from "tbl_dep" where tbl_dep.Settle
>= int(now())-30
ERROR: SQL View WORK.MYDATA could not be processed because at least one of the data sets, or views,
referenced directly (or indirectly) by it could not be located, or opened successfully.
Thanks Again for all your help
"tbl_dep" is not a valid table name in MS-Access, use [tbl_dep] or simply tbl_dep.
PG
proc sql; | ||
33 create table depcount as | ||
34 (select | ||
35 Terminal_ID, | ||
36 Count(Terminal_ID) As Repeat | ||
37 from Mydata | ||
38 Where ON_SCHEDULE eq 'False' | ||
NOTE: This SAS session is using a registry in WORK. All changes will be lost at the end of this | ||
session. | ||
NOTE: This SAS session is using a registry in WORK. All changes will be lost at the end of this | ||
session. | ||
ERROR: Maximum level of nesting of macro functions exceeded. | ||
ERROR: Maximum level of nesting of macro functions exceeded. |
I can't make sense of that log. I remember getting the same type of messages when I tried to run two sessions at once.
But yes, SAS views can be used like normal tables. Sometimes, it is preferable to transform them into local temporary tables (replace the word view with table in the query) to avoid querying the remote database each time you use them.
PG
Hi PGStats,
I change it to table from view an now its avg about 1hr to pull data. I might have to learn to live with that unless you have any suggestion it has over 500k records an grows everyday a little... I tried creating a count of all the TID that have "false" an it states remerging what do you think is causing that?
175 create table depcount as |
176 (select |
177 Terminal_ID, |
178 Count(distinct Terminal_ID) As Repeat |
179 from Mydata |
180 Where ON_SCHEDULE eq 'False' |
181 ) |
182 ; |
NOTE: The query requires remerging summary statistics back with the original data. |
NOTE: Table WORK.DEPCOUNT created, with 15 rows and 2 columns. |
Your query should read:
create table depcount as
select Terminal_ID, count(Terminal_ID) as Repeat
from MyData
where ON_SCHEDULE = "False"
group by Terminal_ID;
This counts the number of records for each Terminal_ID that have ON_SCHEDULE = "False".
Can you post the part of the SQL job that pulls the MS-Access data into a local table? I find it strange that it takes so long. Please include the table record count as well.
PG
Part1)This takes 1 hr on avg to pull the table has 500k records. When the table gets created with the 30 filter it produces 23k records
procsql;
connectto Accessas db (path="g:\Shared Databases\CCMS 2\CCMS2Production.mdb");
createtable myDataas
select *from connection to db (select * from [tbl_deposit]
where tbl_deposit.Scheduled_Settle >= int(now())-30 );
quit;
2nd Part) If I want to add an additional criteria " And Repeat >1" is just an "AND"?
SAS is a funny program how would I add wild card too pick up all types of False(I.E false,False,FALSE)
procsql;
createtable depcountas
select Terminal_ID,Count(Terminal_ID)As Repeat
from Mydata
Where ON_SCHEDULE ="FALSE"
groupby Terminal_ID;
quit;
tHANKS AGAIN PGSTATS
PGSTATS,
How do I select for correct answer ?
Hello BETO, sorry to take so long...
Part 1) Now I understand. It is the part where Access, running on your machine, accesses your database on a network that takes so long. The only way to improve that I can think of, would be to have an index on variable Scheduled_Settle,. if it doesn't already exists.
Part 2) where upcase(on_schedule) = "FALSE" and repeat > 1
good luck
PG
Hi PGStats,
Thank you so much for your assistance with my question... BTW I fig out how to reduce the time of the import from 1hr to 7 min.. I went into the table an change all the text columns in access to 'Unicode compress" .... How can I select that the question has been answer? Thanks a bunch
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.