Hi all the data is below. Assume this is excel file sheet1 data.
Employee Details on 10/03/2015 | ||||
eno | ename | salary | ||
1 | aaa | 1000 | ||
2 | bbb | 2000 | ||
3 | ccc | 3000 | ||
4 | ddd | 4000 | ||
5 | eee | 5000 | ||
6 | aaa | 6000 | ||
7 | bbb | 7000 | ||
8 | ccc | 8000 | ||
9 | ddd | 9000 | ||
10 | eee | 10000 | ||
next day 10 records are added | ||||
next day 10 records are added | ||||
Student Details on 10/03/2015 | ||||
std_id | std_name | marks | ||
101 | aaa | 50 | ||
102 | bbb | 60 | ||
103 | ccc | 95 | ||
104 | ddd | 82 | ||
105 | eee | 96 | ||
106 | aaa | 74 | ||
107 | bbb | 52 | ||
108 | ccc | 38 | ||
109 | ddd | 40 | ||
110 | eee | 64 | ||
next day 10 records are added | ||||
next day 10 records are added |
Here two types of data is available. employee and student.
we read this data into sas with two different datasets emp and std;
the code of reading these is as follows.
PROC IMPORT DATAFILE="C:\Users\SYSTEM1\Desktop\Book2.XLS"
OUT=EMP
DBMS=EXCEL
REPLACE;
SHEET='SHEET1';
RANGE='SHEET1$A2:C12'
GETNAMES=YES;
RUN;
PROC IMPORT DATAFILE="C:\Users\SYSTEM1\Desktop\Book2.XLS"
OUT=STD
DBMS=EXCEL
REPLACE;
SHEET='SHEET1';
RANGE='SHEET1$A20:C30'
GETNAMES=YES;
RUN;
So now by next day employee data 10 records added to existing one. and also student details are added.
What my question if we know the range then simply mention in range option. But if we don't know the range then how the total data is added to the existing one. how to automate this. The variables are same but rows are added day by day.
If i am running the code next day then automatically the new are added to existing one. i.e. we need total data of employee including new records.
and same type of student.
All your valid comments are welcome.
Thanks in advance.
Well, my first answer is going to be remarkably similar to one I give whenever people mention Excel, just don't. Over the last 20+ years, development has been ongoing in the field of databases, standard, xml based etc. These are specifically designed and built to capture data, incrementally, with audit trails, with data management functionality, with entry screens, with access control etc. SAS can directly link into these via drivers.
Use the right tool for the right task.
Excel tends to be used as its available, flexible and most people have a basic understanding of it. However it is in no way a database, data management system, auditable, or structured, and the use of it will just end up causing you more work than it saves. In your above sample, what happens when you add a column? Or put text in eno, or leave it missing, or put a picture in ename, or remove ename? Excel's flexibility is its biggest problem when talking about data.
However if your mindset on going down the route of banging in a nail with a stick of wet celery then you could do incremental updates by:
1) read in base file to main dataset
2) read in file from the next day to a temporary dataset
3) Have a merge from main dataset and temporary dataset where in=only from temporary dataset.
The question is what happens if existing data changes, or is removed etc. A DB would handle all that for you, however you would now need to do that yourself. There are some helpers: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000202975.htm
But you still need to do a lot yourself.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.