Consider the dataset SASHELP.CLASS. It has a set of variables. If I want to add a new variable to make a new version of the CLASS dataset, I have to:
1) Use a DATA step program (or a PROC SQL program)
2) Read in the existing dataset (with a SET statement or a FROM, if SQL)
3) Create my new variable with an assignment statement
newage = age + 5;
proc print data=work.newclass;
In this program, my "created" variable is called NEWAGE. It is based on adding 5 to the student's current age to find out how old the student will be in 5 years.
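Putting the three steps together, the whole program would look something like the sketch below. The DATA step uses only the names mentioned in this post. The PROC SQL version is my own illustration of the "FROM, if SQL" route, and the table name WORK.NEWCLASS_SQL is made up for the example.

```sas
/* Step 1: a DATA step that creates WORK.NEWCLASS */
data work.newclass;
  /* Step 2: read the existing dataset */
  set sashelp.class;
  /* Step 3: create the new variable with an assignment statement */
  newage = age + 5;
run;

proc print data=work.newclass;
run;

/* Same idea in PROC SQL (NEWCLASS_SQL is a hypothetical name) */
proc sql;
  create table work.newclass_sql as
  select *, age + 5 as newage
  from sashelp.class;
quit;
```

Either way, SASHELP.CLASS itself is left untouched; only the WORK copy gets the extra column.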
The input dataset is SASHELP.CLASS and the output dataset, WORK.NEWCLASS, will be an exact copy of the original dataset, but with one new variable. After you are sure the new version of the data is correct and is the way you want it, you could rename the new dataset or use it to replace the original. I don't generally recommend that you replace your original data with changed data until you are sure that all your program logic worked correctly.
That is the beauty of having WORK files. You can experiment and code and try things without losing your original file. Then, once things are working correctly, you can use utility programs to perform any housekeeping or renaming that you might want to do.
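As one example of that kind of housekeeping, PROC DATASETS can rename a dataset in place. This is only a sketch: I'm assuming WORK.NEWCLASS already exists from an earlier step, and the new name CLASS_PLUS5 is just a placeholder I picked for illustration.

```sas
/* Rename WORK.NEWCLASS to WORK.CLASS_PLUS5 (hypothetical new name) */
proc datasets library=work nolist;
  change newclass = class_plus5;
run;
quit;
```

The NOLIST option just suppresses the directory listing in the log. Other utility procedures could do the job too; this is simply one common choice.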
Ah, I see, you want to add observations, not just create new variables in an existing data set. There are several ways to do it. One would be to read the "starting" data from an input file and another way would be to have a data step program with multiple output statements.
length employee $16;
proc print data=empfile;
In the above program, WORK.EMPFILE is the dataset being created. There are 2 variables: EMPLOYEE and AGE. Four observations are being created (4 OUTPUT statements). The program starts with the keyword DATA and ends with the keyword RUN. The program statements will execute one time, but every OUTPUT statement creates an observation to be written to the output dataset.
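For reference, a program of that shape could be sketched like this. The dataset and variable names come from the description above, but the employee names and ages are placeholders I made up for the example.

```sas
data empfile;
  length employee $16;
  /* Each OUTPUT statement writes one observation (values are made up) */
  employee = 'Alice'; age = 34; output;
  employee = 'Bob';   age = 41; output;
  employee = 'Carol'; age = 29; output;
  employee = 'Dave';  age = 52; output;
run;

proc print data=empfile;
run;
```

Because the step has explicit OUTPUT statements, SAS does not add the usual implied output at the bottom, so you get exactly four observations.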
Alternately, I could read the data from "inline" data lines, like this:
proc print data=empfile2;
title 'From "inline" data';
In this second program, the program statements start with the keyword DATA and end with the keyword RUN. The data that will become the observations comes after the DATALINES statement. SAS will loop through the DATA step statements one time for every line of data. There's no explicit OUTPUT statement in this program because there is an implied output at the "bottom" of every DATA step program.
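A minimal sketch of an "inline" data program like the one described is below. The dataset name and title come from this post; the four data lines are placeholder values of my own.

```sas
data empfile2;
  length employee $16;
  /* Executes once per data line; each pass ends with an implied output */
  input employee $ age;
  datalines;
Alice 34
Bob 41
Carol 29
Dave 52
;
run;

proc print data=empfile2;
  title 'From "inline" data';
run;
```

The lone semicolon after the last data line marks the end of the data.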
If the data lines were stored in a file on disk, c:\temp\mydata.txt, then the above program could be written as:
data empfile3;
  length employee $16;
  infile 'c:\temp\mydata.txt';
  input employee $ age;
run;
proc print data=empfile3;
title 'From data on disk';
run;
There are many different forms that the INPUT statement could follow, depending on how the data lines are stored in the disk file. You could even read delimited data, CSV data, Excel files, or other types of files by using different options and statements.
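For example, a comma-separated file could be read with a few INFILE options. This is a sketch under assumptions: the path c:\temp\mydata.csv and the dataset name EMPFILE4 are hypothetical, and I'm assuming the file has a header row followed by lines like name,age.

```sas
data empfile4;
  length employee $16;
  /* DSD handles quoted values and empty fields; DLM sets the delimiter;  */
  /* FIRSTOBS=2 skips the assumed header row; TRUNCOVER tolerates short   */
  /* lines instead of reading into the next record.                       */
  infile 'c:\temp\mydata.csv' dsd dlm=',' firstobs=2 truncover;
  input employee $ age;
run;

proc print data=empfile4;
run;
```

If your file has no header row, drop FIRSTOBS=2. For Excel files, you would typically look at PROC IMPORT or a LIBNAME engine rather than an INPUT statement.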
The documentation on the subject of creating a SAS dataset by reading data "into" SAS format is quite thorough and contains lots of good examples. You just need to find the example that most closely resembles your data and then read about the statements used in those sample programs.