I move from Relational Database and new to SAS.
Now I need to import several CSV files into SAS, and there are relationships between those files. Just like the relationships between tables in database.
I am wondering, is there the same concept exits in SAS such as the foreign key which I need to set up, or I just import those files directly regardless of the relationships because no such things in SAS?
You cannot create this kind of constraint between tables in SAS. This represents really poor design. Why does Proj_ID take 10 characters if only 4 are needed? Why create a link based on substrings? Fix this (if you have the opportunity) by splitting the primary index in table Response into two fields.
PG
probably not the way that you do as a database admin
Here is one programmer's ideas about identifying table types
http://www.sascommunity.org/wiki/Database_Design_References
there is no exact mechanism in SAS
to link a dimension table to a column that is a foreign key.
the closest is a format
the association is made in two steps
1.
proc format;
value f_key_st 'AL' = 'Alabama'
'FL' = 'Florida';
2. assign format to variable
DATA X;
attrib st length = $2 format = f_key_st.;
be aware that proc format creates a catalog (compiled code) from the value statement.
In data x there is no 'link', only a reference to the fact
that there is an in-memory look-up table,
which is available when displaying the data via procedures.
3. data sets of dimension tables can be converted to formats
using the cntlin option
proc format cntlin=library.dimension_st;
RTFM because the data structure of a format control data set is constrained.
Hmm, formats instead of referential integrity...?
Please.
If you are working with Base SAS libraries, you can use constraints that enforces referential integrity. Question is, do you really need to do that? Often FK constraints can be obsolete if you are using look.up operations during load.
SAS(R) 9.4 Language Reference: Concepts, Fourth Edition
You can define general and referential constraints for datasets:
SAS(R) 9.3 Language Reference: Concepts, Second Edition
And the Query Builder in SAS Enterprise Guide will usually try to identify which variables can be used as keys for joins.
The joins are exactly what I need.
SAS isn't going to "automatically" link anything but you can define views with either data step code or proc sql to duplicate some of that behavior.
I am still blurring now. I just start from yesterday on SAS with 0 experience, and I have a task need to complete in 3 days.
So, suppose I have two data sets: Project and Response.
data structure for Project and Response, and their relationship:
footnote #4 says: The first 4 characters of Resp_ID represent relevant Proj_ID.
Sample for Proj_ID: P001
Sample for Resp_ID: P00101.
How could I set up link to those two in SAS?
My first attempt would involve the existing DBMS and see if there is a schema describer whatever they might call it that would have SQL code hiding what is done with Info_project and Response tables. If you can find SQL code there then that is likely relatively easy to translate to a SAS Proc SQL statement to do similar.
SAS doesn't, to my knowledge at least, maintain any sort of background connections that just happen to exist. They pretty much have to be explicitly stated, though that can be a pretty complex statement.
Some thing like this (using aliases to avoid LOTS of table references after the tables Info_project and Response have been created in a SAS library named Test)
proc sql;
create view test.project_response as
select a.*, b.* /* some don't like use of * but this is example code*/
from test.Info_project as a join test.Response as b on a.Proj_id = substr(b.Resp_ID,1,4);
quit;
Would have view that could be accessed with all of the info in combined tables and updates if the tables are updated. The code except for the line with the "create view" could be used as a nested query involving further joins. Being a pessimist sort, I would test each piece and level before nesting the queries. The aliases at that point are not the best idea from a documentation point as the nesting levels increase.
Nice explanation. Help me a lot.
You cannot create this kind of constraint between tables in SAS. This represents really poor design. Why does Proj_ID take 10 characters if only 4 are needed? Why create a link based on substrings? Fix this (if you have the opportunity) by splitting the primary index in table Response into two fields.
PG
This is really a good idea. Split them up.
Why do you need to setup "relations" between the datasets? Are you cleaning the data? Are you trying to create a transactional system using SAS?
Or do you just want to load a copy of the data that someone has sent you? In the later case why would you need to create relations?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.