Hi everyone,
I am completely new to SAS (University Edition, newest version) and actually programming as well, so probably these questions have arised many times before (but I have googled for hours, watched YouTube videos and am still stuck with the basics).
Just for the background, I have to analyze a relatively complex data set, split into several tables, for the university.
In the first step, I have to merge "Table 1" with "Table 3". There is not a single column in both tables which can be used for the data merge, but I have a linking table, "Table 2", which assigns a specific identification Key from Table 3 to the identification Keys from Table 1 and allows me to merge both tables. I would start with merging Table 2 and Table 1, so that I can merge Table 1 and Table 3 afterwards.
To make it a litte bit more complex, the data in Table 1 is given on a quarterly basis, with the "Date" column being in the format "31MAR2003". The data in Table 3 is given on a yearly basis, with the "Year" column being in the format "2003". I would like to assign, based on the identification key, the yearly datapoints of Table 3 to every quarterly datapoint in Table 1, of course just for the quarters which belong to the respective year (so that the data from Table 1 for the year 2003 is assigend to every quarterly datapoint, within 2003, of Table 1).
I hope that this is comprehensible, otherwise just ask me for clarifications.
Thank you in advance!!
If you use the GUI tasks to join the data, the code will be generated for you. This way you can play around and test the different options to ensure you get what you need.
1. Add a Query (Tasks>Utilities>Query)
2. In Tables, add both your tables
3. Click Tables again and select Join
4. Choose your join types (explained here: https://www.w3schools.com/sql/sql_join.asp)
5. Click on Columns and select columns from each table you want
6. See code generated at bottom of screen.
Rinse and repeat for next table.
Fictional example here:
@svw1900 wrote:
Hi everyone,
I am completely new to SAS (University Edition, newest version) and actually programming as well, so probably these questions have arised many times before (but I have googled for hours, watched YouTube videos and am still stuck with the basics).
Just for the background, I have to analyze a relatively complex data set, split into several tables, for the university.
In the first step, I have to merge "Table 1" with "Table 3". There is not a single column in both tables which can be used for the data merge, but I have a linking table, "Table 2", which assigns a specific identification Key from Table 3 to the identification Keys from Table 1 and allows me to merge both tables. I would start with merging Table 2 and Table 1, so that I can merge Table 1 and Table 3 afterwards.
To make it a litte bit more complex, the data in Table 1 is given on a quarterly basis, with the "Date" column being in the format "31MAR2003". The data in Table 3 is given on a yearly basis, with the "Year" column being in the format "2003". I would like to assign, based on the identification key, the yearly datapoints of Table 3 to every quarterly datapoint in Table 1, of course just for the quarters which belong to the respective year (so that the data from Table 1 for the year 2003 is assigend to every quarterly datapoint, within 2003, of Table 1).
I hope that this is comprehensible, otherwise just ask me for clarifications.
Thank you in advance!!
If you use the GUI tasks to join the data, the code will be generated for you. This way you can play around and test the different options to ensure you get what you need.
1. Add a Query (Tasks>Utilities>Query)
2. In Tables, add both your tables
3. Click Tables again and select Join
4. Choose your join types (explained here: https://www.w3schools.com/sql/sql_join.asp)
5. Click on Columns and select columns from each table you want
6. See code generated at bottom of screen.
Rinse and repeat for next table.
Fictional example here:
@svw1900 wrote:
Hi everyone,
I am completely new to SAS (University Edition, newest version) and actually programming as well, so probably these questions have arised many times before (but I have googled for hours, watched YouTube videos and am still stuck with the basics).
Just for the background, I have to analyze a relatively complex data set, split into several tables, for the university.
In the first step, I have to merge "Table 1" with "Table 3". There is not a single column in both tables which can be used for the data merge, but I have a linking table, "Table 2", which assigns a specific identification Key from Table 3 to the identification Keys from Table 1 and allows me to merge both tables. I would start with merging Table 2 and Table 1, so that I can merge Table 1 and Table 3 afterwards.
To make it a litte bit more complex, the data in Table 1 is given on a quarterly basis, with the "Date" column being in the format "31MAR2003". The data in Table 3 is given on a yearly basis, with the "Year" column being in the format "2003". I would like to assign, based on the identification key, the yearly datapoints of Table 3 to every quarterly datapoint in Table 1, of course just for the quarters which belong to the respective year (so that the data from Table 1 for the year 2003 is assigend to every quarterly datapoint, within 2003, of Table 1).
I hope that this is comprehensible, otherwise just ask me for clarifications.
Thank you in advance!!
Thank you! I think this might help a lot. But there is still a problem: When merging the tables SAS says, that there is not enough space in Work.Query.Data. Since this would be the only dataset in the work library, I think I have to make the data set smaller in advance.. or is there any other solution?
If I have to make the dataset smaller, this would be possible but I need your help as well. The large dataset contains quarterly fund holdings of stocks. To make it smaller, I could sum up the shares held by the parent company of the fund in the given quarter.
This means, that I could sum up all shares which are held by funds belonging to the same asset management company, for a given quarter in a given stock. Afterwards, the dataset should be much smaller and I could try to merge again.
Thanks a lot (again)
Maybe a small example to visualize it:
Data set looks like this
Quarter Fund_Number Stock_Identifier ParentCompany(Fund) NumberofShares Year
31MAR2003 001 YYYYY ZZZ 1000 2003
31MAR2003 002 YYYYY ZZZ 1000 2003
31MAR2003 003 YYYYY ZZZ 1000 2003
31MAR2003 004 YYYYY ZZZ 1000 2003
31MAR2003 005 YYYYY ZZZ 1000 2003
Output:
Quarter Stock_Identifier ParentCompany(Fund) NumberofShares Year
31MAR2003 YYYYY ZZZ 5000 2003
I want to summarize the data so that I get the number of shares which is held by the fund's parent company in one single stock in a quarter. Given that there are 10 funds managed by one company Z, each of them holding 100 shares of Apple at the end of March in 2003, I want to have one single row which says that parent company Z holds 1000 shares of Apple at the end of March 2003.
The other rows with fund identifiers etc. would have to be cancelled out in the output data of course, because you cannot sum them up or anything.
Here's a write up on calculating summary statistics for a data set by group.
https://github.com/statgeek/SAS-Tutorials/blob/master/proc_means_basic.sas
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.