BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
svw1900
Obsidian | Level 7

Hi everyone, 

I am completely new to SAS (University Edition, newest version) and actually programming as well, so probably these questions have arised many times before (but I have googled for hours, watched YouTube videos and am still stuck with the basics).

 

Just for the background, I have to analyze a relatively complex data set, split into several tables, for the university.

 

In the first step, I have to merge "Table 1" with "Table 3". There is not a single column in both tables which can be used for the data merge, but I have a linking table, "Table 2", which assigns a specific identification Key from Table 3 to the identification Keys from Table 1 and allows me to merge both tables. I would start with merging Table 2 and Table 1, so that I can merge Table 1 and Table 3 afterwards. 

 

To make it a litte bit more complex, the data in Table 1 is given on a quarterly basis, with the "Date" column being in the format "31MAR2003". The data in Table 3 is given on a yearly basis, with the "Year" column being in the format "2003". I would like to assign, based on the identification key, the yearly datapoints of Table 3 to every quarterly datapoint in Table 1, of course just for the quarters which belong to the respective year (so that the data from Table 1 for the year 2003 is assigend to every quarterly datapoint, within 2003, of Table 1).

 

I hope that this is comprehensible, otherwise just ask me for clarifications.

Thank you in advance!!

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

If you use the GUI tasks to join the data, the code will be generated for you. This way you can play around and test the different options to ensure you get what you need. 

 

1. Add a Query (Tasks>Utilities>Query)

2. In Tables, add both your tables

3. Click Tables again and select Join

4. Choose your join types (explained here: https://www.w3schools.com/sql/sql_join.asp)

5. Click on Columns and select columns from each table you want

6. See code generated at bottom of screen. 

Rinse and repeat for next table. 

 

Fictional example here:

Screen Shot 2018-08-14 at 9.22.35 PM.png


@svw1900 wrote:

Hi everyone, 

I am completely new to SAS (University Edition, newest version) and actually programming as well, so probably these questions have arised many times before (but I have googled for hours, watched YouTube videos and am still stuck with the basics).

 

Just for the background, I have to analyze a relatively complex data set, split into several tables, for the university.

 

In the first step, I have to merge "Table 1" with "Table 3". There is not a single column in both tables which can be used for the data merge, but I have a linking table, "Table 2", which assigns a specific identification Key from Table 3 to the identification Keys from Table 1 and allows me to merge both tables. I would start with merging Table 2 and Table 1, so that I can merge Table 1 and Table 3 afterwards. 

 

To make it a litte bit more complex, the data in Table 1 is given on a quarterly basis, with the "Date" column being in the format "31MAR2003". The data in Table 3 is given on a yearly basis, with the "Year" column being in the format "2003". I would like to assign, based on the identification key, the yearly datapoints of Table 3 to every quarterly datapoint in Table 1, of course just for the quarters which belong to the respective year (so that the data from Table 1 for the year 2003 is assigend to every quarterly datapoint, within 2003, of Table 1).

 

I hope that this is comprehensible, otherwise just ask me for clarifications.

Thank you in advance!!


 

View solution in original post

4 REPLIES 4
Reeza
Super User

If you use the GUI tasks to join the data, the code will be generated for you. This way you can play around and test the different options to ensure you get what you need. 

 

1. Add a Query (Tasks>Utilities>Query)

2. In Tables, add both your tables

3. Click Tables again and select Join

4. Choose your join types (explained here: https://www.w3schools.com/sql/sql_join.asp)

5. Click on Columns and select columns from each table you want

6. See code generated at bottom of screen. 

Rinse and repeat for next table. 

 

Fictional example here:

Screen Shot 2018-08-14 at 9.22.35 PM.png


@svw1900 wrote:

Hi everyone, 

I am completely new to SAS (University Edition, newest version) and actually programming as well, so probably these questions have arised many times before (but I have googled for hours, watched YouTube videos and am still stuck with the basics).

 

Just for the background, I have to analyze a relatively complex data set, split into several tables, for the university.

 

In the first step, I have to merge "Table 1" with "Table 3". There is not a single column in both tables which can be used for the data merge, but I have a linking table, "Table 2", which assigns a specific identification Key from Table 3 to the identification Keys from Table 1 and allows me to merge both tables. I would start with merging Table 2 and Table 1, so that I can merge Table 1 and Table 3 afterwards. 

 

To make it a litte bit more complex, the data in Table 1 is given on a quarterly basis, with the "Date" column being in the format "31MAR2003". The data in Table 3 is given on a yearly basis, with the "Year" column being in the format "2003". I would like to assign, based on the identification key, the yearly datapoints of Table 3 to every quarterly datapoint in Table 1, of course just for the quarters which belong to the respective year (so that the data from Table 1 for the year 2003 is assigend to every quarterly datapoint, within 2003, of Table 1).

 

I hope that this is comprehensible, otherwise just ask me for clarifications.

Thank you in advance!!


 

svw1900
Obsidian | Level 7

Thank you! I think this might help a lot. But there is still a problem: When merging the tables SAS says, that there is not enough space in Work.Query.Data. Since this would be the only dataset in the work library, I think I have to make the data set smaller in advance.. or is there any other solution?

 

If I have to make the dataset smaller, this would be possible but I need your help as well. The large dataset contains quarterly fund holdings of stocks. To make it smaller, I could sum up the shares held by the parent company of the fund in the given quarter. 

This means, that I could sum up all shares which are held by funds belonging to the same asset management company, for a given quarter in a given stock. Afterwards, the dataset should be much smaller and I could try to merge again. 

 

Thanks a lot (again)

svw1900
Obsidian | Level 7

Maybe a small example to visualize it:

 

Data set looks like this

 

    Quarter       Fund_Number     Stock_Identifier    ParentCompany(Fund)    NumberofShares    Year

31MAR2003          001                    YYYYY                      ZZZ                                  1000            2003

31MAR2003          002                    YYYYY                      ZZZ                                  1000            2003

31MAR2003          003                    YYYYY                      ZZZ                                  1000            2003

31MAR2003          004                    YYYYY                      ZZZ                                  1000            2003

31MAR2003          005                    YYYYY                      ZZZ                                  1000            2003

 

Output:

 

Quarter            Stock_Identifier    ParentCompany(Fund)    NumberofShares     Year

31MAR2003           YYYYY                      ZZZ                                  5000            2003

 

I want to summarize the data so that I get the number of shares which is held by the fund's parent company in one single stock in a quarter. Given that there are 10 funds managed by one company Z, each of them holding 100 shares of Apple at the end of March in 2003, I want to have one single row which says that parent company Z holds 1000 shares of Apple at the end of March 2003. 

The other rows with fund identifiers etc. would have to be cancelled out in the output data of course, because you cannot sum them up or anything. 

 

 

Reeza
Super User

Here's a write up on calculating summary statistics for a data set by group.

 

https://github.com/statgeek/SAS-Tutorials/blob/master/proc_means_basic.sas

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1041 views
  • 2 likes
  • 2 in conversation