BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Konkordanz
Pyrite | Level 9

Hi forum,

 

I have a dataset with >400 cities. Each city has population-values for all quarters of 2 years (Q1 2019, Q2 2019, ..., Q4 2020). I want to create a new variable (pop_q4_2020) with the values of Q4 2020 of the respecitve city. Thats what I have:

 

 

data have;
input city $ year quarter population;
datalines;
CityA 2019 3 20029
CityB 2019 2 34900
CityA 2020 2 19853
CityB 2020 2 34500
CityA 2019 2 20029
CityA 2020 4 19853
CityB 2020 4 34300
CityA 2019 4 20029
CityB 2019 3 34800
CityB 2020 1 34600
CityA 2019 1 20135
CityA 2020 3 19853 CityB 2020 3 34400 CityB 2019 4 34700 CityA 2020 1 20029 CityB 2019 1 35000 ; run;

 

Thats what I want:

 

Region year quarter pop_q4_2020
CityA 2019 3 19853
CityB 2019 2 34300
CityA 2020 2 19853
CityB 2020 2 34300
CityA 2019 2 19853
CityA 2020 4 19853
CityB 2020 4 34300
CityA 2019 4 19853
CityB 2019 3 34300
CityB 2020 1 34300
CityA 2019 1 19853
CityA 2020 3 19853
CityB 2020 3 34300
CityB 2019 4 34300
CityA 2020 1 19853
CityB 2019 1 34300

 

I tried it with this steps: create a new var and copy the value of population-var if year=2020 and quarter=4. But I dont know how to copy the values into the cells with the same city-names.

 

Thank you for help!

 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26
proc sql;
    create table want as select a.city,a.year,a.quarter,b.population as pop_q4_2020
        from have as a left join (select * from have where year=2020 and quarter=4) as b
        on a.city=b.city;
quit;
--
Paige Miller

View solution in original post

6 REPLIES 6
PaigeMiller
Diamond | Level 26
proc sql;
    create table want as select a.city,a.year,a.quarter,b.population as pop_q4_2020
        from have as a left join (select * from have where year=2020 and quarter=4) as b
        on a.city=b.city;
quit;
--
Paige Miller
Konkordanz
Pyrite | Level 9

Great, thank you! Do you have a further minute for writing a short description what your sql-commands exactly do? What for example means the "a.*" (a.city, a.quarter,...)? I want to learn it and need to get some hinks...

PaigeMiller
Diamond | Level 26

The PROC SQL code takes variables city, year and quarter from the data set identified as A (which is data set HAVE) and variable population (renamed to pop_q4_2020) from data set identified as B, and joins them where a.city=b.city (a left join is used, meaning the records of the data set on the left, which is A, are selected to be in the output table). Data set B is created by selecting all variables from HAVE, and all records where the year is 2020 and the quarter is 4.

--
Paige Miller
Konkordanz
Pyrite | Level 9

Thx for explaination. So, can it interprated as follows? And what does the open points mean?

What I doesnt understand: The code shall combine table a and table b...but isnt the column "city" only a content of the table a?

 

 

create table want as select a.city,a.year,a.quarter /*these var comes into the temp. table "a"*/,
b.population as pop_q4_2020  /*this var comes into the temp table "b" and get a new name*/
from have /*the source-table of the variables */
as a left join (select * from have where year=2020 and quarter=4) /*all data of the table "a" with the filter year=2020 and quarter=4 shall get involved*/

as b /* ????*/ on a.city=b.city; /*????*/ quit;

 

 

 

PaigeMiller
Diamond | Level 26

@Konkordanz wrote:

Thx for explaination. So, can it interprated as follows? And what does the open points mean?

What I doesnt understand: The code shall combine table a and table b...but isnt the column "city" only a content of the table a?

 


B has all columns from HAVE, indicated by SELECT * FROM HAVE

 

(select * from have where year=2020 and quarter=4) 
/*all data of the table "a" with the filter year=2020 and quarter=4 shall get involved*/
as b /* ????*/

The above comment in your code is incorrect. It is all the columns from data set HAVE, not table A

--
Paige Miller
Kurt_Bremser
Super User

"a" and "b" are aliases for the table they are associated with. This allows you (among other things, like reducing the amount of typing) to use the same dataset several times in a query, and treat it like different datasets in the code.

When you use point-and-click in the Enterprise Guide Query Builder, it will assign t1, t2, ..., tN as aliases.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 460 views
  • 3 likes
  • 3 in conversation