BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Konkordanz
Pyrite | Level 9

Hi forum,

 

I have a dataset with >400 cities. Each city has population-values for all quarters of 2 years (Q1 2019, Q2 2019, ..., Q4 2020). I want to create a new variable (pop_q4_2020) with the values of Q4 2020 of the respecitve city. Thats what I have:

 

 

data have;
input city $ year quarter population;
datalines;
CityA 2019 3 20029
CityB 2019 2 34900
CityA 2020 2 19853
CityB 2020 2 34500
CityA 2019 2 20029
CityA 2020 4 19853
CityB 2020 4 34300
CityA 2019 4 20029
CityB 2019 3 34800
CityB 2020 1 34600
CityA 2019 1 20135
CityA 2020 3 19853 CityB 2020 3 34400 CityB 2019 4 34700 CityA 2020 1 20029 CityB 2019 1 35000 ; run;

 

Thats what I want:

 

Region year quarter pop_q4_2020
CityA 2019 3 19853
CityB 2019 2 34300
CityA 2020 2 19853
CityB 2020 2 34300
CityA 2019 2 19853
CityA 2020 4 19853
CityB 2020 4 34300
CityA 2019 4 19853
CityB 2019 3 34300
CityB 2020 1 34300
CityA 2019 1 19853
CityA 2020 3 19853
CityB 2020 3 34300
CityB 2019 4 34300
CityA 2020 1 19853
CityB 2019 1 34300

 

I tried it with this steps: create a new var and copy the value of population-var if year=2020 and quarter=4. But I dont know how to copy the values into the cells with the same city-names.

 

Thank you for help!

 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26
proc sql;
    create table want as select a.city,a.year,a.quarter,b.population as pop_q4_2020
        from have as a left join (select * from have where year=2020 and quarter=4) as b
        on a.city=b.city;
quit;
--
Paige Miller

View solution in original post

6 REPLIES 6
PaigeMiller
Diamond | Level 26
proc sql;
    create table want as select a.city,a.year,a.quarter,b.population as pop_q4_2020
        from have as a left join (select * from have where year=2020 and quarter=4) as b
        on a.city=b.city;
quit;
--
Paige Miller
Konkordanz
Pyrite | Level 9

Great, thank you! Do you have a further minute for writing a short description what your sql-commands exactly do? What for example means the "a.*" (a.city, a.quarter,...)? I want to learn it and need to get some hinks...

PaigeMiller
Diamond | Level 26

The PROC SQL code takes variables city, year and quarter from the data set identified as A (which is data set HAVE) and variable population (renamed to pop_q4_2020) from data set identified as B, and joins them where a.city=b.city (a left join is used, meaning the records of the data set on the left, which is A, are selected to be in the output table). Data set B is created by selecting all variables from HAVE, and all records where the year is 2020 and the quarter is 4.

--
Paige Miller
Konkordanz
Pyrite | Level 9

Thx for explaination. So, can it interprated as follows? And what does the open points mean?

What I doesnt understand: The code shall combine table a and table b...but isnt the column "city" only a content of the table a?

 

 

create table want as select a.city,a.year,a.quarter /*these var comes into the temp. table "a"*/,
b.population as pop_q4_2020  /*this var comes into the temp table "b" and get a new name*/
from have /*the source-table of the variables */
as a left join (select * from have where year=2020 and quarter=4) /*all data of the table "a" with the filter year=2020 and quarter=4 shall get involved*/

as b /* ????*/ on a.city=b.city; /*????*/ quit;

 

 

 

PaigeMiller
Diamond | Level 26

@Konkordanz wrote:

Thx for explaination. So, can it interprated as follows? And what does the open points mean?

What I doesnt understand: The code shall combine table a and table b...but isnt the column "city" only a content of the table a?

 


B has all columns from HAVE, indicated by SELECT * FROM HAVE

 

(select * from have where year=2020 and quarter=4) 
/*all data of the table "a" with the filter year=2020 and quarter=4 shall get involved*/
as b /* ????*/

The above comment in your code is incorrect. It is all the columns from data set HAVE, not table A

--
Paige Miller
Kurt_Bremser
Super User

"a" and "b" are aliases for the table they are associated with. This allows you (among other things, like reducing the amount of typing) to use the same dataset several times in a query, and treat it like different datasets in the code.

When you use point-and-click in the Enterprise Guide Query Builder, it will assign t1, t2, ..., tN as aliases.

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1098 views
  • 3 likes
  • 3 in conversation