i am new to the proc sql in sas. my dataset like this:
Obs
id
area
year
5818
1
NEW YORK
2020
5819
1
NEW YORK
2021
5820
1
NEWARK
2020
5821
1
NEWARK
2021
i try to group them by id, year and area and i used the following code:
i think i supposed to get only two records but instead, i still got the 4 records. i can't figure out what's wrong with this code. proc sql; create table data.clientematga2 as select distinct a.id, a.area, a.year from data.clientematga a group by euci42, ematga , a.year;
Good evening! The "select distinct" refers to uniqueness in all columns given as an argument, in your case id, year, area ... Now as I see four different (distinct) combinations in the sample provided by you, why do you expect 2 rows? --fja
What is contained in these variables? Why do you use GROUP BY without a summary function (SAS SQL will automatically convert this to ORDER BY)? Why do you use variables in the GROUP BY which are not contained in your SELECT?
GROUP BY is used to calculate summary statistics for the groups, not for simple re-ordering.
Please post usable example data in a data step with datalines, and the expected result.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.