Finding Changes in Data

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 92
Accepted Solution

Finding Changes in Data

I have a program that provides output as follows:

 

Year     Month     depart     dest     Count

2017             8     CYVR     CYYZ     578

2017             8     CYOW    CYYZ     578

 

These are airport departure and destination codes. I can select which month/year combo I want and all is good.

 

Here is the question, how can I compare two months (say August 2016 and August 2016) and have ONLY CHANGED CITY PAIRS come out.

 

I'm thinking I will need to concatenate depart and dest to, say, departdest (e.g., CYVRCYYZ from above) and use that variable to compare. Essentially I am only interesting in saying:

 

"When comparing August 2017 to August 2016 the following city pairs were added. When comparing August 2017 to August 2016 the following city pairs were removed."

 

Any initial ideas would be great

 

thx


Accepted Solutions
Solution
‎10-20-2017 08:22 AM
Super User
Posts: 23,343

Re: Finding Changes in Data

delete_sample_data.JPG

 

 

Here's a solution that doesn't scale well to multiple data sets IMO. You need to manually code it and customize for each month.

 

delete_scale1.JPG

 

And an SQL option that does scale, using EXCEPT

 

delete_scale2.JPG

View solution in original post


All Replies
Super User
Super User
Posts: 9,441

Re: Finding Changes in Data

You may be able to get away with:

proc sql;
  create table WANT as
  select  *
  from    (select distinct YEAR,MONTH,DEPART,DEST from HAVE)
  where   count(*) > 1;
quit;

Not not tested, post test data in the form of a datastep.

Super User
Posts: 23,343

Re: Finding Changes in Data

PROC SQL with INTERSECT for the same and EXCEPT for the differences. Or PROC COMPARE but I don't find that as easy to use. If you want more help, please post sample data and expected output.

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...
Frequent Contributor
Posts: 92

Re: Finding Changes in Data

Here is more info

 

I have two tables:

 

AUGUST_2017

AUGUST_2016

 

Each table contains airport city pair data (I can post data if needed). Data looks like:

 

AUGUST_2017:

 

Year   Month   depart   dest   Count
2017   8          CYVR   CYYZ      578
2017   8          CYYZ    CYVR     577
2017   8          CYOW  CYYZ      530
2017   8          CYYZ    CYOW    529

 

 

AUGUST_2016:

 

Year   Month   depart   dest   Count
2016   8          CYVR   CYYZ      578
2016   8          CYYZ    CYVR     577
2016   8          CYOW  CYYZ      530
2016   8          CYYZ    CYOW    529

 

 

and so on for thousands of flights.

 

 

I need to be able to query both existing tables.

 

The first query would be to check what is in AUGUST_2017 that is not in AUGUST_2016. These would be added flights.

 

The second query would be to check what is in AUGUST_2016 that is not in AUGUST_2017. These would be flights removed by the airline.

 

I don't think my original post was clear enough. Sorry.

 

 

Super User
Posts: 23,343

Re: Finding Changes in Data

Did EXCEPT not work? If not, post your code and log.
Frequent Contributor
Posts: 92

Re: Finding Changes in Data

[ Edited ]

I am not sure of the code..can you post the sql

 

PROC SQL;

CREATE TABLE EGTASK.TEST2 AS

SELECT t1.depart,

                 t1.dest

intersect EGTASK.AUGUST_2017 t1, EGTASK.AUGUST_2016 t2;

QUIT;

Super User
Posts: 23,343

Re: Finding Changes in Data

1. Your sample data all are the same so there are no differences to be found

2. You didn't put them as a data step so I'd have to write code to import your data first

3. To demonstrate an actual example I'd have to add new data that would show the differences

4. The code below is INTERSECT which would show the same in both data sets. EXCEPT will show the exceptions.

 

So even if the code worked, there's no way to test it at present with your sample data.

 

Here's a reference for the SQL operators with examples

 

http://www2.sas.com/proceedings/sugi31/242-31.pdf

 

Solution
‎10-20-2017 08:22 AM
Super User
Posts: 23,343

Re: Finding Changes in Data

delete_sample_data.JPG

 

 

Here's a solution that doesn't scale well to multiple data sets IMO. You need to manually code it and customize for each month.

 

delete_scale1.JPG

 

And an SQL option that does scale, using EXCEPT

 

delete_scale2.JPG

Frequent Contributor
Posts: 92

Re: Finding Changes in Data

@Reeza

 

Thank you!!!

 

I have tested and your solution is perfect. I ended up with:

 

/* The code below will get all flights that were removed from the previous year */
PROC SQL;
	CREATE TABLE EGTASK.ACM_FLT_REMOVED AS
		SELECT depart, dest
			FROM EGTASK.CITY_PR_TOTAL_PYEAR
				EXCEPT
			SELECT depart, dest from EGTASK.CITY_PR_TOTAL_CYEAR;
QUIT;

/* The code below will get all flights that were added in the current year */
PROC SQL;
	CREATE TABLE EGTASK.ACM_FLT_ADDED AS
		SELECT depart, dest
			FROM EGTASK.CITY_PR_TOTAL_CYEAR
				EXCEPT
			SELECT depart, dest from EGTASK.CITY_PR_TOTAL_PYEAR;
QUIT;

 

As you can see I have generalized it to current year (CYEAR) and previous year (PYEAR).

 

Thanks!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 186 views
  • 2 likes
  • 3 in conversation