Hi,
I am trying to select unique cases using proc SQL but unsuccessful. I appreciate any help for this.
For a given ID and a Collection_Date, I am looking to select unique cases with lowest diff_hours and closeset Ud_Collection_Date. Note: Ud_Collection_Date can be before or after Collection_Date, but it does not matter (I am looking for the closest one either before or after).
Thanks!
DataRx;
informat ID $12. COLLECTION_DATE datetime20. diff_Ud_hours Ud_COLLECTION_DATE datetime20.;
format ID $12. COLLECTION_DATE datetime20. diff_Ud_hours Ud_COLLECTION_DATE datetime20.;
input ID $ COLLECTION_DATE diff_Ud_hours Ud_COLLECTION_DATE;
cards;
ID COLLECTION_DATE diff_Ud_hours Ud_COLLECTION_DATE
100024536387 06JUN2016:16:25:00 7.0000 06JUN2016:09:25:00
100024536431 05JUN2016:17:39:00 0.0000 05JUN2016:17:39:00
100024536472 05JUN2016:03:30:00 0.0000 05JUN2016:03:30:00
100024536472 05JUN2016:03:30:00 11.6667 04JUN2016:15:50:00
100024536755 09OCT2016:08:54:00 -0.0500 09OCT2016:08:57:00
100024536755 11OCT2016:08:49:00 -0.0333 11OCT2016:08:51:00
100024536808 09OCT2016:10:26:00 -0.0333 09OCT2016:10:28:00
110004803082 07JUL2016:12:39:00 -0.1000 07JUL2016:12:45:00
110004803191 24JUN2016:17:35:00 9.9833 24JUN2016:07:36:00
110004803378 29JUN2016:19:08:00 8.3667 29JUN2016:10:46:00
110004803693 06SEP2016:00:13:00 6.9833 05SEP2016:17:14:00
110004803693 06SEP2016:00:13:00 1.8333 05SEP2016:22:23:00
110004803693 06SEP2016:00:13:00 0.7167 05SEP2016:23:30:00
110004804502 20NOV2016:22:08:00 -0.0167 20NOV2016:22:09:00
110004804502 20NOV2016:22:08:00 -9.7333 21NOV2016:07:52:00
110004809859 24DEC2016:07:18:00 -9.2667 24DEC2016:16:34:00
110004810284 17JAN2017:11:42:00 12.5667 16JAN2017:23:08:00
110004810284 18JAN2017:01:07:00 10.9167 17JAN2017:14:12:00
110004810284 18JAN2017:01:07:00 2.6833 17JAN2017:22:26:00
110004810756 26JAN2017:01:00:00 5.5500 25JAN2017:19:27:00
110004810756 26JAN2017:01:00:00 -0.0500 26JAN2017:01:03:00
110004810951 02FEB2017:16:16:00 20.9000 01FEB2017:19:22:00
110004810951 02FEB2017:16:16:00 19.1500 01FEB2017:21:07:00
110004811139 13FEB2017:14:37:00 -0.4000 13FEB2017:15:01:00
110004811242 07JUL2016:04:56:00 0.0000 07JUL2016:04:56:00
110004811819 27NOV2016:05:36:00 0.0000 27NOV2016:05:36:00
110004812239 10FEB2017:15:27:00 18.4333 09FEB2017:21:01:00
110004812246 19JAN2017:22:06:00 6.7667 19JAN2017:15:20:00
110004812426 09SEP2016:18:39:00 15.5333 09SEP2016:03:07:00
110004812426 09SEP2016:18:39:00 0.0000 09SEP2016:18:39:00
110004812430 04SEP2016:18:14:00 8.6167 04SEP2016:09:37:00
110004812444 15SEP2016:00:24:00 16.7667 14SEP2016:07:38:00
110004812703 29DEC2016:11:42:00 0.0000 29DEC2016:11:42:00
110004812755 21JAN2017:16:08:00 0.0000 21JAN2017:16:08:00
110004812788 16JAN2017:01:14:00 16.8167 15JAN2017:08:25:00
110004812797 13JAN2017:16:39:00 0.0000 13JAN2017:16:39:00
110004812833 09JAN2017:16:37:00 -0.0500 09JAN2017:16:40:00
110004812836 11FEB2017:05:32:00 3.0500 11FEB2017:02:29:00
110004812836 11FEB2017:05:32:00 -0.0167 11FEB2017:05:33:00
110004812841 13JAN2017:22:47:00 7.4333 13JAN2017:15:21:00
110004812912 29JAN2017:06:15:00 -0.0500 29JAN2017:06:18:00
110004813122 10AUG2016:17:26:00 -0.0167 10AUG2016:17:27:00
110004813166 27SEP2016:13:20:00 0.0000 27SEP2016:13:20:00
110004813189 28NOV2016:15:57:00 -0.3333 28NOV2016:16:17:00
110004813637 17JUL2016:16:38:00 5.3667 17JUL2016:11:16:00
110004813708 04AUG2016:12:31:00 -0.0500 04AUG2016:12:34:00
110004813708 13AUG2016:18:05:00 21.2167 12AUG2016:20:52:00
110004814501 23NOV2016:13:20:00 0.0000 23NOV2016:13:20:00
110004814570 17NOV2016:11:43:00 -0.1667 17NOV2016:11:53:00
110004814659 23NOV2016:15:17:00 7.1500 23NOV2016:08:08:00
;
Run;
Can you indicate which are your desired output for that given example?
Thanks for the reply ballard!
I have attached two sets of data (excel sheets). The first one shows what exactly I am looking for.
For Reg_Nos without multiple Collection_Dates and multiple Ud_Collection_Dates, there is no issue.
I have highlighted some Reg_nos where GREEN is what I want to keep and RED is what I want to delete, if there are multiple COLLECTION_DATEs per Reg_No.
The second excel sheet (Data1Data2_Ideal), I deleted REd and only kept GREEN if there are multiple COLLECTION_DATEs per Reg_No. This is EXACTLY what I want.
My data set has +10000 observations, I just attached a sample of 50 IDs.
Thanks
Here is the second sheet (Ideal output)
Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
Here is the ideal data.
DataRx;
informat ID $12. COLLECTION_DATE datetime20. diff_Ud_hours Ud_COLLECTION_DATE datetime20.;
format ID $12. COLLECTION_DATE datetime20. diff_Ud_hours Ud_COLLECTION_DATE datetime20.;
input ID $ COLLECTION_DATE diff_Ud_hours Ud_COLLECTION_DATE;
cards;
ID COLLECTION_DATE diff_Ud_hours Ud_COLLECTION_DATE
100024536387 06JUN2016:16:25:00 7 06JUN2016:09:25:00
100024536431 05JUN2016:17:39:00 0 05JUN2016:17:39:00
100024536472 05JUN2016:03:30:00 0 05JUN2016:03:30:00
100024536755 09OCT2016:08:54:00 -0.05 09OCT2016:08:57:00
100024536755 11OCT2016:08:49:00 -0.0333 11OCT2016:08:51:00
100024536808 09OCT2016:10:26:00 -0.0333 09OCT2016:10:28:00
110004803082 07JUL2016:12:39:00 -0.1 07JUL2016:12:45:00
110004803191 24JUN2016:17:35:00 9.9833 24JUN2016:07:36:00
110004803378 29JUN2016:19:08:00 8.3667 29JUN2016:10:46:00
110004803693 06SEP2016:00:13:00 0.7167 05SEP2016:23:30:00
110004804502 20NOV2016:22:08:00 -0.0167 20NOV2016:22:09:00
110004804502 20NOV2016:22:08:00 -9.7333 21NOV2016:07:52:00
110004809859 24DEC2016:07:18:00 -9.2667 24DEC2016:16:34:00
110004810284 17JAN2017:11:42:00 12.5667 16JAN2017:23:08:00
110004810284 18JAN2017:01:07:00 2.6833 17JAN2017:22:26:00
110004810756 26JAN2017:01:00:00 -0.05 26JAN2017:01:03:00
110004810951 02FEB2017:16:16:00 19.15 01FEB2017:21:07:00
110004811139 13FEB2017:14:37:00 -0.4 13FEB2017:15:01:00
110004811242 07JUL2016:04:56:00 0 07JUL2016:04:56:00
110004811819 27NOV2016:05:36:00 0 27NOV2016:05:36:00
110004812239 10FEB2017:15:27:00 18.4333 09FEB2017:21:01:00
110004812246 19JAN2017:22:06:00 6.7667 19JAN2017:15:20:00
110004812426 09SEP2016:18:39:00 0 09SEP2016:18:39:00
110004812430 04SEP2016:18:14:00 8.6167 04SEP2016:09:37:00
110004812444 15SEP2016:00:24:00 16.7667 14SEP2016:07:38:00
110004812703 29DEC2016:11:42:00 0 29DEC2016:11:42:00
110004812755 21JAN2017:16:08:00 0 21JAN2017:16:08:00
110004812788 16JAN2017:01:14:00 16.8167 15JAN2017:08:25:00
110004812797 13JAN2017:16:39:00 0 13JAN2017:16:39:00
110004812833 09JAN2017:16:37:00 -0.05 09JAN2017:16:40:00
110004812836 11FEB2017:05:32:00 -0.0167 11FEB2017:05:33:00
110004812841 13JAN2017:22:47:00 7.4333 13JAN2017:15:21:00
110004812912 29JAN2017:06:15:00 -0.05 29JAN2017:06:18:00
110004813122 10AUG2016:17:26:00 -0.0167 10AUG2016:17:27:00
110004813166 27SEP2016:13:20:00 0 27SEP2016:13:20:00
110004813189 28NOV2016:15:57:00 -0.3333 28NOV2016:16:17:00
110004813637 17JUL2016:16:38:00 5.3667 17JUL2016:11:16:00
110004813708 04AUG2016:12:31:00 -0.05 04AUG2016:12:34:00
110004813708 13AUG2016:18:05:00 21.2167 12AUG2016:20:52:00
110004814501 23NOV2016:13:20:00 0 23NOV2016:13:20:00
110004814570 17NOV2016:11:43:00 -0.1667 17NOV2016:11:53:00
110004814659 23NOV2016:15:17:00 7.15 23NOV2016:08:08:00
110004815282 13AUG2016:21:19:00 2.0667 13AUG2016:19:15:00
110004816378 05OCT2016:12:09:00 0 05OCT2016:12:09:00
110004816495 09NOV2016:05:38:00 -0.0333 09NOV2016:05:40:00
110004816819 09JUL2016:00:47:00 -0.0167 09JUL2016:00:48:00
110004816897 21JUL2016:22:46:33 0.0092 21JUL2016:22:46:00
110004816908 27JUL2016:22:12:00 1.7333 27JUL2016:20:28:00
;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.