BookmarkSubscribeRSS Feed
sms1891
Quartz | Level 8

Hi,

I am trying to select unique cases using proc SQL but unsuccessful. I appreciate any help for this.

 

For a given ID and a Collection_Date, I am looking to select unique cases with lowest diff_hours and closeset Ud_Collection_Date. Note: Ud_Collection_Date can be before or after Collection_Date, but it does not matter (I am looking for the closest one either before or after).

 

Thanks!

DataRx;
	informat ID $12. COLLECTION_DATE datetime20. diff_Ud_hours Ud_COLLECTION_DATE datetime20.;
	format ID $12. COLLECTION_DATE datetime20. diff_Ud_hours Ud_COLLECTION_DATE datetime20.;
	
	input ID   $   COLLECTION_DATE diff_Ud_hours Ud_COLLECTION_DATE;
	cards;
ID	COLLECTION_DATE	diff_Ud_hours	Ud_COLLECTION_DATE
100024536387	06JUN2016:16:25:00	7.0000	06JUN2016:09:25:00
100024536431	05JUN2016:17:39:00	0.0000	05JUN2016:17:39:00
100024536472	05JUN2016:03:30:00	0.0000	05JUN2016:03:30:00
100024536472	05JUN2016:03:30:00	11.6667	04JUN2016:15:50:00
100024536755	09OCT2016:08:54:00	-0.0500	09OCT2016:08:57:00
100024536755	11OCT2016:08:49:00	-0.0333	11OCT2016:08:51:00
100024536808	09OCT2016:10:26:00	-0.0333	09OCT2016:10:28:00
110004803082	07JUL2016:12:39:00	-0.1000	07JUL2016:12:45:00
110004803191	24JUN2016:17:35:00	9.9833	24JUN2016:07:36:00
110004803378	29JUN2016:19:08:00	8.3667	29JUN2016:10:46:00
110004803693	06SEP2016:00:13:00	6.9833	05SEP2016:17:14:00
110004803693	06SEP2016:00:13:00	1.8333	05SEP2016:22:23:00
110004803693	06SEP2016:00:13:00	0.7167	05SEP2016:23:30:00
110004804502	20NOV2016:22:08:00	-0.0167	20NOV2016:22:09:00
110004804502	20NOV2016:22:08:00	-9.7333	21NOV2016:07:52:00
110004809859	24DEC2016:07:18:00	-9.2667	24DEC2016:16:34:00
110004810284	17JAN2017:11:42:00	12.5667	16JAN2017:23:08:00
110004810284	18JAN2017:01:07:00	10.9167	17JAN2017:14:12:00
110004810284	18JAN2017:01:07:00	2.6833	17JAN2017:22:26:00
110004810756	26JAN2017:01:00:00	5.5500	25JAN2017:19:27:00
110004810756	26JAN2017:01:00:00	-0.0500	26JAN2017:01:03:00
110004810951	02FEB2017:16:16:00	20.9000	01FEB2017:19:22:00
110004810951	02FEB2017:16:16:00	19.1500	01FEB2017:21:07:00
110004811139	13FEB2017:14:37:00	-0.4000	13FEB2017:15:01:00
110004811242	07JUL2016:04:56:00	0.0000	07JUL2016:04:56:00
110004811819	27NOV2016:05:36:00	0.0000	27NOV2016:05:36:00
110004812239	10FEB2017:15:27:00	18.4333	09FEB2017:21:01:00
110004812246	19JAN2017:22:06:00	6.7667	19JAN2017:15:20:00
110004812426	09SEP2016:18:39:00	15.5333	09SEP2016:03:07:00
110004812426	09SEP2016:18:39:00	0.0000	09SEP2016:18:39:00
110004812430	04SEP2016:18:14:00	8.6167	04SEP2016:09:37:00
110004812444	15SEP2016:00:24:00	16.7667	14SEP2016:07:38:00
110004812703	29DEC2016:11:42:00	0.0000	29DEC2016:11:42:00
110004812755	21JAN2017:16:08:00	0.0000	21JAN2017:16:08:00
110004812788	16JAN2017:01:14:00	16.8167	15JAN2017:08:25:00
110004812797	13JAN2017:16:39:00	0.0000	13JAN2017:16:39:00
110004812833	09JAN2017:16:37:00	-0.0500	09JAN2017:16:40:00
110004812836	11FEB2017:05:32:00	3.0500	11FEB2017:02:29:00
110004812836	11FEB2017:05:32:00	-0.0167	11FEB2017:05:33:00
110004812841	13JAN2017:22:47:00	7.4333	13JAN2017:15:21:00
110004812912	29JAN2017:06:15:00	-0.0500	29JAN2017:06:18:00
110004813122	10AUG2016:17:26:00	-0.0167	10AUG2016:17:27:00
110004813166	27SEP2016:13:20:00	0.0000	27SEP2016:13:20:00
110004813189	28NOV2016:15:57:00	-0.3333	28NOV2016:16:17:00
110004813637	17JUL2016:16:38:00	5.3667	17JUL2016:11:16:00
110004813708	04AUG2016:12:31:00	-0.0500	04AUG2016:12:34:00
110004813708	13AUG2016:18:05:00	21.2167	12AUG2016:20:52:00
110004814501	23NOV2016:13:20:00	0.0000	23NOV2016:13:20:00
110004814570	17NOV2016:11:43:00	-0.1667	17NOV2016:11:53:00
110004814659	23NOV2016:15:17:00	7.1500	23NOV2016:08:08:00

;
Run;


5 REPLIES 5
ballardw
Super User

Can you indicate which are your desired output for that given example?

sms1891
Quartz | Level 8

Thanks for the reply ballard!

 

I have attached two sets of data (excel sheets). The first one shows what exactly I am looking for.

For Reg_Nos without multiple Collection_Dates and multiple Ud_Collection_Dates, there is no issue.

 

I have highlighted some Reg_nos where GREEN is what I want to keep and RED is what I want to delete, if there are multiple COLLECTION_DATEs per Reg_No.

 

The second excel sheet (Data1Data2_Ideal), I deleted REd and only kept GREEN if there are multiple COLLECTION_DATEs per Reg_No. This is EXACTLY what I want.

 

My data set has +10000 observations, I just attached a sample of 50 IDs.

 

Thanks

 

sms1891
Quartz | Level 8

Here is the second sheet (Ideal output)

 

 

ballardw
Super User

Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.

 

 

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

sms1891
Quartz | Level 8

Here is the ideal data.

DataRx;
	informat ID $12. COLLECTION_DATE datetime20. diff_Ud_hours Ud_COLLECTION_DATE datetime20.;
	format ID $12. COLLECTION_DATE datetime20. diff_Ud_hours Ud_COLLECTION_DATE datetime20.;
	
	input ID   $   COLLECTION_DATE diff_Ud_hours Ud_COLLECTION_DATE;
	cards;
ID	COLLECTION_DATE	diff_Ud_hours	Ud_COLLECTION_DATE
100024536387	06JUN2016:16:25:00	7	06JUN2016:09:25:00
100024536431	05JUN2016:17:39:00	0	05JUN2016:17:39:00
100024536472	05JUN2016:03:30:00	0	05JUN2016:03:30:00
100024536755	09OCT2016:08:54:00	-0.05	09OCT2016:08:57:00
100024536755	11OCT2016:08:49:00	-0.0333	11OCT2016:08:51:00
100024536808	09OCT2016:10:26:00	-0.0333	09OCT2016:10:28:00
110004803082	07JUL2016:12:39:00	-0.1	07JUL2016:12:45:00
110004803191	24JUN2016:17:35:00	9.9833	24JUN2016:07:36:00
110004803378	29JUN2016:19:08:00	8.3667	29JUN2016:10:46:00
110004803693	06SEP2016:00:13:00	0.7167	05SEP2016:23:30:00
110004804502	20NOV2016:22:08:00	-0.0167	20NOV2016:22:09:00
110004804502	20NOV2016:22:08:00	-9.7333	21NOV2016:07:52:00
110004809859	24DEC2016:07:18:00	-9.2667	24DEC2016:16:34:00
110004810284	17JAN2017:11:42:00	12.5667	16JAN2017:23:08:00
110004810284	18JAN2017:01:07:00	2.6833	17JAN2017:22:26:00
110004810756	26JAN2017:01:00:00	-0.05	26JAN2017:01:03:00
110004810951	02FEB2017:16:16:00	19.15	01FEB2017:21:07:00
110004811139	13FEB2017:14:37:00	-0.4	13FEB2017:15:01:00
110004811242	07JUL2016:04:56:00	0	07JUL2016:04:56:00
110004811819	27NOV2016:05:36:00	0	27NOV2016:05:36:00
110004812239	10FEB2017:15:27:00	18.4333	09FEB2017:21:01:00
110004812246	19JAN2017:22:06:00	6.7667	19JAN2017:15:20:00
110004812426	09SEP2016:18:39:00	0	09SEP2016:18:39:00
110004812430	04SEP2016:18:14:00	8.6167	04SEP2016:09:37:00
110004812444	15SEP2016:00:24:00	16.7667	14SEP2016:07:38:00
110004812703	29DEC2016:11:42:00	0	29DEC2016:11:42:00
110004812755	21JAN2017:16:08:00	0	21JAN2017:16:08:00
110004812788	16JAN2017:01:14:00	16.8167	15JAN2017:08:25:00
110004812797	13JAN2017:16:39:00	0	13JAN2017:16:39:00
110004812833	09JAN2017:16:37:00	-0.05	09JAN2017:16:40:00
110004812836	11FEB2017:05:32:00	-0.0167	11FEB2017:05:33:00
110004812841	13JAN2017:22:47:00	7.4333	13JAN2017:15:21:00
110004812912	29JAN2017:06:15:00	-0.05	29JAN2017:06:18:00
110004813122	10AUG2016:17:26:00	-0.0167	10AUG2016:17:27:00
110004813166	27SEP2016:13:20:00	0	27SEP2016:13:20:00
110004813189	28NOV2016:15:57:00	-0.3333	28NOV2016:16:17:00
110004813637	17JUL2016:16:38:00	5.3667	17JUL2016:11:16:00
110004813708	04AUG2016:12:31:00	-0.05	04AUG2016:12:34:00
110004813708	13AUG2016:18:05:00	21.2167	12AUG2016:20:52:00
110004814501	23NOV2016:13:20:00	0	23NOV2016:13:20:00
110004814570	17NOV2016:11:43:00	-0.1667	17NOV2016:11:53:00
110004814659	23NOV2016:15:17:00	7.15	23NOV2016:08:08:00
110004815282	13AUG2016:21:19:00	2.0667	13AUG2016:19:15:00
110004816378	05OCT2016:12:09:00	0	05OCT2016:12:09:00
110004816495	09NOV2016:05:38:00	-0.0333	09NOV2016:05:40:00
110004816819	09JUL2016:00:47:00	-0.0167	09JUL2016:00:48:00
110004816897	21JUL2016:22:46:33	0.0092	21JUL2016:22:46:00
110004816908	27JUL2016:22:12:00	1.7333	27JUL2016:20:28:00

;
run;

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 949 views
  • 0 likes
  • 2 in conversation