Hello all,
I have a dataset with subject id, date/time of test and test name. I want to create a new dataset containing any subject id that has the same test performed more than once at exactly the same date/time. My variables are all character variables and the values are in this format:
subject id-ABC-001, ABC-002, ABC-003...etc
date/time of test- 2024-01-06T10:30
test name- AA, BB, CC
What is the easiest way to create this dataset?
Since you show a variable with date and time information, are you sure that your selection criterion is exactly the same date and time? What if the times differ by 5 minutes? Should that be counted as a duplicate?
What will you do with the information?
Something like this may be the easiest:
proc sort data=have out=_null_ dupout=theduplicates nodupkey;
   by subj test time;
run;
None of your descriptions provide a valid SAS variable name with default settings so I used different ones. You would use your actual variables.
The out=_null_ means that the source data set will not be replaced. If you want a set without the duplicates put a different data set name there.
The duplicates will be in a set named Theduplicates.
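If you want every row of a repeated subject/test/time combination, including the first occurrence, a variation on the sort above may help. With NODUPKEY the first row of each BY group goes to OUT= and only the later repeats land in DUPOUT=. The sketch below uses the NOUNIQUEKEY option instead, which as far as I know needs SAS 9.3 or later. The sample rows and the data set names alldups and singles are made up for illustration; use your own variables.

```sas
/* Hypothetical sample data: all three variables are character */
data have;
   input subj :$7. test :$2. time :$16.;
datalines;
ABC-001 AA 2024-01-06T10:30
ABC-001 AA 2024-01-06T10:30
ABC-001 BB 2024-01-06T10:30
ABC-002 AA 2024-01-06T10:35
ABC-003 CC 2024-01-07T09:00
ABC-003 CC 2024-01-07T09:00
;

/* NOUNIQUEKEY keeps only BY groups that occur more than once in OUT=. */
/* UNIQUEOUT= receives the rows whose key occurs exactly once.         */
proc sort data=have nouniquekey out=alldups uniqueout=singles;
   by subj test time;
run;
```

With this sample, alldups should hold the four repeated rows and singles the two rows that occur only once. The source data set have is left unchanged because OUT= names a different data set.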
@billi_billi wrote:
Thank you this worked. But before this I did separate out date and time.
You will find that working with date, time and/or datetime values will generally be easier if you read them using a proper informat to get them as a SAS date, time or datetime numeric value instead of using character values.
Your particular datetime is in an E8601dt structure and can be read from text as:
data example;
   input dt :e8601dt16.;
   format dt e8601dt.;
   date = datepart(dt);
   time = timepart(dt);
   format date date9. time time.;
datalines;
2024-01-06T10:30
;
SAS provides a lot of tools to deal with values once they are date, time or datetime values. The example above uses the functions Datepart, which extracts just the date portion, and Timepart, which extracts the time portion of the value.
https://communities.sas.com/t5/SAS-Communities-Library/Working-with-Dates-and-Times-in-SAS-Tutorial/... has a PDF with much information about dates and such.
One of the very slick features of dates, as an example, is that just changing the format of a variable in a summary or analysis procedure groups the analysis by the formatted values. So without changing the data you can have an analysis by calendar day, week, month, quarter or year just by choosing a different format. Since you can also use custom formats you could have more complex summaries, such as by month for the current year and by year for previous years.
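As a rough illustration of the format trick, the sketch below counts the same dates by month and then by quarter without touching the data. The data set visits, the variable testdate and the rows are invented for the example.

```sas
/* Hypothetical sample data with a true SAS date value */
data visits;
   input subj :$7. testdate :yymmdd10.;
   format testdate date9.;
datalines;
ABC-001 2024-01-06
ABC-002 2024-01-20
ABC-003 2024-02-11
ABC-001 2024-04-02
;

/* Counts per calendar month: PROC FREQ groups on the formatted value */
proc freq data=visits;
   tables testdate;
   format testdate monyy7.;
run;

/* Same data, counts per quarter, only the format changes */
proc freq data=visits;
   tables testdate;
   format testdate yyq6.;
run;
```

The stored values stay daily dates in both runs; the FORMAT statement inside each procedure only changes how they are grouped and displayed.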
Plus the functions INTNX and INTCK let you create new date/time/datetimes with an offset from one date (INTNX) or determine the number of intervals between two values (INTCK).
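A small sketch of both functions, using made-up dates and datetimes. The last line also shows one way to measure how far apart two test times are, if "within 5 minutes" ever needs to count as a duplicate.

```sas
data _null_;
   start = '06JAN2024'd;

   /* INTNX: move forward one month; default alignment is the start of the interval */
   next_month_start = intnx('month', start, 1);            /* 01FEB2024 */

   /* 'same' keeps the same day of the month */
   same_day_next_month = intnx('month', start, 1, 'same'); /* 06FEB2024 */

   /* INTCK: number of day boundaries between two dates */
   days_between = intck('day', start, '20JAN2024'd);       /* 14 */

   /* INTCK also works on datetime values with time intervals */
   minutes_between = intck('minute', '06JAN2024:10:30'dt, '06JAN2024:10:35'dt); /* 5 */

   put next_month_start= date9. same_day_next_month= date9.
       days_between= minutes_between=;
run;
```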
