About SASPreK

SASPreK · 10-04-2024

Hi! I have the following data:

    data have;
      input id $ dt test $;
      datalines;
    ABCDE 20100429 T1
    ABCDE 20100429 T2
    ABCDE 20090908 T1
    ABCDE 20090908 T2
    ABCDE 20210823 T1
    ABCUK 20191008 T1
    ABCUK 20191008 T2
    ABCUK 20230723 T1
    ;

In this dataset, based on id, date, and test, there are six records that form three pairs and two records that are singles. I want to create a new variable, doc_id, in which each unique pair gets a new ID and the field is left blank for singles. See the expected output below.

    id     dt        test  doc_id
    ABCDE  20100429  T1    ABCDE-1
    ABCDE  20100429  T2    ABCDE-1
    ABCDE  20090908  T1    ABCDE-2
    ABCDE  20090908  T2    ABCDE-2
    ABCDE  20210823  T1
    ABCUK  20191008  T1    ABCUK-1
    ABCUK  20191008  T2    ABCUK-1
    ABCUK  20230723  T1

Thank you for your inputs!
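One possible way to produce the doc_id described above is a double DoW loop. This is only a sketch under stated assumptions: a "pair" is taken to be any id/dt group with more than one row, the rows of a group are adjacent in the input, and the names want, n, and pair_no are made up for illustration. The NOTSORTED option keeps the groups in their original order, which is what makes 20100429 come out as ABCDE-1 ahead of 20090908.

```sas
/* Sample data from the question; id and test are read as character. */
data have;
  input id $ dt test $;
  datalines;
ABCDE 20100429 T1
ABCDE 20100429 T2
ABCDE 20090908 T1
ABCDE 20090908 T2
ABCDE 20210823 T1
ABCUK 20191008 T1
ABCUK 20191008 T2
ABCUK 20230723 T1
;

data want;
  /* Pass 1: count the rows in the current id/dt group (n ends as the count). */
  do n = 1 by 1 until (last.dt);
    set have;
    by id dt notsorted;              /* assumption: groups are adjacent, not sorted */
    if first.id then pair_no = 0;    /* restart numbering for each id */
  end;

  /* Only groups with more than one row consume a pair number. */
  if n > 1 then pair_no + 1;

  length doc_id $ 12;

  /* Pass 2: re-read the same rows and write them out with doc_id. */
  do _n_ = 1 to n;
    set have;
    if n > 1 then doc_id = catx('-', id, pair_no);
    else doc_id = ' ';               /* singles stay blank */
    output;
  end;

  drop n pair_no;
run;
```

If a group could contain three or more rows, the n > 1 test would still label it, so a stricter n = 2 check may be preferable depending on what counts as a pair.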

SASPreK · 06-12-2024

Thanks, @ballardw @Tom @Kurt_Bremser @Sajid01, for all your very helpful comments. With the help of your comments, and after carefully looking at the error messages and variable attributes, I realized my data was in this format:

    data have;
      input dob $ sample_dt $ ;
      format dob sample_dt 8.;
      informat dob sample_dt 8.;
      datalines;
    19650214 20100429
    19800724 20210823
    19991208 20090908
    ;

My date variables were not in date format, as @ballardw pointed out, but they were not being converted to date variables by any yymmdd8. format or informat because they had their own format and informat of 8.

So I checked the SAS community and used the solution provided by @ballardw in this thread:
https://communities.sas.com/t5/SAS-Enterprise-Guide/Convert-Character-to-Date/td-p/742951?lightbox-message-images-743004=59673iAFF420B5BA5A58D5

However, my actual dataset was not completely clean and was still giving me errors at specific rows and columns because some date values contained special characters or were missing. After following the two steps of converting to dates and cleaning my dataset, I was finally able to get the output I wanted. Thank you all once again 🙂
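The two steps mentioned above (cleaning, then converting) might look roughly like the sketch below. It is not the exact code from the linked thread. The dataset raw_demo, its messy rows, and the variables dob_digits, smp_digits, birth_dt, and collectdt are hypothetical, chosen only to show stray characters and a missing value being handled.

```sas
/* Hypothetical messy input: dashes, letters, and a missing sample date. */
data raw_demo;
  infile datalines truncover;
  input dob $ 1-10 sample_dt $ 12-21;
  datalines;
19650214   20100429
1980-07-24 20210823
19991208
abcd1234   20090908
;

data clean_demo;
  set raw_demo;

  /* Keep digits only ('k' = keep, 'd' = digits) to drop stray characters. */
  dob_digits = compress(dob, , 'kd');
  smp_digits = compress(sample_dt, , 'kd');

  /* Convert only when exactly 8 digits remain; ?? suppresses invalid-data
     messages and leaves the result missing if the value is not a real date. */
  if length(dob_digits) = 8 then birth_dt  = input(dob_digits, ?? yymmdd8.);
  if length(smp_digits) = 8 then collectdt = input(smp_digits, ?? yymmdd8.);

  /* Compute age only when both dates converted, so YRDIF never sees a missing value. */
  if nmiss(birth_dt, collectdt) = 0 then
    age = floor(yrdif(birth_dt, collectdt, 'AGE'));

  format birth_dt collectdt yymmdd10.;
run;
```

Rows where birth_dt or collectdt stays missing can then be listed and reviewed separately instead of generating log notes for every observation.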

SASPreK · 06-07-2024

I removed the formats and ran the following code:

    data labcd4age /*labcd4anly labcd4_agelt13*/;
      set lab_cd4anly1;
      age=yrdif(dob, sample_dt);
      /*if age>=13 then output labcd4anly; else
      output labcd4_agelt13;*/
    run;

And got the following error:

    NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
          100:15 100:20
    NOTE: Invalid argument to function YRDIF(19650214,20100429) at line 100 column 9.
    sample_dt=20100429 dob=19650214 age=. _ERROR_=1 _N_=1
    NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values.
          Each place is given by: (Number of times) at (Line):(Column).
          7272426 at 100:9

SASPreK · 06-07-2024

Thank you for the info. However, my dob variable does not seem to be empty in the dataset. The log also shows otherwise; see the bold part below.

    NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
          100:15 100:20
    NOTE: Invalid argument to function YRDIF(19650214,20100429) at line 100 column 9.
    sample_dt=20100429 dob=19650214 age=. _ERROR_=1 _N_=1

There was also another note:

    NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values.
          Each place is given by: (Number of times) at (Line):(Column).
          7272426 at 100:9

SASPreK · 06-07-2024

Thank you for catching that. I changed it to birth_dt and got the following error:

    98   data labcd4age /*labcd4anly labcd4_agelt13*/;
    99   set lab_cd4anly1;
    100  birth_dt=input(dob,yymmdd8.);
    101  collectdt=input(sample_dt, yymmdd8.);
    102  format birth_dt collectdt yymmdd8.;
    103  age=yrdif(birth_dt, collectdt);
    104  format birth_dt collectdt yymmdd8.;
    105  /*if age>=13 then output labcd4anly; else
    106  output labcd4_agelt13;*/
    107  run;

    NOTE: Invalid argument to function YRDIF(.,18431) at line 103 column 9.

SASPreK · 06-05-2024

So sorry for not providing complete information, and thank you for responding. The solution you provided works for the test dataset that was created for this question. However, I am getting the following error with my original dataset:

    ERROR 48-59: The format $YYMMDD was not found or could not be loaded.

Please see the log below, which has similar code.

    264  data labcd4age /*labcd4anly labcd4_agelt13*/;
    265  set lab_cd4anly1;
    266  birth_dt=input(dob,yymmdd8.);
    267  collectdt=input(sample_dt, yymmdd8.);
    268  format birth_dt collectdt yymmdd8.;
    269  age=yrdif(birth_dt, collectdt);
    270  format dob collectdt yymmdd8.;
                              --------
                              48
    ERROR 48-59: The format $YYMMDD was not found or could not be loaded.
    271  /*if age>=13 then output labcd4anly; else
    272  output labcd4_agelt13;*/
    273  run;

    NOTE: The SAS System stopped processing this step because of errors.
    WARNING: The data set WORK.LABCD4AGE may be incomplete. When this step was stopped there were 0 observations and 15 variables.
    WARNING: Data set WORK.LABCD4AGE was not replaced because this step was stopped.

Please note that these are the variable properties in the original dataset:

    Alphabetic List of Variables and Attributes
    #   Variable   Type  Len  Format  Informat  Label
    10  dob        Char  8    $8.     $8.       dob
    3   sample_dt  Char  8    $8.     $8.       sample_dt
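The error at line 270 of the log appears to come from applying the numeric date format yymmdd8. to dob, which the attribute listing shows is a character variable, so SAS looks for a character format named $YYMMDD. A minimal sketch of the same step with the format applied only to the new numeric variables follows. The dataset lab_demo is a hypothetical stand-in for lab_cd4anly1, and labcd4age_demo is an illustrative output name.

```sas
/* Hypothetical stand-in for lab_cd4anly1: both dates stored as 8-character strings. */
data lab_demo;
  input dob $ sample_dt $;
  datalines;
19650214 20100429
19800724 20210823
19991208 20090908
;

data labcd4age_demo;
  set lab_demo;

  /* Read the character strings into new numeric SAS date variables. */
  birth_dt  = input(dob, yymmdd8.);
  collectdt = input(sample_dt, yymmdd8.);

  /* Date formats go on the numeric variables only; dob keeps its $8. format. */
  format birth_dt collectdt yymmdd10.;

  /* The 'AGE' basis returns age in years; FLOOR keeps completed years. */
  age = floor(yrdif(birth_dt, collectdt, 'AGE'));
run;
```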

SASPreK · 06-05-2024

I have the following dataset:

    data have;
      input dob sample_dt ;
      datalines;
    19650214 20100429
    19800724 20210823
    19991208 20090908
    ;

I want to calculate the age in years based on dob and sample_dt. I am looking for the following output:

    dob       sample_dt  age
    19650214  20100429   45
    19800724  20210623   40
    19991208  20090908   09

I tried the following code, but it's not working:

    data want;
      set have;
      dob=input(dob, yymmdd8.);
      sample_dt=input(sample_dt, yymmdd8.);
      age=yrdif(dob, sample_dt);
      format dob sample_dt yymmdd8.;
    run;
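In this test dataset dob and sample_dt are numeric, so INPUT has no character string to read directly. One way around that, sketched here under the assumption that every value is a valid 8-digit YYYYMMDD number, is to convert each number to text with PUT first and store the results in new variables. The names birth_dt and collectdt are illustrative.

```sas
data have;
  input dob sample_dt;
  datalines;
19650214 20100429
19800724 20210823
19991208 20090908
;

data want;
  set have;

  /* PUT turns the number 19650214 into the text '19650214';
     INPUT then reads that text as a SAS date. */
  birth_dt  = input(put(dob, 8.), yymmdd8.);
  collectdt = input(put(sample_dt, 8.), yymmdd8.);

  /* Age in completed years between the two dates. */
  age = floor(yrdif(birth_dt, collectdt, 'AGE'));

  format birth_dt collectdt yymmdd10.;
run;
```

Keeping the original numeric columns and writing the dates to new variables avoids mixing raw YYYYMMDD numbers and real SAS dates in the same column.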

SASPreK · 05-31-2024

Thank you so much for the solution, this works! I have edited my question to name the variables Var1 and Var2. Thank you for catching that 🙂

SASPreK · 05-29-2024

I have the following input dataset:

    ID  Var1  Var2
    11  A     23
    11  B     121
    11  A     23
    12  A     32
    12  B     158
    12  A     32
    12  B     158
    13  A     87
    13  B     567

I want to remove the repeated records within each ID and get the following output:

    ID  Var1  Var2
    11  A     23
    11  B     121
    12  A     32
    12  B     158
    13  A     87
    13  B     567

Please suggest the best way to do it.
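A common way to get this result is PROC SORT with the NODUPKEY option. The sketch below assumes a record counts as a repeat only when ID, Var1, and Var2 all match, and that Var1 is character; the dataset names have and want are illustrative.

```sas
data have;
  input ID Var1 $ Var2;
  datalines;
11 A 23
11 B 121
11 A 23
12 A 32
12 B 158
12 A 32
12 B 158
13 A 87
13 B 567
;

/* NODUPKEY keeps the first row for each distinct combination of the BY variables. */
proc sort data=have out=want nodupkey;
  by ID Var1 Var2;
run;
```

Because the output is sorted by the BY variables, the rows come out ordered by ID, then Var1, then Var2. If the original row order matters, a different approach would be needed.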

Sorting pairs and singles to assign new ID

Re: Calculating age from two date string variables in YYYYMMDD

Re: Calculating age from two date string variables in YYYYMMDD

Re: Calculating age from two date string variables in YYYYMMDD

Re: Calculating age from two date string variables in YYYYMMDD

Re: Calculating age from two date string variables in YYYYMMDD

Calculating age from two date string variables in YYYYMMDD

Re: Removing dups in dataset

Removing dups in dataset
