About AieuYuhara

AieuYuhara · ‎10-17-2019

Thanks for your help! All the code given by others works too, but I ran your code and it runs much faster. I also like that we don't have to specify lengths or drop any variables, because doing that is not practical when working with more than 1,000 variables. Thank you everyone!

ChrisNZ · ‎11-06-2017

You are doing something wrong. The comparison works fine here.

data T1; X=0.1234;
data T2; X=0.12340589;
proc compare data=T1 compare=T2 method=absolute criterion=0.001;
run;

Values Comparison Summary
Number of Variables Compared with All Observations Equal: 1.
Number of Variables Compared with Some Observations Unequal: 0.
Total Number of Values which Compare Unequal: 0.
Total Number of Values not EXACTLY Equal: 1.
Maximum Difference: 0.00000589.

Kurt_Bremser · ‎07-26-2017

Data quality improvement is a sophisticated process. SAS provides an extra application (Dataflux) and a specialised server instance (DQ Server) for this, with additional licenses.

Start by getting to know your data. Run proc freq and identify possible mistakes. How you fix them depends wholly on the mistakes detected (uppercase/lowercase, added blanks, numeric/word notation, ...).

For names, an approach could be the following:

- set up a table with expected names

data exp_names;
infile cards dlm=',';
input expected_name :$25. actual_name :$25.;
cards;
ABC AGENCY,ABC AGENCY
ABC AGENCY,ABC AGE. Y
;
run;

You can now match input data against actual_name; if a match is found, replace it with expected_name; if not, output the record to a new dataset indicating errors. You then inspect the error dataset and add new lines to exp_names as you see appropriate.

You may be able to create this dataset initially by using rules (first or last 5 characters matching, or similar) as you stated, but there will be occurrences that can't be handled by a simple rule and need your intervention and the application of Brain 1.0.

All this is of course the result of poor process design. Any data used for categorization has to be checked on input (e.g. company names are selected from a drop-down list of companies instead of entered manually); allowing free-form input of such values is sub-optimal, to say the least.
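The matching step described above could be sketched as follows. This is only an illustration under assumptions: the input dataset `have`, its variables `id` and `cust_name`, the new variable `clean_name`, and the output datasets `cleaned` and `errors` are all hypothetical names, and a sorted merge is just one of several ways to do the lookup.

```sas
/* Lookup table of expected vs. actual spellings (as in the post above) */
data exp_names;
  infile cards dlm=',';
  input expected_name :$25. actual_name :$25.;
  cards;
ABC AGENCY,ABC AGENCY
ABC AGENCY,ABC AGE. Y
;
run;

/* Hypothetical input data; names and values are assumptions */
data have;
  infile cards dlm=',';
  input id cust_name :$25.;
  cards;
1,ABC AGENCY
2,ABC AGE. Y
3,ABD AGENCY
;
run;

/* Both datasets must be sorted by the matching key before the merge */
proc sort data=have;      by cust_name;   run;
proc sort data=exp_names; by actual_name; run;

data cleaned errors (drop=clean_name);
  length clean_name $25;
  merge have      (in=in_have)
        exp_names (in=in_map rename=(actual_name=cust_name));
  by cust_name;
  if in_have;                       /* ignore lookup rows with no input record */
  if in_map then do;
    clean_name = expected_name;     /* known spelling: use the expected name */
    output cleaned;
  end;
  else output errors;               /* unknown spelling: review manually */
  drop expected_name;
run;
```

With this sample data, the first two records should end up in `cleaned` with `clean_name` set to ABC AGENCY, while the unrecognised spelling goes to `errors`, where it can be inspected and added to `exp_names` if appropriate.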

AieuYuhara · ‎07-25-2017

Hi PG,

Final holds the Ref of the parent that each record should refer to. (The parent is the Ref that has the latest date.) The Cust_name values (last column) in observations 1 and 2 do not share their first five characters: one is XYZ Production and the other is PINK DEF Co. Since they do not share the same first or last five characters, each keeps its own Ref as parent.

This is the code that creates the variable Final. The problem is that it treats cust_name values as separate groups even when they differ by only two or three characters.

proc sort data=temp;
by id cust_name descending date;
run;

data want;
set temp;
by id cust_name;
retain final;
if first.cust_name then final = ref;
run;

proc sort data=want;
by id cust_name date;
run;

proc print data=want noobs;
run;

After running the above code, what I want is for observation 7 to copy 5451654 into Final. Why? Because the records have the same ID and only some characters of the name differ. I am now looking for code that can group the data even when customer names differ by one or two characters. I appreciate your help!
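One possible way to find names that differ by only a character or two is an edit-distance comparison. The sketch below is not taken from any reply in the thread; it uses the SAS `COMPLEV` function (Levenshtein distance) in a self-join, and the dataset `names`, its sample values, and the threshold of 2 are assumptions chosen for illustration.

```sas
/* Hypothetical sample data; the values are made up for illustration */
data names;
  infile cards dlm=',';
  input id cust_name :$25.;
  cards;
1,XYZ PRODUCTION
1,XYZ PRODUCTON
1,PINK DEF CO
;
run;

/* Pair up names within the same id and keep pairs that are
   at most two single-character edits apart */
proc sql;
  create table near_matches as
  select a.id,
         a.cust_name as name_a,
         b.cust_name as name_b,
         complev(strip(a.cust_name), strip(b.cust_name)) as edit_distance
  from names as a, names as b
  where a.id = b.id
    and a.cust_name < b.cust_name          /* list each pair only once */
    and calculated edit_distance <= 2;     /* assumed threshold */
quit;
```

The resulting pairs could then be used to assign one common group name before running the sort and `retain` logic shown above. A self-join compares every pair of names within an id, so this approach may be slow on large groups, and the threshold needs tuning against real data.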

AieuYuhara · ‎07-25-2017

It works!!! Thanks Jag!

AieuYuhara · ‎07-24-2017

It works!!!! Thanks a lot!

AieuYuhara · ‎07-24-2017

It works!!!! Thanks a lot!

SASKiwi · ‎05-03-2017

My understanding is that it is not an additional column but simply unused space in the Listing report to the right of your defined columns. As far as I am aware there is no option to "snap" the table width to fit the column sizes.

Date Last Visited	‎10-30-2019 06:03 AM

Re: Create table for Missing and Filling Rate

Create table for Missing and Filling Rate

Re: PROC Compare Output - Cater Rounded values on Base dataset

Re: PROC Compare Output - Cater Rounded values on Base dataset

PROC Compare Output - Cater Rounded values on Base dataset

Re: Data Cleaning - Character String

Data Cleaning - Character String

Re: Update a column based on latest date and other information

Re: Update a column based on latest date and other information

Update a column based on latest date and other information

Re: Create table for Missing and Filling Rate

Re: Create table for Missing and Filling Rate

Re: Update a column based on latest date and other information

Re: Create table for Missing and Filling Rate

Re: PROC Compare Output - Cater Rounded values on Base dataset

Re: Update a column based on latest date and other information

Re: Data Cleaning - Character String

Re: Create new table with same records on ONE variable, but different ...

Re: How do I do Data Cleansing - Data Mapping

Re: How do I do Data Cleansing - Data Mapping

Re: Delete Additional Columns from Listing in SAS VA