Sample data is attached:
How to create a dataset that contains the variable as value ,its origibal value and truncated values as shown below:
In the first record, business_name is truncated for company name "Byrum". The lengths of business_name and new_business_names are different and their values are also truncated.
In the recond row, the address got truncated for "ICORE".
I would like to show them in a dataset as following:
company_name | variable | Original_value | truncated_value | |
Byrum | business_description | GH Utilization&IV Solutions | GH Utilization&IV | |
ICORE | business_Address | P.O.BOX 5523,Kryton,TN 38172-2468 | P.O.BOX 5523,Kryton,TN 38172 |
The dataset can have 100's of variables with original and truncated values. I want to compare only those variables there is a truncation in their values.
What is the question that you want answered?
Do you want to find the observations where the truncated_value is different?
data want ;
set have;
where truncated_value ne original_value;
run;
Its been mentioned before, please don't post Office files as they are security risk. To get a good answer, post test data in the form of a datastep, what the output should look like, and explain any logic between the two.
It is ficticious data!
It does not matter, Office files are a security risk, some even block these files. Not to mention the extra work responders would need to do to import that file, guess what the structure is etc. Please follow the folowing post on how to provide test data as a datastep in the body of the post (using the {i} button):
Sure!
A bit better prepared sample data and desired output would be appreciated - but here you go:
DATA WORK.have(drop=_:);
INFILE DATALINES4
DLM='7F'x
MISSOVER
DSD;
INPUT
company_name : $10.
business_description : $30.
_business_description2 : $30.
business_Address : $40.
_business_Address2 : $40.;
output;
business_description=_business_description2;
business_Address=_business_Address2;
output;
DATALINES4;
ByrumGH Utilization&IV SolutionsGH Utilization&IVP.O.BOX 5523,Kryton,TN 38172-2468P.O.BOX 5523,Kryton,TN 38172-2468
ICORENovo DistributionNovo DistributionP.O.BOX 5523,Kryton,TN 38172-2468P.O.BOX 5523,Kryton,TN 38172
;;;;
run;
proc sort data=have;
by company_name;
run;
data want(keep=obs_num company_name variable_name max_string trunc_string);
length obs_num 8;
if 0 then set have;
length Variable_Name $32 max_string trunc_string $100.;
array test {2} business_description business_Address;
array _max_len [2] 8 _temporary_;
array _max_str [2] $100 _temporary_;
/* determine longest string per company and variable */
do while(1);
set have;
by company_name;
do _i=1 to dim(test);
if first.company_name or lengthn(test[_i])>_max_len[_i] then
do;
_max_len[_i]=lengthn(test[_i]);
_max_str[_i]=test[_i];
end;
end;
if last.company_name then leave;
end;
/* DQ: output cases with truncated strings */
do while(1);
set have end=last;
by company_name;
obs_num+1;
do _i= 1 to dim(test);
if lengthn(test[_i])<_max_len[_i] then
do;
variable_name=vname(test[_i]);
max_string=_max_str[_i];
trunc_string=test[_i];
end;
end;
if last.company_name then leave;
end;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.