Hello,
I have two datasets that I need to merge using the ID variable (Inmatesid)
DATA X
Inmatesid | Race |
03-000011ID | 2 |
30-000211ID | 3 |
193-004000ID | 1 |
1903-022222ID | 1 |
1903-001244ID | 1 |
1903-001246ID | 1 |
1903-1265790ID | 5 |
1903-123457ID | 2 |
1903-123456ID | 2 |
1903-000055ID | 4 |
1903-122000 | 2 |
DATA Y
Inmatesid | Tatoo | Security level |
506-1903-000011 | Yes | 3 |
506-1903-000211 | No | . |
506-1903-004000 | Yes | 1 |
506-1903-022222 | Yes | 2 |
506-1903-001244 | Yes | 4 |
506-1903-001246 | No | . |
506-1903-126579 | Yes | 2 |
506-1903-123457 | No | 2 |
506-1903-123456 | Yes | 2 |
506-1903-000055 | Yes | 3 |
506-1903-002222 | Yes | 1 |
506-1903-011123 | No | 5 |
506-1903-122000 | Yes | 5 |
My first task was to standardize inmate ID in both datasets so that
1903-000055ID =>1903-000055 (data x)
506-1903-000055 =>1903-000055 (data y)
1) for data x, I used inmatesid=compress(inmatesid, ' ', 'I');
2) data y, i used Inmatesid=trim(left(substr(Inmatesid, index(Inmates id, '-')+1)));
the first issue I have is that there are some typos for inmatesid (for example 03-000011ID) in data x, which does not work well with that above code. so I manually tried to fix the ID's that were not transformed properly; I will be curious to know how to do this using a code .
Then I tried to merge the two data sets but I am getting this error. Does anybody know how to debbug this error. Thanks
WARNING: Multiple lengths were specified for the BY variable Inmatesid by input data sets and LENGTH,
FORMAT, INFORMAT, or ATTRIB statements. This may cause unexpected results.
WARNING: Multiple lengths were specified for the variable Inmatesid by input
data set(s). This may cause truncation of data
First, that is not an error. The cause is if you look at the Length property of the variable you will find that the two are different.
The solution to removing the warning is to ensure like variables have the same length before combining data sets.
There is a possibility that things won't work right because the "shorter" values may not match a longer one.
Example:
Id = 123456 in one set and in the other set
id = 1234567 but are supposed to be the "same". Since they are not identical then the BY treats them as different records.
Also the first dataset listed will set the length of the output variable. So in the case above the ID of 1234567 may get truncated to 123456 and then you spend time trying to figure out why the sets apparently didn't merge properly.
First, that is not an error. The cause is if you look at the Length property of the variable you will find that the two are different.
The solution to removing the warning is to ensure like variables have the same length before combining data sets.
There is a possibility that things won't work right because the "shorter" values may not match a longer one.
Example:
Id = 123456 in one set and in the other set
id = 1234567 but are supposed to be the "same". Since they are not identical then the BY treats them as different records.
Also the first dataset listed will set the length of the output variable. So in the case above the ID of 1234567 may get truncated to 123456 and then you spend time trying to figure out why the sets apparently didn't merge properly.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.