Hi!
Quick question (with hopefully a quick answer).
I've a dataset with multiple rows. Each row does have - let's say - 41 columns of which column 1 is a unique identifier, the next 20 columns are customer characteristics and the next 20 columns are the same as the 20 columns before (same names but with a fixed prefix) and could contain - per column - a different value.
I'm looking for a smart way to compare the value of each of the 20 columns with the value of the same column with the same name (but now with a prefix). In case the value is the same, nothing should happen. In case the value is different, the column should be updated with the value from the column with the same name but the fixed prefix.
I would like to prevent something as the below for each column:
if REGIO ne XXX_REGIO then REGIO = XXX_REGIO;
if GESLACHT ne XXX_GESLACHT then GESLACHT = XXX_GESLACHT;
if VOORLETTERS ne XXX_VOORLETTERS then VOORLETTERS = XXX_VOORLETTERS
Example of the dataset with a limited number of columns:
ACCOUNTNUMBER | REGIO | GESLACHT | VOORLETTERS | XXX_REGIO | XXX_GESLACHT | XXX_VOORLETTERS |
1 | A | 1 | R.H.M. | B | 1 | R.H.L. |
2 | A | 2 | D.A. | A | 1 | D.A. |
3 | B | 1 | A. | A | 2 | A. |
4 | A | 1 | S.B.M.J. | A | 1 | S.B.M.K. |
5 | A | 2 | L.W. | B | 1 | L.W. |
How could this be done?
Thanks in advance!
Regards,
Sven
There are several possible methods.
The first question: where do the prefixed columns come from? If you just built your table from two different tables and added the prefix on the way, you best drop that step and use update in a data step to combine your tables:
data class_upd;
set sashelp.class;
if name = 'Alfred' then age = 16;
run;
data class_updated;
update sashelp.class class_upd;
by name;
run;
That's not a good way to model your data, instead of columns, move each block of 41 columns down to its own row. This will make your life much easier in numerous ways.
As such with your structure now, your going to have to setup arrays for these blocks. It depends on your column names for how easy this is, if you have:
ID SET1_REGIO SET1_GESLACHT... SET2_REGIO...;
Then you can do something like:
data want; set have; array s1{20} set1:; array s2{20} set2:; ... do i=1 to 20; if s1{i}=s2{i}... end; run;
That will work, however you will find its a lot of coding and effort. Normalising - i.e. having rows rather than columns, will make this all far simpler.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.