Hi! I need to concatenate some variables, but I need to do such concatenation taking into account the common ID of each of the variables so that the concatenation occurs correctly.
Data structure:
ID | TYPE | VALUE |
1 | NAME | love |
1 | URL | https://love.com/ |
2 | NAME | happy |
2 | URL | https://happy.com/ |
Desired final table:
ID | VALUE |
1 | love - https://love.com/ |
2 | happy - https://happy.com/ |
Maybe it’s not the smartest approach, but I started writing the following code and I'm totally lost once I don't know how to move on. How do I reference each of my rows to do the concatenation? How to relate the value of type name with the value of type URL based on IDs?
data work.WANT; DATASTMTCHK=NONE; set work.HAVE; VALUE=catx(' - ', of VALUE:); run;
First likely issue: stuffing two values into the same variable is extremely likely to result in text that is too long at least some times. If your Value variable is defined as length 20 then it can hold 20 characters. So if you try to store "happy - https://happy.com/" which is 27 characters it won't fit.
So you need to address that.
Second you need to be concerned with getting a value across the data step boundary.
If you have EXACTLY two value for each and every ID then this may work:
data want; set have; by id; lvalue=lag(value); if last.id then do; newvalue= catx(' - ',lvalue,value); output; end; keep id newvalue; run;
This will require the data to be in ID order. If grouped but not sorted as "notsorted" to the BY statement.
The LAG function(s) will retrieve the value from a previous record (lag3 gets the value from the 3rd record before the current for example). However since the behavior is such that when called inside any "if" construct if uses the last time the IF was true to keep track of which records you may want.
If you ever have more or fewer than two records for an ID then you need to provide examples and desired output and perhaps a more general description.
Placing multiple values in a single variable often results in very hard to use code later.
data want;
length value $ 64;
merge have(where=(type='NAME')) have(where=(type='URL') rename=(value=url));
by id;
value = catx(' - ',value,url);
run;
Do this
data have;
input ID TYPE $ VALUE :$20.;
datalines;
1 NAME love
1 URL https://love.com/
2 NAME happy
2 URL https://happy.com/
;
data want(rename = newvalue = value);
do until (last.id);
set have;
by id;
length newvalue $ 200;
newvalue = catx(' - ', newvalue, value);
end;
drop value;
run;
data have;
input ID TYPE : $ VALUE : $ 200.;
cards;
1 NAME love
1 URL https://love.com/
2 NAME happy
2 URL https://happy.com/
;
run;
proc print;
run;
data want(rename = (v=value));
set have;
by id;
length v $ 200; retain v;
if first.id then v = "";
v = catx(" - ",v, VALUE);
if last.id then output;
keep ID v;
run;
proc print;
run;
First likely issue: stuffing two values into the same variable is extremely likely to result in text that is too long at least some times. If your Value variable is defined as length 20 then it can hold 20 characters. So if you try to store "happy - https://happy.com/" which is 27 characters it won't fit.
So you need to address that.
Second you need to be concerned with getting a value across the data step boundary.
If you have EXACTLY two value for each and every ID then this may work:
data want; set have; by id; lvalue=lag(value); if last.id then do; newvalue= catx(' - ',lvalue,value); output; end; keep id newvalue; run;
This will require the data to be in ID order. If grouped but not sorted as "notsorted" to the BY statement.
The LAG function(s) will retrieve the value from a previous record (lag3 gets the value from the 3rd record before the current for example). However since the behavior is such that when called inside any "if" construct if uses the last time the IF was true to keep track of which records you may want.
If you ever have more or fewer than two records for an ID then you need to provide examples and desired output and perhaps a more general description.
Placing multiple values in a single variable often results in very hard to use code later.
@Neeydchi wrote:
Thank you for your reply and also for the issues raised, I took note and I am considering them.
However, the result is still not as expected. With this code, I get the following result:
ID newvalue
1 love - happy
2 https://love.com/ - https://happy.com/
Then your data is not as you showed.
With data made from your example:
data have; input ID TYPE $ VALUE:$20.; datalines; 1 NAME love 1 URL https://love.com/ 2 NAME happy 2 URL https://happy.com/ ; data want; set have; by id; lvalue=lag(value); if last.id then do; newvalue= catx(' - ',lvalue,value); output; end; keep id newvalue; run;
The result is indeed
love - https://love.com/
The result you show would require the values of ID1 to be Love and Happy , not Love and https://love.com.
So, provide a data step similar to what I did above that duplicates your actual data.
/*---- Start of User Written Code ----*/ data &_OUTPUT; set &_INPUT; by id; lvalue=lag(value); if last.id then do; length newvalue $ 1024; newvalue= catx(' - ',lvalue,value); output; end; keep id newvalue; run; /*---- End of User Written Code ----*/
the result is as I mentioned earlier, the code is concatenating the values with different id but with the same type. Although I also cannot understand why.
@Neeydchi wrote:
/*---- Start of User Written Code ----*/ data &_OUTPUT; set &_INPUT; by id; lvalue=lag(value); if last.id then do; length newvalue $ 1024; newvalue= catx(' - ',lvalue,value); output; end; keep id newvalue; run; /*---- End of User Written Code ----*/the result is as I mentioned earlier, the code is concatenating the values with different id but with the same type. Although I also cannot understand why.
Show us (a portion of) the exact data set you are working with. Follow these instructions: https://blogs.sas.com/content/sastraining/2016/03/11/jedi-sas-tricks-data-to-data-step-macro/ (Not acceptable are Screen captures of the data or Excel files)
@Neeydchi wrote:
/*---- Start of User Written Code ----*/ data &_OUTPUT; set &_INPUT; by id; lvalue=lag(value); if last.id then do; length newvalue $ 1024; newvalue= catx(' - ',lvalue,value); output; end; keep id newvalue; run; /*---- End of User Written Code ----*/the result is as I mentioned earlier, the code is concatenating the values with different id but with the same type. Although I also cannot understand why.
As was stated when posted that code requires that each value of ID has EXACTLY two observations. In addition it requires that the TYPE='NAME' value is the first and the TYPE='URL' value is the second.
So if the real data does not meet that restriction you cannot use this method.
You have many proposed solutions that should work well for this.
The basic issue is that your values are currently stored in different observations so you need to get the two values into the same observation before you can combine the values. Whether that is via retaining values to the next observation, using LAG() to remember previous values, use MERGE to combine that data.
Another method that I didn't see proposed was to transform the data from the current tall name/value structure into a wide structure where the values are in their own variables and the name field's value is used to name those new variables.
PROC TRANSPOSE is a good tool for that.
data have ;
input id type :$32. value :$200.;
cards;
1 NAME love
1 URL https://love.com/
2 NAME happy
2 URL https://happy.com/
;
proc transpose data=have out=wide(drop=_name_);
by id;
var value;
id type ;
run;
data want ;
set wide ;
length value $220 ;
value = catx(' - ',name,url);
keep id value;
run;
Results:
Obs id value 1 1 love - https://love.com/ 2 2 happy - https://happy.com/
The advantage of using this transpose method is that more easily handles expanding to situations that require combining more than just two values from your original TALL structure.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.