BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Neeydchi
Fluorite | Level 6

Hi! I need to concatenate some variables, but I need to do such concatenation taking into account the common ID of each of the variables so that the concatenation occurs correctly.

 

Data structure:

 

IDTYPEVALUE
1NAMElove
1URLhttps://love.com/
2NAMEhappy
2URLhttps://happy.com/

 

Desired final table:

 

IDVALUE
1love -  https://love.com/
2happy - https://happy.com/

 

Maybe it’s not the smartest approach, but I started writing the following code and I'm totally lost once I don't know how to move on. How do I reference each of my rows to do the concatenation?  How to relate the value of type name with the value of type URL based on IDs?

 

 

data work.WANT; 
DATASTMTCHK=NONE;
set work.HAVE;
VALUE=catx(' - ', of VALUE:); run;
Can someone help me?
Thank you! 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

First likely issue: stuffing two values into the same variable is extremely likely to result in text that is too long at least some times. If your Value variable is defined as length 20 then it can hold 20 characters. So if you try to store "happy - https://happy.com/"  which is 27 characters it won't fit.

So you need to address that.

Second you need to be concerned with getting a value across the data step boundary.

If you have EXACTLY two value for each and every ID then this may work:

 

 
data want;
   set have;
   by id;
   lvalue=lag(value);
   if last.id then do;
      newvalue= catx(' - ',lvalue,value);
      output;
   end;
   keep id newvalue;
run;
      

This will require the data to be in ID order. If grouped but not sorted as "notsorted" to the BY statement.

The LAG function(s) will retrieve the value from a previous record (lag3 gets the value from the 3rd record before the current for example). However since the behavior is such that when called inside any "if" construct if uses the last time the IF was true to keep track of which records you may want.

 

If you ever have more or fewer than two records for an ID then you need to provide examples and desired output and perhaps a more general description.

 

Placing multiple values in a single variable often results in very hard to use code later.

View solution in original post

10 REPLIES 10
PaigeMiller
Diamond | Level 26
data want;
    length value $ 64;
    merge have(where=(type='NAME')) have(where=(type='URL') rename=(value=url));
    by id;
    value = catx(' - ',value,url);
run;
--
Paige Miller
PeterClemmensen
Tourmaline | Level 20

Do this

 

data have;
input ID TYPE $ VALUE :$20.;
datalines;
1 NAME love               
1 URL  https://love.com/  
2 NAME happy              
2 URL  https://happy.com/ 
;

data want(rename = newvalue = value);
   do until (last.id);
      set have;
      by id;
      length newvalue $ 200;
      newvalue = catx(' - ', newvalue, value);
   end;
   drop value;
run;
yabwon
Onyx | Level 15
data have;
input ID TYPE : $	VALUE : $ 200.;
cards;
1 NAME love
1 URL https://love.com/
2 NAME happy
2 URL https://happy.com/
;
run;
proc print;
run;

data want(rename = (v=value));
  set have;
  by id;
  length v $ 200; retain v;
  if first.id then v = "";

  v = catx(" - ",v, VALUE);
  
  if last.id then output;
  keep ID v;
run;
proc print;
run;
_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



ballardw
Super User

First likely issue: stuffing two values into the same variable is extremely likely to result in text that is too long at least some times. If your Value variable is defined as length 20 then it can hold 20 characters. So if you try to store "happy - https://happy.com/"  which is 27 characters it won't fit.

So you need to address that.

Second you need to be concerned with getting a value across the data step boundary.

If you have EXACTLY two value for each and every ID then this may work:

 

 
data want;
   set have;
   by id;
   lvalue=lag(value);
   if last.id then do;
      newvalue= catx(' - ',lvalue,value);
      output;
   end;
   keep id newvalue;
run;
      

This will require the data to be in ID order. If grouped but not sorted as "notsorted" to the BY statement.

The LAG function(s) will retrieve the value from a previous record (lag3 gets the value from the 3rd record before the current for example). However since the behavior is such that when called inside any "if" construct if uses the last time the IF was true to keep track of which records you may want.

 

If you ever have more or fewer than two records for an ID then you need to provide examples and desired output and perhaps a more general description.

 

Placing multiple values in a single variable often results in very hard to use code later.

Neeydchi
Fluorite | Level 6
Thank you for your reply and also for the issues raised, I took note and I am considering them.

However, the result is still not as expected. With this code, I get the following result:

ID newvalue
1 love - happy
2 https://love.com/ - https://happy.com/
ballardw
Super User

@Neeydchi wrote:
Thank you for your reply and also for the issues raised, I took note and I am considering them.

However, the result is still not as expected. With this code, I get the following result:

ID newvalue
1 love - happy
2 https://love.com/ - https://happy.com/

Then your data is not as you showed.

With data made from your example:

data have;
  input ID	TYPE $	VALUE:$20.;
datalines;
1	NAME	love
1	URL	https://love.com/
2	NAME	happy
2	URL	https://happy.com/
;
 
data want;
   set have;
   by id;
   lvalue=lag(value);
   if last.id then do;
      newvalue= catx(' - ',lvalue,value);
      output;
   end;
   keep id newvalue;
run;
      

The result is indeed

love - https://love.com/

The result you show would require the values of ID1 to be Love and Happy , not Love and https://love.com.

So, provide a data step similar to what I did above that duplicates your actual data.

 

Neeydchi
Fluorite | Level 6

 

/*---- Start of User Written Code  ----*/ 

data &_OUTPUT;
   set &_INPUT;
   by id;
   lvalue=lag(value);
   if last.id then do;
length newvalue $ 1024;
      newvalue= catx(' - ',lvalue,value);
      output;
   end;
   keep id newvalue;
run;
/*---- End of User Written Code  ----*/ 

the result is as I mentioned earlier, the code is concatenating the values ​​with different id but with the same type. Although I also cannot understand why.

 

PaigeMiller
Diamond | Level 26

@Neeydchi wrote:

 

/*---- Start of User Written Code  ----*/ 

data &_OUTPUT;
   set &_INPUT;
   by id;
   lvalue=lag(value);
   if last.id then do;
length newvalue $ 1024;
      newvalue= catx(' - ',lvalue,value);
      output;
   end;
   keep id newvalue;
run;
/*---- End of User Written Code  ----*/ 

the result is as I mentioned earlier, the code is concatenating the values ​​with different id but with the same type. Although I also cannot understand why.

 


Show us (a portion of) the exact data set you are working with. Follow these instructions: https://blogs.sas.com/content/sastraining/2016/03/11/jedi-sas-tricks-data-to-data-step-macro/ (Not acceptable are Screen captures of the data or Excel files)

--
Paige Miller
Tom
Super User Tom
Super User

@Neeydchi wrote:

 

/*---- Start of User Written Code  ----*/ 

data &_OUTPUT;
   set &_INPUT;
   by id;
   lvalue=lag(value);
   if last.id then do;
length newvalue $ 1024;
      newvalue= catx(' - ',lvalue,value);
      output;
   end;
   keep id newvalue;
run;
/*---- End of User Written Code  ----*/ 

the result is as I mentioned earlier, the code is concatenating the values ​​with different id but with the same type. Although I also cannot understand why.

 


As was stated when posted that code requires that each value of ID has EXACTLY two observations.  In addition it requires that the TYPE='NAME' value is the first and the TYPE='URL' value is the second.  

 

So if the real data does not meet that restriction you cannot use this method.

Tom
Super User Tom
Super User

You have many proposed solutions that should work well for this.

The basic issue is that your values are currently stored in different observations so you need to get the two values into the same observation before you can combine the values.  Whether that is via retaining values to the next observation, using LAG() to remember previous values, use MERGE to combine that data.  

 

Another method that I didn't see proposed was to transform the data from the current tall name/value structure into a wide structure where the values are in their own variables and the name field's value is used to name those new variables. 

 

PROC TRANSPOSE is a good tool for that.

data have ;
  input id type :$32. value :$200.;
cards;
1	NAME	love
1	URL	https://love.com/
2	NAME	happy
2	URL	https://happy.com/
;

proc transpose data=have out=wide(drop=_name_);
  by id;
  var value;
  id type ;
run;

data want ;
  set wide ;
  length value $220 ;
  value = catx(' - ',name,url);
  keep id value;
run;

Results:

Obs    id              value

 1      1    love - https://love.com/
 2      2    happy - https://happy.com/

The advantage of using this transpose method is that more easily handles expanding to situations that require combining more than just two values from your original TALL structure.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 1293 views
  • 2 likes
  • 6 in conversation