BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
LAKings
Calcite | Level 5

Hi All,

New to SAS EG and SQL, so I appreciate your patience and assistance! So far I've been able to find a solution to every hiccup I've encountered in my learning process except this one. What I'm trying to do is combine all the row values of a certain column into a single column/row when three different ID variables are the same.

Here's the sample data:

PersonProviderDateCode
JohnABC1/1/201512345
JohnABC1/1/201534567
JohnABC1/1/201545678
JohnXYZ1/1/201512345
JaneRST1/1/201512345
JaneRST2/1/201545678
JaneRST2/1/201534567

And here's what I'm after:

PersonProviderDateCombo
JohnABC1/1/201512345 34567 45678
JohnXYZ1/1/201512345
JaneRST1/1/201512345
JaneRST2/1/201545678 34567

So, essentially, if the Person/Provider/Date are the same in various rows, then I want to combine all the Codes of those rows into one column so there's only one row for each unique Person/Provider/Date possibility. My original data table has many more possibilities than I've listed above, but I hope this provides a general idea on what I'm trying to accomplish.

Any idea on how I can do this via PROC SQL? Or, if it's not possible via PROC SQL (which I've encountered on some other hiccups), any idea on the solution via DATA STEP?

Many thanks in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

I would suggest not putting them into a single variable. One issue with a single variable: how big should it be? Do you know the maximum number of codes that will ever be encountered? If not, you might declare the size for the variable combo to be too small to hold the result.

SQL would not be my first choice to attempt such an operation as it really doesn't have a native concept of "previous row" or "next row".

Proc transpose will create a single record for the repeated values with a new variable for each level and doesn't need to know how many you have. A brief example for your data:

proc sort data=have; by person provider date;run;

proc transpose data=have out=want;

by person provider date;

var code;

run;

If you really think you need this one variable:

options missing= " "; /* if your codes are numeric the combo below will have the missing character in the result*/

data finalwant;

     set want;

     array c col: ;

     combo = catx(" ", of c

  • );
  • run;

    options missing =".";

    What will you actually be doing with the resulting data? Many analysis operations work much better in your current form.

    View solution in original post

    2 REPLIES 2
    ballardw
    Super User

    I would suggest not putting them into a single variable. One issue with a single variable: how big should it be? Do you know the maximum number of codes that will ever be encountered? If not, you might declare the size for the variable combo to be too small to hold the result.

    SQL would not be my first choice to attempt such an operation as it really doesn't have a native concept of "previous row" or "next row".

    Proc transpose will create a single record for the repeated values with a new variable for each level and doesn't need to know how many you have. A brief example for your data:

    proc sort data=have; by person provider date;run;

    proc transpose data=have out=want;

    by person provider date;

    var code;

    run;

    If you really think you need this one variable:

    options missing= " "; /* if your codes are numeric the combo below will have the missing character in the result*/

    data finalwant;

         set want;

         array c col: ;

         combo = catx(" ", of c

  • );
  • run;

    options missing =".";

    What will you actually be doing with the resulting data? Many analysis operations work much better in your current form.

    LAKings
    Calcite | Level 5

    Hi there,

    I don't know the maximum number of codes that will ever be encountered, so that possibility is dynamic.

    Is there a way to declare a large size for the variable combo to try and reduce the possibility of the variable combo of exceeding the variable size?

    The resulting data that I'm trying to get at with the "Combo" variable won't be utilized in SAS for any further analysis operations, so I'm not worried about creating a variable I can't utilize in SAS after I create it. The combo variable is part of an Excel report for my customers and they want all of the codes in one column when the person/provider/date is the same.

    I used your provide steps and was able to get the final output I was after, thank you!

    SAS Innovate 2025: Register Now

    Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
    Sign up by Dec. 31 to get the 2024 rate of just $495.
    Register now!

    What is Bayesian Analysis?

    Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

    Find more tutorials on the SAS Users YouTube channel.

    SAS Training: Just a Click Away

     Ready to level-up your skills? Choose your own adventure.

    Browse our catalog!

    Discussion stats
    • 2 replies
    • 6946 views
    • 0 likes
    • 2 in conversation