BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
LAKings
Calcite | Level 5

Hi All,

New to SAS EG and SQL, so I appreciate your patience and assistance! So far I've been able to find a solution to every hiccup I've encountered in my learning process except this one. What I'm trying to do is combine all the row values of a certain column into a single column/row when three different ID variables are the same.

Here's the sample data:

PersonProviderDateCode
JohnABC1/1/201512345
JohnABC1/1/201534567
JohnABC1/1/201545678
JohnXYZ1/1/201512345
JaneRST1/1/201512345
JaneRST2/1/201545678
JaneRST2/1/201534567

And here's what I'm after:

PersonProviderDateCombo
JohnABC1/1/201512345 34567 45678
JohnXYZ1/1/201512345
JaneRST1/1/201512345
JaneRST2/1/201545678 34567

So, essentially, if the Person/Provider/Date are the same in various rows, then I want to combine all the Codes of those rows into one column so there's only one row for each unique Person/Provider/Date possibility. My original data table has many more possibilities than I've listed above, but I hope this provides a general idea on what I'm trying to accomplish.

Any idea on how I can do this via PROC SQL? Or, if it's not possible via PROC SQL (which I've encountered on some other hiccups), any idea on the solution via DATA STEP?

Many thanks in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

I would suggest not putting them into a single variable. One issue with a single variable: how big should it be? Do you know the maximum number of codes that will ever be encountered? If not, you might declare the size for the variable combo to be too small to hold the result.

SQL would not be my first choice to attempt such an operation as it really doesn't have a native concept of "previous row" or "next row".

Proc transpose will create a single record for the repeated values with a new variable for each level and doesn't need to know how many you have. A brief example for your data:

proc sort data=have; by person provider date;run;

proc transpose data=have out=want;

by person provider date;

var code;

run;

If you really think you need this one variable:

options missing= " "; /* if your codes are numeric the combo below will have the missing character in the result*/

data finalwant;

     set want;

     array c col: ;

     combo = catx(" ", of c

  • );
  • run;

    options missing =".";

    What will you actually be doing with the resulting data? Many analysis operations work much better in your current form.

    View solution in original post

    2 REPLIES 2
    ballardw
    Super User

    I would suggest not putting them into a single variable. One issue with a single variable: how big should it be? Do you know the maximum number of codes that will ever be encountered? If not, you might declare the size for the variable combo to be too small to hold the result.

    SQL would not be my first choice to attempt such an operation as it really doesn't have a native concept of "previous row" or "next row".

    Proc transpose will create a single record for the repeated values with a new variable for each level and doesn't need to know how many you have. A brief example for your data:

    proc sort data=have; by person provider date;run;

    proc transpose data=have out=want;

    by person provider date;

    var code;

    run;

    If you really think you need this one variable:

    options missing= " "; /* if your codes are numeric the combo below will have the missing character in the result*/

    data finalwant;

         set want;

         array c col: ;

         combo = catx(" ", of c

  • );
  • run;

    options missing =".";

    What will you actually be doing with the resulting data? Many analysis operations work much better in your current form.

    LAKings
    Calcite | Level 5

    Hi there,

    I don't know the maximum number of codes that will ever be encountered, so that possibility is dynamic.

    Is there a way to declare a large size for the variable combo to try and reduce the possibility of the variable combo of exceeding the variable size?

    The resulting data that I'm trying to get at with the "Combo" variable won't be utilized in SAS for any further analysis operations, so I'm not worried about creating a variable I can't utilize in SAS after I create it. The combo variable is part of an Excel report for my customers and they want all of the codes in one column when the person/provider/date is the same.

    I used your provide steps and was able to get the final output I was after, thank you!

    sas-innovate-2024.png

    Don't miss out on SAS Innovate - Register now for the FREE Livestream!

    Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

     

    Register now!

    What is Bayesian Analysis?

    Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

    Find more tutorials on the SAS Users YouTube channel.

    Click image to register for webinarClick image to register for webinar

    Classroom Training Available!

    Select SAS Training centers are offering in-person courses. View upcoming courses for:

    View all other training opportunities.

    Discussion stats
    • 2 replies
    • 6104 views
    • 0 likes
    • 2 in conversation