I've attached the source and target and below are the rules to derive the target,
1) Based on ID order the DIAG Codes by line number
2) Populate only Distinct DI codes in above order.
3) Assign the LI_SEQ based on the order of distinct Di Codes.
I don't want to use proc transpose and I need help in writing SQL or it would nice if you point to the document which can help here.
Ignore the data inside table as I shown only few data in source and target to help understand the logic
Source Table:
Target:
What is a Diag code? I don't see anything by that name.
What "above order"? Especially if it relates to character values like A542. Because "order" to you may not be apparent to anything else. With normal rules for character values "A9" is after "A542". So a very clear definition of what "order" is to be applied is critical. So rethink and re-describe the rules so we can have a change of providing an answer.
Also, is that "target" the exact solution for the shown source? It feels like that target may be missing more than a few rows. If you do not provide a complete target then you may have to provide more rules about why something gets left out of the target.
I don't see any rules about use of any of the pointer variables or how such a value ends up in the target set.
And there is nothing at all about how HDR_POSITION is to be created or assigned.
I believe you have been shown more than once how to provide data as a data step. Please provide data in such form so we don't have to guess about actual variable names or types.
I don't think you can really do a transpose in SQL. You probably need to abandon that requirement.
Why SQL? I would assume that would just make the problem much more complicated than it appears to be.
Photographs of data are impossible to code with. And photographs of dataset that use different variable names than the ones used in the problem description make it even harder.
It looks to me like you have two groups of variables with numeric suffixes. Is it safe to assume they are related to each other so that the first entry in the first group is paired with the first entry in the second group?
I do not see any LINE NUMBER variable in your dataset. I thought perhaps you meant LI_SEQ, but then you say you want to create that variable. But if you want to create it then why is it already in the SOURCE dataset?
You do have a DATE variable that appears to be named PAID_DATE. Do you want to order the data by PAID_DATE? Not sure if that makes much sense unless your analysis is about the economic impact of delaying payments?
Do you mean to say that even though there is no variable by which to order the data you want to process it in the order it appears? That is not something the SQL can do. But it is trivial in SAS code.
SQL is not very suitable for transposing data. Using SAS Proc Transpose or a data step is what you should be using.
The only reason for a SQL would be if processing needs to get pushed to a database. If so then you would need to consult the documentation for the database and use explicit database specific SQL if and as available (like the PIVOT command in Oracle).
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.