I am working with a transaction type data set where there are multiple observations for individual children. Each child has a unique "child_id" and there are multiple observations, for which the child has an associated value for the variable FUNDING. There may or may not be repeats for the values of FUNDING. My main goal is to eliminate consecutive, repeating FUNDING values and then transpose the data set so that I have a list of "paths." There may be instances in which a funding source such as C appears as the second and last funding source. How can I capture this, but also delete the consecutive, repeating instances of the funding source C?
There are around 80,000 children. There an indefinite number of TREATMENT paths based on around 30 TREATMENT types.
The following is similar to what the data currently looks like:
Child Funding
1 A
1 C
1 C
1 D
1 C
2 B
2 B
2 A
2 B
The following is just a brief example of what I would like the data to look like. The TREATMENT should only increment when the treatment type has changed.
Child Funding_1 Funding_2 Funding_3 Funding_4 Funding_n
1 A C D C
2 B A B
Thank you for your help.
Something like below should do the job.
data have;
input child $ treatment $;
datalines;
1 A
1 C
1 C
1 D
1 C
2 B
2 B
2 A
2 B
;
run;
data prep;
set have;
by child treatment notsorted;
if first.treatment;
run;
proc transpose data=prep out=want prefix=Treatment_;
by child notsorted;
var treatment;
run;
Something like below should do the job.
data have;
input child $ treatment $;
datalines;
1 A
1 C
1 C
1 D
1 C
2 B
2 B
2 A
2 B
;
run;
data prep;
set have;
by child treatment notsorted;
if first.treatment;
run;
proc transpose data=prep out=want prefix=Treatment_;
by child notsorted;
var treatment;
run;
Patrick,
I edited my initial post to highlight my dilemma. A funding source such as 'C' may appear and then reappear for a particular child. I believe your procedure, like my previous attempts does not account for this.
Best,
Phil
Hi Phil
The code I've posted creates your desired result. What do you believe I'm missing?
It is the "notsorted" in the by statement which allows to transform stuff like C C A C to C A C - so the code I've posted will only get rid of consecutive duplicates and also won't change the sequence of the original data. Isn't this what you've asked for?
Thanks
Patrick
You're right. I apologize for that. Thank you for your help.
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.