BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
melhaf
Fluorite | Level 6

Hi, 
I have a dataset that it something like this (can't post the original one because of confidential) but I'll do my best to explain it.  The dataset is not grouped och order by nothing, I just have ordered it here because it would be easier for you to read)

The problem is that I have a dataset who looks like this: 

 

 

BOOKING(char)           AMOUNT(num)     DAYS(num)
1.2.3     Andersen      10              30 
1.2.4     Andersen      12              28 
1.2.4      Baker        whatever        whatever
1.2.5      Brown        whatever        whatever      
2.1.1      Clark         8               20
2.1.2      Clark          6               18


You see that we have duplicates of Andersen and Clark. Because we have four different families. How do I combine them? (recall that they're CHAR). 

I have tried:

1. select distinct  * from dataset GROUP BY 1 . 
- then I get duplicates ....

 

2. sum(BOOKING)
- not working beacuse it is a char


3. proc sort data=dataset out=dataset2 nodupkey; by BOOKING; run;
- still sees the duplicates and I guess it is because is is different because of the (1.2.3) and (1.2.4) thing that comes before the last name 

 

4. sum(input(BOOKING, best12.))as booking 
- I get : 
NOTE: The query requires remerging summary statistics back with the original data.
NOTE: Invalid string.
NOTE: Invalid argument to function INPUT. Missing values may be generated.

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Not sure that I fully understood what you're asking for. Is below returning what you want?

data have;
 input BOOKING $14. @15 AMOUNT DAYS;
 datalines;
1.2.3 Andersen 10 30 
1.2.4 Andersen 12 28 
1.2.4 Baker    1 1
1.2.5 Brown    1 1 
2.1.1 Clark    8 20
2.1.2 Clark    6 18
;

proc sql;
  select 
     substr(booking,find(booking,' ')) as booking_name
    ,sum(amount) as sum_amount
    ,sum(days) as sum_days
  from have
  group by booking_name
  ;
quit;

Patrick_0-1707256491751.png

 

View solution in original post

10 REPLIES 10
melhaf
Fluorite | Level 6

I don't know how to edit my topic. But I want to summerize Andersen and Clark into one 

Reeza
Super User
Please show what you would expect as output.
SASKiwi
PROC Star

What do you want the single rows for Andersen and Clark to contain? Is there a rule to decide which of the two rows in these examples is chosen?

melhaf
Fluorite | Level 6
Hi, I want to summarize them, so Andersen have together 10+12 AMOUNT,
30+28 DAYS and the same for Clark
Reeza
Super User
What happens with booking though? Since you seem to be trying to sum them. Showing the example output would be helpful.
melhaf
Fluorite | Level 6
It dosen't happends anything with the code I posted. Or, I posted what the log said.
Reeza
Super User
And is there two different variables, the booking and the name or are they one variable? If so, you very much want to separate those first.
melhaf
Fluorite | Level 6

They are all the same the column is "1.2.3 Andersen" and then the other family is "1.2.4 Andersen" and I want to summerize families that have the name Andersen (and Clark in this example as well)

Patrick
Opal | Level 21

Not sure that I fully understood what you're asking for. Is below returning what you want?

data have;
 input BOOKING $14. @15 AMOUNT DAYS;
 datalines;
1.2.3 Andersen 10 30 
1.2.4 Andersen 12 28 
1.2.4 Baker    1 1
1.2.5 Brown    1 1 
2.1.1 Clark    8 20
2.1.2 Clark    6 18
;

proc sql;
  select 
     substr(booking,find(booking,' ')) as booking_name
    ,sum(amount) as sum_amount
    ,sum(days) as sum_days
  from have
  group by booking_name
  ;
quit;

Patrick_0-1707256491751.png

 

melhaf
Fluorite | Level 6

Thank you so much !

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1265 views
  • 2 likes
  • 4 in conversation