BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AJ_Brien
Quartz | Level 8

Hello,

 

I have a dataset which as char and numeric values. I need to sort the data by all of those values and then assign the top most or the last most with a flag.

 

Sample data:

acc(char)timemoneyid(char)
106798852702.777101000002817
473393517704.067101000002601
903803511164.957101002302817
11274058162.467101000002026
2232453520578.687101002302817
643146853418.597101000002601

 

I need to sort this data first by id, then by descending time, then by descending money and then by descending acc. Then I need to assign the top and bottom acc for each id group by a flag.

 

This is my code, and it's not sorting correctly. Some of these variables are char and some are num as you can see. Appreciate any help I can get

PROC SORT DATA = abc SORTSEQ =LINGUISTIC (NUMERIC_COLLATION=ON);
BY id descending time descending money DESCENDING acc;
RUN;

proc sql;
alter table abc
add flag char(4); 
update abc
set flag = 'YES' where id=first.id;/*I'm pretty sure this is incorrect*/
quit;

Expected output: As you can see we first sort by id, for each id we then sort by desc time, then for each time we sort by desc money and for each money we sort by desc acc value. And the one that is the highest for each id group, we mark them with a flag

acc(char)timemoneyid(char)flag
11274058162.467101000002026YES
473393517704.067101000002601YES
643146853418.597101000002601 
106798852702.777101000002817YES
2232453520578.687101002302817YES
903803511164.957101002302817 

 

1 ACCEPTED SOLUTION

Accepted Solutions
andreas_lds
Jade | Level 19

Instead of proc sql:

data abc_flagged;
  set abc;
  by id;
  
  flag = (first.id or last.id);
run;

 

EDIT: This "solution" was posted before @AJ_Brien  added the expected results to the original post. So the data step is somewhat useless.

View solution in original post

8 REPLIES 8
PeterClemmensen
Tourmaline | Level 20

So what does your desired result look like given the posted example data?

AJ_Brien
Quartz | Level 8
thank you for your reply, I added the expected data in the post.
PeterClemmensen
Tourmaline | Level 20

Ok. Then do

 

data abc;
input acc $ time money id :$20.;
datalines;
1067988 5 2702.77 7101000002817
47339 35 17704.06 7101000002601
90380 35 11164.95 7101002302817
112740 5 8162.46 7101000002026
223245 35 20578.68 7101002302817
6431468 5 3418.59 7101000002601
;

proc sort data = abc sortseq=linguistic (numeric_collation=on);
    by id descending time descending money descending acc;
run;

data want;
    set abc;
    by id;
    if first.id then flag='YES';
run;
AJ_Brien
Quartz | Level 8
that's exactly how I'm trying to sort data, but it doesn't sort anything except my id.
PeterClemmensen
Tourmaline | Level 20

It sure does. However with 6 obs there is not much to sort. But it does sort your data by id, by descending time and so on

andreas_lds
Jade | Level 19

Instead of proc sql:

data abc_flagged;
  set abc;
  by id;
  
  flag = (first.id or last.id);
run;

 

EDIT: This "solution" was posted before @AJ_Brien  added the expected results to the original post. So the data step is somewhat useless.

AJ_Brien
Quartz | Level 8
Thank you for your reply.
I need to sort by char and numeric values, how do I accomodate that?
Also my variable flag has a value 'YES' that needs to be assigned. How do I assign that once I identify which is first?
hashman
Ammonite | Level 13

@AJ_Brien:

You're talking about numeric and character variables. However, in your sample output ACC, TIME, and MONEY are all left-justified. Whatever SAS interface you're using to view the data shown here, it's a sure sign that these variables are stored as the character type. Thus in your sort, for example, 47339>1067988, 5>35, and so forth. Yet it surely appears as though you're interested in comparing them using the variables according to the numeric values represented by their digits. Thus, either convert ACC, TIME, and MONEY to the numeric data type using the INPUT function or right-justify them using the RIGHT function before the sorting.

 

Alternatively, revise what you know about proc SORT incredibly numerous options. I seem to recall that there's an option:

 

SORTSEQ=LINGUISTIC(NUMERIC_COLLATION=ON)

 

that forces a character string to be ordered by the numeric values of integers present in there instead of their character value. An excellent reference to all these nifty proc SORT features are excellently aggregated in the excellent SGF 2018 paper by Derek Morgan:

 

https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2018/2773-2018.pdf

 

Kind regards

Paul D. 

 

    

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1281 views
  • 0 likes
  • 4 in conversation