BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Havi
Obsidian | Level 7

Hi Guys

 

I have the dataset 1 below:

I receive an excel sheet of about 60+ random numbers. How do I read in the sheet and link the unique numbers from EXCEL1 to DATASET 1 to get the account ID i.e OUTPUT 1?

 

DATASET 1      
RANGE MIN MAX ACCOUNT_ID
High 1 10 A0001
High 11 20 A0002
Medium 21 29 A0003
Medium 30 69 A0004
Medium 70 99 A0005
Medium 100 299 A0006
Medium 300 499 A0007
Medium 500 1000 A0008
Low 1001 1499 A0009
Low 1500 2000 A0010
Very Low 2001 6000 A0011
Very Low 6001 9000 A0012

 

EXCEL1      
NUMBERS      
10      
2005      
8000      

 

OUTPUT1      
NUMBER ACCOUNT_ID    
10 A0001    
2005 A0011    
8000 A0012    

 

Many thanks

1 ACCEPTED SOLUTION

Accepted Solutions
MikeZdeb
Rhodochrosite | Level 12

Hi, this works with the data you posted (assumes you'll have a match for all obs in data set SEARCH) ...

 


data x;
input range :$6. min max account :$5. @@;
datalines;
High 1 10 A0001     High 11 20 A0002     Medium 21 29 A0003      Medium 30 69 A0004
Medium 70 99 A0005  Medium 100 299 A0006 Medium 300 499 A0007    Medium 500 1000 A0008
Low 1001 1499 A0009 Low 1500 2000 A0010  VeryLow 2001 6000 A0011 VeryLow 6001 9000 A0012
;

 

data search;
input number @@;
datalines;
10 2005 8000
;

 

data y (keep=number account);
set search;
do until (number ge min and number le max);
   set x;

end;

run;

 

data set Y ...

Obs    number    account

 1        10      A0001
 2      2005      A0011
 3      8000      A0012

 

View solution in original post

9 REPLIES 9
LinusH
Tourmaline | Level 20

 Use SQL join with a between - and criteria.

Data never sleeps
Astounding
PROC Star

Presumably, you know how to read in the sheet.  If not, that's a separate question.

 

I would create a format out of the min/max ranges:

 

data fmt;

fmtname='$acct';

set dataset1 end=done;

length label $ 7;

rename min=start max=end;

label=account;

output;

if done;

hlo='O';

label='Unknown';

output;

run;

 

proc format cntlin=fmt;

run;

 

Then it becomes easy to look up the values:

 

data want;

set excel1;

account_id = put(number, $acct.);

run;

 

It's untested, but should work.  Good luck.

MadhuKorni
Quartz | Level 8


proc sql;
create table Want as
select A.Number, B.Account_Id from Numbers1 A left join Dataset1 B
on B.Min le A.Number le B.Max;
quit;

Havi
Obsidian | Level 7

Thanks MadhuKorni

 

Would this be the most optimal way? The reason I am asking is that although the numbers table is small with 60 records the left join to the dataset1 table is around 40 million records?

MikeZdeb
Rhodochrosite | Level 12

Hi, try 40 million observations (<2 seconds) ..

 

data x;
do account=1 to 40e6;
  min = 10*account;
  max = min + 9;
  output;
end;

run;

 

data search;
input number :comma. @@;
datalines;
10 2005 99,999,999 300,000,000 999,999,999
;

 

* won't find last number ... too big;

data y (keep=number account);
set search;
do until (number ge min and number le max);
   set x;
end;

 

273  data y (keep=number account);
274  set search;
275  do until (number ge min and number le max);
276     set x;
277  end;
278
279  run;

NOTE: There were 5 observations read from the data set WORK.SEARCH.
NOTE: There were 40000000 observations read from the data set WORK.X.
NOTE: The data set WORK.Y has 4 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           1.38 seconds
      cpu time            1.34 seconds

MikeZdeb
Rhodochrosite | Level 12

Hi, this works with the data you posted (assumes you'll have a match for all obs in data set SEARCH) ...

 


data x;
input range :$6. min max account :$5. @@;
datalines;
High 1 10 A0001     High 11 20 A0002     Medium 21 29 A0003      Medium 30 69 A0004
Medium 70 99 A0005  Medium 100 299 A0006 Medium 300 499 A0007    Medium 500 1000 A0008
Low 1001 1499 A0009 Low 1500 2000 A0010  VeryLow 2001 6000 A0011 VeryLow 6001 9000 A0012
;

 

data search;
input number @@;
datalines;
10 2005 8000
;

 

data y (keep=number account);
set search;
do until (number ge min and number le max);
   set x;

end;

run;

 

data set Y ...

Obs    number    account

 1        10      A0001
 2      2005      A0011
 3      8000      A0012

 

Astounding
PROC Star

Mike,

 

Very nice approach.  It can be expanded to cover additional situations.  Perhaps, for example, two NUMBER values belong to the same account.  You might go with:

 

data want;

set search;

if (min <= number <= max) then return;

do until (min <= number <= max);

   set x;

end;

run;

 

And there is always the more complex possibility that you mentioned:  What if one NUMBER does not have a match (a decimal fraction perhaps?) but the next NUMBER does have a match?  WIthout sketching out all the details, it would probably be easiest to create two output data sets (one holding matches, one holding mismatches).

MikeZdeb
Rhodochrosite | Level 12

Hi ...

 

re duplicates in the SEARCH data set (if SEARCH in order) ...

 

data y (keep=number account);
set search;
if number ne lag(number);
do until (number ge min and number le max);
   set x;
end;
run; 

 

re "more complex possibility" ... life is full of them

Havi
Obsidian | Level 7
Thanks a million Mike. Returns back the results incredibly fast 🙂

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1406 views
  • 2 likes
  • 5 in conversation