kjohnsonm
Lapis Lazuli | Level 10

Hello all,

I have two lists. One has 6,100 files, each a fully qualified Windows dir\path\filename.ext, with random punctuation such as:     &,()'

The other, larger list has 81K obs. Every one of the 6,100 records has a match in the larger list; however, the larger list no longer contains any of those punctuation marks. It has a placeholder underscore "_" wherever punctuation was taken out. For example:

6,100-record list                        81K-record list

D:\path\file&name.ext                    D:\path\file_name.ext
D:\path\path's\file&name.ext             D:\path\path_s\file_name.ext
E:\path (my)\pat&h\new, file.SAV         E:\path _my_\pat_h\new_ file.SAV

etc.

Does anyone know how to crosswalk ("left join") these two lists with wildcards, and might have the time to toss me a slow-pitch softball?

 

/*for a clear list of what has been compressed out*/
compress(Path_File, "',&()", "")

These 6,100 files are the last of 240K files I need to research for metadata; however, I did not have the skill to make SAS read these files with the problematic punctuation still in the file paths/names.  PS: the path file names can be up to 260 characters long. All other data is derived from the path, file name, and data type, so no other fields were given as examples.   TIA. -KJ

1 ACCEPTED SOLUTION
PGStats
Opal | Level 21

Use the translate() function:

 

a inner join b on a.newPath = translate(b.oldPath,"____", "&,()") 

PG
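
To see what translate() does to the example paths from the question, here is a minimal sketch. The data set names small_demo and small_demo_clean and the variable Clean_Path are hypothetical, made up for illustration; the character list includes the apostrophe, which the question also mentions.

```sas
/* Hypothetical sample data: the three example paths from the question. */
data small_demo;
    infile datalines truncover;
    input Path_File $char260.;
    datalines;
D:\path\file&name.ext
D:\path\path's\file&name.ext
E:\path (my)\pat&h\new, file.SAV
;
run;

data small_demo_clean;
    set small_demo;
    length Clean_Path $260;
    /* translate(source, to, from): each character found in "from" is
       replaced by the character at the same position in "to".
       Here & ' , ( ) all become an underscore. */
    Clean_Path = translate(Path_File, "_____", "&',()");
run;

proc print data=small_demo_clean;
run;
```

Clean_Path should then hold the underscore versions shown in the question, for example D:\path\path_s\file_name.ext for the second row. Because translate() swaps one character for one character, the string length does not change, which is what makes an equality join possible.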

kjohnsonm
Lapis Lazuli | Level 10
proc sql;
create table testing as
Select a.Path_File1 as keya,
	   b.Path_File1 as keyb
from small_data_set a
inner join large_data_set b
on a.Path_File1 = translate(b.Path_File1,"_____", "&',()")
;
quit;

The data sets are exactly 4 files off. I hand-checked those earlier and removed their obs because of side issues. Thanks for the help.
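
For anyone who needs to list the records that do not match (such as the 4 files mentioned above), a left join is one option. This is only a sketch under assumptions: punct_list, clean_list, and unmatched are hypothetical data set names, the sample rows are invented, and translate() is applied to the side that still carries the punctuation.

```sas
/* Hypothetical data: punct_list keeps the original punctuation,
   clean_list has underscores in place of & ' , ( ) */
data punct_list;
    infile datalines truncover;
    input Path_File $char260.;
    datalines;
D:\path\file&name.ext
D:\path\path's\file&name.ext
D:\path\no (match).ext
;
run;

data clean_list;
    infile datalines truncover;
    input Path_File $char260.;
    datalines;
D:\path\file_name.ext
D:\path\path_s\file_name.ext
D:\path\other.ext
;
run;

proc sql;
    /* Keep every row of punct_list; rows with no partner in clean_list
       come back with a missing b.Path_File and are kept by the WHERE. */
    create table unmatched as
    select a.Path_File
    from punct_list a
    left join clean_list b
        on translate(a.Path_File, "_____", "&',()") = b.Path_File
    where b.Path_File is missing;
quit;
```

With this sample data, unmatched should contain only D:\path\no (match).ext. Note that the comparison is case-sensitive, and that an underscore already present in an original path compares the same as a replaced punctuation mark, so two different originals could in principle map to the same cleaned path.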
