kjohnsonm
Lapis Lazuli | Level 10

Hello all,

I have two lists. One has 6100 files, each a fully qualified Windows dir\path\filename.ext containing random punctuation such as:     &,()'

The other, larger list has 81K obs. Every one of the 6100 records has a match in the larger list. However, the larger list no longer contains any of those punctuation marks; it has a placeholder underscore "_" wherever a punctuation mark was taken out. For example:

6100 list (original)                81K list (punctuation replaced)

D:\path\file&name.ext               D:\path\file_name.ext

D:\path\path's\file&name.ext        D:\path\path_s\file_name.ext

E:\path (my)\pat&h\new, file.SAV    E:\path _my_\pat_h\new_ file.SAV

etc.

Does anyone know how to crosswalk ("left join") these two lists with wildcards, and might have the time to toss me a slow-pitch softball?

 

/*for a clear list of what has been compressed out*/
compress(Path_File, "',&()", "")

These 6100 files are the last of 240K files I need to research for metadata; however, I did not have the skill to make SAS read these files with the problematic punctuation still in the file paths/names. PS: the path and file names can be up to 260 characters long. All other data is derived from the path, file name and data type, so no other fields were given as examples. TIA. -KJ

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Use function translate()

 

a inner join b on a.newPath = translate(b.oldPath,"____", "&,()") 

PG
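
For anyone who wants to see what translate() does before using it in a join, here is a minimal sketch run against the three sample paths from the question. The data set name work.demo and the variable names are hypothetical, and the apostrophe from the question's punctuation list has been added to the "from" characters (with a fifth underscore to match).

```sas
/* Hypothetical demo data: the three sample paths from the question. */
data work.demo;
    length oldPath newPath $260;   /* paths can be up to 260 characters */
    infile datalines truncover;
    input oldPath $char260.;
    /* translate(source, to, from): each character found in the "from"
       list is replaced by the character at the same position in "to". */
    newPath = translate(oldPath, "_____", "&',()");
    datalines;
D:\path\file&name.ext
D:\path\path's\file&name.ext
E:\path (my)\pat&h\new, file.SAV
;
run;

proc print data=work.demo noobs;
run;
```

Because translate() swaps characters one for one, the string keeps its length, which is what the underscore placeholders in the larger list need. compress() would instead delete the characters and shift everything after them to the left.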


2 REPLIES 2
kjohnsonm
Lapis Lazuli | Level 10
proc sql;
create table testing as
Select a.Path_File1 as keya,
       b.Path_File1 as keyb
from small_data_set a
inner join large_data_set b
/* translate() maps each of & ' , ( ) to an underscore before the comparison */
on a.Path_File1 = translate(b.Path_File1,"_____", "&',()")
;
quit;

The data sets are exactly 4 files off; I hand-checked those earlier and removed their obs because of side issues. Thanks for the help.
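
To list the rows that are left without a partner, a left join with a filter on the missing side is one option. This is only a sketch: it reuses the table and column names from the query above, the output table name unmatched is hypothetical, and translate() is applied to both sides on the assumption that this makes the comparison independent of which table still carries the punctuation.

```sas
proc sql;
    /* Keep only the small-list rows that found no partner in the large list. */
    create table unmatched as
    select a.Path_File1
    from small_data_set a
    left join large_data_set b
        on translate(a.Path_File1, "_____", "&',()")
         = translate(b.Path_File1, "_____", "&',()")
    where b.Path_File1 is missing;
quit;
```

An empty unmatched table would mean every path in the small list has a counterpart once the punctuation is normalized.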
