/****************************************************************************
 *                  S A S   S A M P L E   L I B R A R Y
 *
 *      NAME: IMPORTPW
 *     TITLE: Metadata User Import From etc/passwd file
 *   PRODUCT: SAS
 *   VERSION: 9.1
 *    SYSTEM: ALL
 *      DATE: 16DEC2003
 *      DESC: Example code to extract user information from
 *            Unix etc/passwd and group files and load it into
 *            the Metadata Server.
 *      KEYS: METADATA USER PERSON IDENTITYGROUP GROUP LOGIN
 *   UPDATED: Version 9.2 19Jul2006
 *
 ****************************************************************************/

/****************************************************************************
 ****************************************************************************
 **                                                                        **
 ** The following SAS Program is divided into 5 discrete sections in       **
 ** order to help simplify its overall organization. Each SECTION is       **
 ** marked by a comment box like this one, with "double bound" asterisk.   **
 ** Here is a summary of the sections:                                     **
 **                                                                        **
 ** SECTION 1: SAS Option, Macro Variable, and filename Definitions        **
 **                                                                        **
 ** SECTION 2: %mduimpc defines canonical datasets and variable lists.     **
 **                                                                        **
 ** SECTION 3: Extract User Information from an etc/passwd file, normalize **
 **            data, and create corresponding canonical datasets.          **
 **                                                                        **
 ** SECTION 4: Extract Group Information from an etc/group file, normalize **
 **            data, and create corresponding canonical datasets.          **
 **                                                                        **
 ** SECTION 5: %mduimpl reads the canonical datasets, generates            **
 **            XML representing metadata objects, and invokes PROC         **
 **            METADATA to load the metadata.                              **
 **                                                                        **
 ** In order to run this program, you must verify and change the filerefs  **
 ** that specify the location of the passwd and group files where user     **
 ** information is read from and the SAS Metadata Server that receives     **
 ** this information in the form of XML representing metadata objects.     **
 ** This information is specified in SECTION 1 below.                      **
 **                                                                        **
 ** CAUTION: before running this program, please read the SAS code below,  **
 ** SECTION by SECTION, to gain an understanding of its overall flow. It   **
 ** is especially important to understand which user accounts will be      **
 ** imported and which will be dropped as the Person information is        **
 ** extracted and constructed in SECTION 3. The amount of modification     **
 ** required could vary from site to site depending on the form of the     **
 ** passwd file and its contents. Also note that the same principles       **
 ** apply for the code used to retrieve groups in SECTION 4.               **
 **                                                                        **
 **                                                                        **
 ** Format of the etc/passwd file                                          **
 **                                                                        **
 ** This example was developed with an etc/passwd file that followed the   **
 ** normal conventions, but contained additional user info in the          **
 ** Gcos field (the 5th ':' delimited field) that is used to build the     **
 ** user info. The standard form of the file is:                           **
 **                                                                        **
 ** user name : password : uid (numeric) : primary group id : <continued>  **
 ** gcos-field : home-directory : login-shell                              **
 **                                                                        **
 ** Within the gcos-field there are other fields that are comma            **
 ** delimited. Our gcos-field has the following format:                    **
 **                                                                        **
 ** Person Name , Office , phone extension , misc , employeeid             **
 **                                                                        **
 ** Putting the standard passwd file together with our specific            **
 ** gcos-field results in passwd entries that look like:                   **
 **                                                                        **
 ** user name:password:uid (numeric):primary group id:<continued>          **
 ** Person Name,Office,phone extension,misc,employeeid:<continued>         **
 ** Home-directory:login-shell                                             **
 **                                                                        **
 ** Colons and Commas delimit the fields in the entries above. If your     **
 ** etc/passwd file does not match this layout, you will have to modify    **
 ** the code below by removing the extraction of certain information       **
 ** (phone numbers, addresses) or extracting it from other fields.         **
 ** Note, in our example the info in the gcos-field was comma delimited;   **
 ** there is no specification as to what delimiter must be used. The       **
 ** only specification is that a colon cannot be used because it is used   **
 ** to delimit the main fields of the passwd file.                         **
 **                                                                        **
 **                                                                        **
 ** Format of the etc/group file                                           **
 **                                                                        **
 ** There is nothing special about the format of the group file used to    **
 ** build this example. There are no special gcos-fields within the        **
 ** group file so it's not likely that this section of code would need     **
 ** much modification unless you simply didn't want to extract the groups. **
 ** The format of the standard etc/group file is:                          **
 **                                                                        **
 ** groupname : password : gid (numeric) : comma delimited list of users   **
 **                                                                        **
 ** The fields of the group file are colon delimited and the members       **
 ** of the group are specified as a comma delimited list in the last       **
 ** colon delimited field. Usernames are used in the list rather than      **
 ** the numeric uids. Groups are not allowed to be members of other        **
 ** groups.                                                                **
 ****************************************************************************
 ****************************************************************************/

/****************************************************************************
 ****************************************************************************
 **                                                                        **
 ** SECTION 1: SAS Option, Macro Variable, and filename Definitions        **
 **                                                                        **
 ****************************************************************************
 ****************************************************************************/

/****************************************************************************/
/* Use the Meta* options to specify the metadata server connection options  */
/* where the user information will be loaded.                               */
/****************************************************************************/

options metaserver=sasmeta.corp.com     /* network name/address of the          */
                                        /* metadata server.                     */
        metaport=8562                   /* Port Metadata Server is listening on.*/
        metauser="sasadm@saspw"         /* Domain Qualified Userid for          */
                                        /* connection to metadata server.       */
        metapass="xxxxxxxxxxxxxxxx"     /* Password for userid above.           */
        metaprotocol=bridge             /* Protocol for Metadata Server.        */
        metarepository=Foundation;      /* Default location of user information */
                                        /* is in the foundation repository.     */

/****************************************************************************/
/* Define the tag that will be included in the Context attribute of         */
/* ExternalIdentity objects associated with the information loaded by this  */
/* application. This tag will make it easier to determine where information */
/* originated from when synchronization tools become available.             */
/* Note, the value of this macro should not be quoted.                      */
/****************************************************************************/

%let PWExtIDTag = Passwd File Import;

/****************************************************************************/
/* This process will extract UNIX password file information into datasets   */
/* in the libref represented by the "extractlibref" macro variable. The     */
/* extracted information will be cleansed and normalized in this library    */
/* and then transferred into the canonical form datasets defined in the     */
/* %mduimpc macro.                                                          */
/*                                                                          */
/* Specify the library into which the passwd and group file information     */
/* should be extracted.                                                     */
/****************************************************************************/

%let extractlibref=work;

/****************************************************************************/
/* Specify the location of the passwd and group files. Typically these will */
/* be located in the /etc directory. However, if they are located in a      */
/* different location, specify it here.                                     */
/****************************************************************************/

/*filename grpfile "/etc/group";
  filename pwfile "/etc/passwd";*/
filename grpfile "/home";
filename pwfile "/home";

/****************************************************************************/
/* Set the name of the AuthenticationDomain in the metadata to which logins */
/* created by the process should be associated. Note, there is no           */
/* requirement that the name of the MetadataAuthDomain match an actual      */
/* network domain. The default value is "DefaultAuth".                      */
/****************************************************************************/

%let MetadataAuthDomain=DefaultAuth;

/****************************************************************************/
/* EMAIL addresses are in the form: userid@domain. The userid value is      */
/* extracted from the passwd file below. Supply the domain portion of the   */
/* email addresses in the UNIXEMAILDOMAIN macro variable.                   */
/*                                                                          */
/* NOTE: If your email environment supports a different convention for      */
/* EMAIL addresses then modifications in the code below may be required if  */
/* the EMAIL addresses are needed in the metadata being created.            */
/****************************************************************************/

%let UNIXEMAILDOMAIN=EMAIL.ACCOUNTS.DOMAIN;

/****************************************************************************/
/* The importlibref macro variable declares the libref where the normalized */
/* datasets defined by the macro %mduimpc will be created in the processing */
/* below. It is VERY important to NOT change any &importlibref reference in */
/* the code below. If you want to save the normalized datasets in a         */
/* specific library, then uncomment libname xxxx 'your_path_name'; below,   */
/* supply your own path name, and change %let importlibref=work; to         */
/* %let importlibref=xxxx; where xxxx is a libref name of your choosing.    */
/****************************************************************************/

/* libname xxxx 'your_path_name'; */
%let importlibref=work;

/****************************************************************************/
/* filename for location where macro %mduimpl saves its generated XML       */
/****************************************************************************/

filename keepxml "request.xml" lrecl=1024;

/****************************************************************************
 ****************************************************************************
 **                                                                        **
 ** SECTION 2: %mduimpc defines canonical datasets and variable lists.     **
 **                                                                        **
 ****************************************************************************
 ****************************************************************************/

/****************************************************************************/
/* Invoke the %mduimpc macro to generate the macro variables used to define */
/* the canonical datasets and columns for input to the %mduimpl macro. The  */
/* %mduimpl (see end of program) macro reads the canonical form datasets,   */
/* builds an XML stream containing user information, and loads this user    */
/* information into the metadata server specified in the meta options above.*/
/****************************************************************************/

%mduimpc(libref=&importlibref,maketable=0);

/****************************************************************************
 ****************************************************************************
 **                                                                        **
 ** SECTION 3: Extract User Information from etc/passwd, normalize         **
 **            data, and create corresponding canonical datasets.          **
 **                                                                        **
 ****************************************************************************
 ****************************************************************************/

/****************************************************************************/
/* The following datastep reads the UNIX password file and creates the      */
/* unixUsers dataset. Please note that the record format of UNIX PW files   */
/* can vary somewhat. For example, the "custominfo" variable that is        */
/* commented out below actually represents a field in the record that is    */
/* further subdivided into name, office, officeph, unknown, and empid in    */
/* the UNIX PW file that was used for testing this example.                 */
/*                                                                          */
/* NOTE: These variables may be freely modified to accommodate the UNIX PW  */
/* file found at your site. WARNING: If you change the names of any of      */
/* these variables, be aware that they are also used in the code that       */
/* follows.                                                                 */
/*                                                                          */
/* NOTE: An assumption is made that the user information will include an    */
/* Employee ID (empid). If empid for a particular passwd entry is empty     */
/* then that entry will be dropped from the import.                         */
/****************************************************************************/

data &extractlibref..unixUsers
     (keep=keyid userid gid displayname );

   attrib keyid       informat=$30.  format=$30.;
   attrib userid      informat=$10.  format=$10.;
   attrib password    informat=$13.  format=$13.;
   attrib uid         informat=$32.  format=$12.;
   attrib gid         informat=$32.  format=$12.;
   /*attrib custominfo informat=$64. format=$64.;*/
   attrib displayname informat=$128. format=$128.;
   attrib empid       informat=$30.  format=$30.;
   attrib userhome    informat=$50.  format=$50.;
   attrib usershell   informat=$24.  format=$24.;

   infile pwfile delimiter = ':,' MISSOVER DSD;

   input userid $ password $ uid $ gid $
         /* custominfo $ is a summary of the following 5 fields */
         displayname $ empid $
         userhome $ usershell $ ;

   /************************************************************************/
   /* If an account does not contain an Employee ID, then drop that        */
   /* account from the import.                                             */
   /************************************************************************/
   if empid = "" then
      do;
         delete;
         return;
      end;

   /***************************************************************************/
   /* In our example data, accounts were disabled by entering a *DELETED*     */
   /* or *NOLOGIN* in the password field. If any password field begins with   */
   /* an asterisk, then consider that account invalid and drop it from the    */
   /* import.                                                                 */
   /***************************************************************************/
   if substrn(password,1,1) = "*" then
      do;
         delete;
         return;
      end;

   uid = compress("U_" || uid);
   gid = compress("G_" || gid);

   /***************************************************************************/
   /* The keyid must uniquely identify a single person in the person dataset. */
   /* If the keyid is not unique, the mduimpl macro will purge duplicates,    */
   /* leaving only 1 person object with a particular keyid to be loaded into  */
   /* the metadata.                                                           */
   /*                                                                         */
   /* In this example, we have selected to use the uid of the account         */
   /* as the keyid.                                                           */
   /*                                                                         */
   /* NOTES:                                                                  */
   /* 1- the keyid is also used to relate login, email, phone, address and    */
   /*    group information to a particular user. So, if two users share the   */
   /*    same keyid, one of them will get all the info for both users while   */
   /*    the other is purged.                                                 */
   /*                                                                         */
   /* 2- The keyid is written to an ExternalIdentity object that is           */
   /*    associated to the person. The external identity can be used to       */
   /*    determine where a user's information came from and how to            */
   /*    synchronize it. Care must be taken when determining what             */
   /*    value to use for the keyid and the implications that decision may    */
   /*    have.                                                                */
   /*                                                                         */
   /* 3- If the keyid is some sort of global user identifier (like an         */
   /*    employeeid) it may be possible to merge login information extracted  */
   /*    from another authentication system with the user information         */
   /*    extracted here. The addition/merging of additional information       */
   /*    can take place from the extraction and loading of this information.  */
   /*    However, the ability to perform the merge would depend on both       */
   /*    authentication systems using the same global identifier to relate    */
   /*    ownership of the account to a person. In this case, the              */
   /*    global identifier should be used as the keyid.                       */
   /***************************************************************************/
   keyid = uid;

   /************************************************************************************/
   /* Non-printable characters (e.g. BACKSPACE 0x08) have been observed in displayname */
   /* coming from the "custominfo". These characters trouble the XML parser in the     */
   /* SAS Metadata Server during the load process. The following loop recodes any      */
   /* embedded non-printable ASCII characters found in the name string to blanks       */
   /* (which are digestible by the XML parser). If a non-printable character is        */
   /* found at the end of the string, the string is simply truncated by one character. */
   /************************************************************************************/
   _NOTPRINT = notprint( displayname, 1);
   do while ( _NOTPRINT );
      put "non-printable character(s): " displayname=;
      displaynameL = length(displayname);
      if ( _NOTPRINT EQ displaynameL ) then
         do;  /* found at end of the string */
            displayname = substr(displayname,1,displaynameL-1);
            _NOTPRINT = 0;  /* set up to exit */
         end;
      else
         do;  /* found within the string */
            substr(displayname,_NOTPRINT,1) = " ";
            _NOTPRINT = notprint( displayname, _NOTPRINT+1 );  /* any more bad? */
         end;
   end;
run;

/******************************************************************************************/
/* The following datastep creates the normalized tables for person, location,             */
/* phone, email, and login from the &extractlibref..unixUsers dataset extracted above.    */
/******************************************************************************************/

data &persontbla       /* Macros to define Normalized Tables from %mduimpc */
     &locationtbla
     &phonetbla
     &emailtbla
     &logintbla
     ;
   %definepersoncols;  /* Macros to define Normalized Table Columns from %mduimpc */
   %definelocationcols;
   %definephonecols;
   %defineemailcols;
   %definelogincols;

   retain city postalcode area country "";

   set &extractlibref..unixUsers;  /* (obs=x for testing) */

   /* name is set from the userid in the input dataset */
   /* title cannot be derived from the UNIX PW file */
   name=userid;
   title="";
   description="";
   output &persontbl;

   /* setup location values */
   /* NOTE: office and officeph are not read by the INPUT statement above  */
   /* or kept in unixUsers, so as written the two blocks below appear to   */
   /* write no rows. Add those fields to the INPUT statement and the KEEP= */
   /* list if the gcos field at your site carries them.                    */
   if office NE "" then
      do;
         locationName = strip(name) || " Office";
         locationtype = "Office";
         address = office;
         output &locationtbl;
      end;

   if officeph NE "" then
      do;
         phonenumber = officeph;
         phonetype = "Office";
         output &phonetbl;
      end;

   /* create email address based on userid and the UNIXEMAILDOMAIN */
   /* macro variable.                                              */
   emailAddr = compress(userid) || "@" || "&UNIXEMAILDOMAIN";
   emailType = "Office";
   output &emailtbl;

   if userid NE "" then
      do;
         password ="";
         authdomkeyid = 'domkey' || compress(upcase("&MetadataAuthDomain"));
         output &logintbl;
      end;
run;

/************************************************************************/
/* Each person entry in &persontbl must be unique according to the      */
/* rules for Metadata Authorization Identities. By enforcing this       */
/* uniqueness here, we help ensure that the Metadata XML will load      */
/* correctly when the %mduimpl macro is invoked with submit=1 below.    */
/************************************************************************/
proc sort data=&persontbl nodupkey;
   by keyid;
run;

proc datasets library=&importlibref memtype=data;   /* Create Index for */
   modify person;                                   /* speedy retrieval */
   index create keyid;
run;

/************************************************************************/
/* Because standard UNIX PW files (i.e. PW files that don't link to     */
/* directory systems, etc.) only provide simple location info (e.g.     */
/* office number), the possibility of multiple locations per person     */
/* does not really exist.                                               */
/************************************************************************/
proc sort data=&locationtbl nodupkey;
   by keyid;
run;

proc datasets library=&importlibref memtype=data;   /* Create Index for */
   modify location;                                 /* speedy retrieval */
   index create keyid;
run;

/************************************************************************/
/* Each person can have zero or more entries in &phonetbl. Each         */
/* entry will be a unique combination of keyid and phone number.        */
/************************************************************************/
proc sort data=&phonetbl nodupkey;
   by keyid phonenumber;
run;

proc datasets library=&importlibref memtype=data;   /* Create Index for */
   modify phone;                                    /* speedy retrieval */
   index create keyid;
run;

/************************************************************************/
/* Each person can have zero or more entries in &emailtbl. Purge any    */
/* duplicate entries. Note, Dups should not occur so long as there      */
/* are no duplicate entries in the passwd file.                         */
/************************************************************************/
proc sort data=&emailtbl nodupkey;
   by keyid emailAddr;
run;

proc datasets library=&importlibref memtype=data;   /* Create Index for */
   modify email;                                    /* speedy retrieval */
   index create keyid;
run;

/************************************************************************/
/* Because each person can have multiple logins, the entries by         */
/* keyid are not required to be unique. However, the UserID             */
/* attribute by relation to AuthenticationDomain must be unique for     */
/* each login owned by a person, *and* a login can only be related      */
/* to one person. These constraints are enforced during processing      */
/* in the %mduimpl macro, which is invoked below.                       */
/************************************************************************/
proc sort data=&logintbl;
   by keyid;
run;

proc datasets library=&importlibref memtype=data;   /* Create Index for */
   modify logins;                                   /* speedy retrieval */
   index create keyid;
run;

/****************************************************************************
 ****************************************************************************
 **                                                                        **
 ** SECTION 4: Extract Group Information from /etc/group, normalize        **
 **            data, and create corresponding canonical datasets.          **
 **                                                                        **
 ****************************************************************************
 ****************************************************************************/

/****************************************************************************/
/* The following datastep reads the UNIX group file and creates the         */
/* unixGroups and unixGroupMembers datasets.                                */
/*                                                                          */
/* **The Group file contains only secondary group membership information.   */
/* In other words, the passwd file contained user account information and   */
/* the primary group membership for the account. This primary group         */
/* membership is only reflected in the passwd file, not the group file.     */
/* To build the complete group membership, we'll have to combine the group  */
/* membership information from both the passwd and group files.             */
/*                                                                          */
/* See comments at the beginning of this example file for information       */
/* regarding the format of the group file. Note though that the members     */
/* are specified using the account user name (userid) that can be found     */
/* in the 1st field of a passwd file entry.                                 */
/*                                                                          */
/* NOTE: These variables may be freely modified to accommodate the UNIX     */
/* group file found at your site. WARNING: If you change the names of any   */
/* of these variables, be aware that they are also used in the code that    */
/* follows.                                                                 */
/****************************************************************************/

data &extractlibref..unixGroups
        (keep=gid name)
     &extractlibref..unixGroupMembers
        (keep=gid name member);

   attrib name   informat=$50. format=$50.;
   attrib gid    informat=$32. format=$12.;
   attrib member informat=$50. format=$50.;
   length membersOff 8 membersLen 8 memWord 8 memCnt 8;

   infile grpfile LRECL=32756 length=Len;
   input;

   name = scan(_infile_,1,':');          /* first word is group Name */
   nameL = length(name);

   /* Check to see if a group PW is present */
   if ( substr(_infile_,nameL+1,2) EQ "::" ) then
      do;
         gid = scan(_infile_,2,':');     /* second real word is usually the group ID */
         membersOff = nameL + 2 + length(gid) + 1;
      end;
   else
      do;                                /* unless a group PW is present then */
         pw = scan(_infile_,2,':');      /* second real word is the group PW  */
         gid = scan(_infile_,3,':');     /* the third word is group ID        */
         membersOff = nameL + 1 + length(pw) + 1 + length(gid) + 1;
      end;

   if input(gid, best16.0) > 0;          /* Don't add nogroup or root to datasets */

   gid = compress("G_" || gid);

   output &extractlibref..unixGroups;    /* Add to groups dataset */

   membersLen = Len - membersOff;

   /* groups can be defined without members. If members are     */
   /* present then add an entry for each to the members dataset */
   if ( membersLen > 0 ) then
      do;
         members = substr(_infile_,membersOff+1,membersLen);
         memWord=1;
         memCnt = countc(members,",") + 1;
         do while ( memWord LE memCnt );
            member = scan(members,memWord,",");
            output &extractlibref..unixGroupMembers;
            memWord + 1;
         end;
      end;
run;

/***************************************************************************************/
/* Each row in the unixGroupMembers Table represents a user's membership in a group. A */
/* user may be a member of multiple groups and in this case a row will exist for each  */
/* group. Nested groups are not supported in UNIX so the member column will *never*    */
/* contain a group name (... although it may appear that way because UNIX does allow   */
/* a user and a group to have the same name value.) Sort the unixGroupMembers Table    */
/* by member for the processing that follows.                                          */
/***************************************************************************************/
proc sort data=&extractlibref..unixGroupMembers;
   by member;
run;

proc datasets library=&extractlibref memtype=data;
   modify unixGroupMembers;
   index create member;
run;

/***************************************************************************************/
/* Use the information in the unixUsers dataset that was extracted from the passwd     */
/* file to create a dataset that contains the primary group memberships for accounts   */
/* in the passwd file.                                                                 */
/*                                                                                     */
/* Note the "userid" column will be renamed "member" in the output dataset             */
/* unixUsersInPrimaryGroups.                                                           */
/***************************************************************************************/
proc sql;
   create table &extractlibref..unixUsersInPrimaryGroups as
      select g.gid, u.keyid, g.name, u.userid as member
         from &extractlibref..unixUsers u, &extractlibref..unixGroups g
         where u.gid EQ g.gid;
quit;

/************************************************************************/
/* Build a dataset with the memberships and member keyids for the       */
/* secondary group memberships that are contained in the group file.    */
/************************************************************************/
proc sql;
   create table &extractlibref..unixUsersInNonPrimaryGroups as
      select m.gid, u.keyid, m.name, m.member
         from &extractlibref..unixUsers u, &extractlibref..unixGroupMembers m
         where m.member EQ u.userid;
quit;

/****************************************************************************/
/* Build the canonical groupmems dataset by selecting the appropriate       */
/* columns and concatenating the unixUsersInPrimaryGroups and               */
/* unixUsersInNonPrimaryGroups datasets built above.                        */
/****************************************************************************/
data &idgrpmemstbla      /* Macros to define canonical Tables from %mduimpc */
     ;
   %defineidgrpmemscols; /* Macros to define Table Columns from %mduimpc */

   set &extractlibref..unixUsersInPrimaryGroups
       &extractlibref..unixUsersInNonPrimaryGroups;

   grpkeyid = gid;
   memkeyid = keyid;
run;

proc sort data=&idgrpmemstbla;
   by grpkeyid memkeyid;
run;

proc datasets library=&importlibref memtype=data;   /* Create Index for */
   modify grpmems;                                  /* speedy retrieval */
   index create grpkeyid;
run;
quit;

/************************************************************************/
/* Build the canonical groups dataset containing the group names and    */
/* keyids.                                                              */
/************************************************************************/
data &idgrptbla          /* Macros to define canonical Tables from %mduimpc */
     ;
   %defineidgrpcols;     /* Macros to define Table Columns from %mduimpc */

   set &extractlibref..unixgroups;

   keyid = gid;
   /* Name column moves straight from unixgroups dataset */
   description="";
   grpType="";
run;

/***************************************************************************************/
/* Sort and index the idgrps table                                                     */
/***************************************************************************************/
proc sort data=&idgrptbla nodupkey;
   by keyid;
run;

proc datasets library=&importlibref memtype=data;
   modify idgrps;
   index create keyid;
run;
quit;

/************************************************************************/
/* Add the UNIX AuthDom to the AuthenticationDomain dataset             */
/************************************************************************/
data &authdomtbl;
   %defineauthdomcols;   /* Macros to define Table authdomain from %mduimpc */
   authDomName="&MetadataAuthDomain";
   keyid='domkey' || compress(upcase("&MetadataAuthDomain"));
run;

/****************************************************************************
 ****************************************************************************
 **                                                                        **
 ** SECTION 5: %mduimpl reads the canonical datasets, generates            **
 **            XML representing metadata objects, and invokes PROC         **
 **            METADATA to load the metadata.                              **
 **                                                                        **
 ****************************************************************************
 ****************************************************************************/

/************************************************************************/
/* Note: the mduimpl macro will look up all the person objects by       */
/* external identity and then substitute the keyid for the login object */
/* with the objectid of the person. Any logins that have a keyid that   */
/* isn't found will be dropped from the load.                           */
/************************************************************************/
%macro Execute_Load;

   /* if the _EXTRACTONLY macro is set, then return and don't do any load processing. */
   %if %symexist(_EXTRACTONLY) %then %return;

   %mduimplb(libref=&importlibref,
             extidtag=&PWExtIDTag);

%mend Execute_Load;

%Execute_Load;
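
/****************************************************************************/
/* Optional review step: a minimal sketch that is not part of the original  */
/* sample. It assumes the steps above have already run in this session, so  */
/* the table macro variables from %mduimpc and the extract datasets exist.  */
/* The column aliases are illustrative. To review the extract without       */
/* loading, define _EXTRACTONLY (for example, %let _EXTRACTONLY=1;) before  */
/* submitting the program, so that %Execute_Load returns early.             */
/****************************************************************************/
proc sql;
   select count(*) as persons     from &persontbl;
   select count(*) as logins      from &logintbl;
   select count(*) as emails      from &emailtbl;
   select count(*) as unix_groups from &extractlibref..unixGroups;
quit;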