class ProxyManagerClient(object):

    def __init__(self):
        self.__usersCache = DictCache()
        self.__proxiesCache = DictCache()
        self.__vomsProxiesCache = DictCache()
        self.__pilotProxiesCache = DictCache()
        self.__filesCache = DictCache(self.__deleteTemporalFile)

    def __deleteTemporalFile(self, filename):
        """ Delete a temporal file

            :param str filename: path to file
        """
        try:
            os.remove(filename)
        except Exception:
            pass

    def clearCaches(self):
        """ Clear all internal caches """
        self.__usersCache.purgeAll()
        self.__proxiesCache.purgeAll()
        self.__vomsProxiesCache.purgeAll()
        self.__pilotProxiesCache.purgeAll()

    def __getSecondsLeftToExpiration(self, expiration, utc=True):
        """ Get the time left to expiration, in seconds

            :param datetime expiration: expiration time
            :param bool utc: expiration time is in UTC

            :return: int -- seconds left
        """
        if utc:
            td = expiration - datetime.datetime.utcnow()
        else:
            td = expiration - datetime.datetime.now()
        return td.days * 86400 + td.seconds

    def __refreshUserCache(self, validSeconds=0):
        """ Refresh the user cache

            :param int validSeconds: required remaining proxy validity in seconds

            :return: S_OK()/S_ERROR()
        """
        rpcClient = RPCClient("Framework/ProxyManager", timeout=120)
        retVal = rpcClient.getRegisteredUsers(validSeconds)
        if not retVal['OK']:
            return retVal
        data = retVal['Value']
        # Update the cache
        for record in data:
            cacheKey = (record['DN'], record['group'])
            self.__usersCache.add(cacheKey,
                                  self.__getSecondsLeftToExpiration(record['expirationtime']),
                                  record)
        return S_OK()

    @gUsersSync
    def userHasProxy(self, userDN, userGroup, validSeconds=0):
        """ Check whether a user (DN, group) has a proxy in the proxy management.
            Updates the internal cache if needed to minimize queries to the service.

            :param str userDN: user DN
            :param str userGroup: user group
            :param int validSeconds: required remaining proxy validity in seconds

            :return: S_OK(bool)/S_ERROR()
        """
        # For backward compatibility with versions prior to v7r1 we need to check
        # for a proxy with a group AND for a groupless proxy, even if a group was given
        cacheKeys = ((userDN, userGroup), (userDN, ''))
        for cacheKey in cacheKeys:
            if self.__usersCache.exists(cacheKey, validSeconds):
                return S_OK(True)

        # Get the list of users from the DB with proxies valid for at least 300 seconds
        gLogger.verbose("Updating list of users in proxy management")
        retVal = self.__refreshUserCache(validSeconds)
        if not retVal['OK']:
            return retVal

        for cacheKey in cacheKeys:
            if self.__usersCache.exists(cacheKey, validSeconds):
                return S_OK(True)
        return S_OK(False)

    @gUsersSync
    def getUserPersistence(self, userDN, userGroup, validSeconds=0):
        """ Get the persistency flag for a user (DN, group) proxy in the proxy management.
            Updates the internal cache if needed to minimize queries to the service.

            :param str userDN: user DN
            :param str userGroup: user group
            :param int validSeconds: required remaining proxy validity in seconds

            :return: S_OK(bool)/S_ERROR()
        """
        cacheKey = (userDN, userGroup)
        userData = self.__usersCache.get(cacheKey, validSeconds)
        if userData:
            if userData['persistent']:
                return S_OK(True)
        # Get the list of users from the DB with proxies valid for at least 300 seconds
        gLogger.verbose("Updating list of users in proxy management")
        retVal = self.__refreshUserCache(validSeconds)
        if not retVal['OK']:
            return retVal
        userData = self.__usersCache.get(cacheKey, validSeconds)
        if userData:
            return S_OK(userData['persistent'])
        return S_OK(False)

    def setPersistency(self, userDN, userGroup, persistent):
        """ Set the persistency flag for a user/group

            :param str userDN: user DN
            :param str userGroup: user group
            :param bool persistent: persistent flag

            :return: S_OK()/S_ERROR()
        """
        # Ensure a plain bool goes into the RPC call
        persistentFlag = bool(persistent)
        rpcClient = RPCClient("Framework/ProxyManager", timeout=120)
        retVal = rpcClient.setPersistency(userDN, userGroup, persistentFlag)
        if not retVal['OK']:
            return retVal
        # Update the internal persistency cache
        cacheKey = (userDN, userGroup)
        record = self.__usersCache.get(cacheKey, 0)
        if record:
            record['persistent'] = persistentFlag
            self.__usersCache.add(cacheKey,
                                  self.__getSecondsLeftToExpiration(record['expirationtime']),
                                  record)
        return retVal

    def uploadProxy(self, proxy=None, restrictLifeTime=0, rfcIfPossible=False):
        """ Upload a proxy to the proxy management service using delegation

            :param X509Chain proxy: proxy as a chain, or a path to a proxy file
            :param int restrictLifeTime: proxy lifetime in seconds
            :param bool rfcIfPossible: make an RFC proxy if possible

            :return: S_OK(dict)/S_ERROR() -- the dict contains the uploaded proxies
        """
        # Discover the proxy location
        if isinstance(proxy, X509Chain):
            chain = proxy
            proxyLocation = ""
        else:
            if not proxy:
                proxyLocation = Locations.getProxyLocation()
                if not proxyLocation:
                    return S_ERROR("Can't find a valid proxy")
            elif isinstance(proxy, six.string_types):
                proxyLocation = proxy
            else:
                return S_ERROR("Can't find a valid proxy")
            chain = X509Chain()
            result = chain.loadProxyFromFile(proxyLocation)
            if not result['OK']:
                return S_ERROR("Can't load %s: %s" % (proxyLocation, result['Message']))

        # Make sure it's valid
        if chain.hasExpired().get('Value'):
            return S_ERROR("Proxy %s has expired" % proxyLocation)
        if chain.getDIRACGroup().get('Value') or chain.isVOMS().get('Value'):
            return S_ERROR("Cannot upload proxy with DIRAC group or VOMS extensions")

        rpcClient = RPCClient("Framework/ProxyManager", timeout=120)
        # Get a delegation request
        result = rpcClient.requestDelegationUpload(chain.getRemainingSecs()['Value'])
        if not result['OK']:
            return result
        reqDict = result['Value']
        # Generate the delegated chain
        chainLifeTime = chain.getRemainingSecs()['Value'] - 60
        if restrictLifeTime and restrictLifeTime < chainLifeTime:
            chainLifeTime = restrictLifeTime
        retVal = chain.generateChainFromRequestString(reqDict['request'],
                                                      lifetime=chainLifeTime,
                                                      rfc=rfcIfPossible)
        if not retVal['OK']:
            return retVal
        # Upload!
        result = rpcClient.completeDelegationUpload(reqDict['id'], retVal['Value'])
        if not result['OK']:
            return result
        return S_OK(result.get('proxies') or result['Value'])

    @gProxiesSync
    def downloadProxy(self, userDN, userGroup, limited=False, requiredTimeLeft=1200,
                      cacheTime=14400, proxyToConnect=None, token=None):
        """ Get a proxy chain from the proxy management

            :param str userDN: user DN
            :param str userGroup: user group
            :param bool limited: whether a limited proxy is needed
            :param int requiredTimeLeft: required remaining proxy validity in seconds
            :param int cacheTime: how long to keep the proxy in the cache, in seconds
            :param X509Chain proxyToConnect: proxy to use for connecting to the service
            :param str token: valid token to get a proxy

            :return: S_OK(X509Chain)/S_ERROR()
        """
        cacheKey = (userDN, userGroup)
        if self.__proxiesCache.exists(cacheKey, requiredTimeLeft):
            return S_OK(self.__proxiesCache.get(cacheKey))
        req = X509Request()
        req.generateProxyRequest(limited=limited)
        if proxyToConnect:
            rpcClient = RPCClient("Framework/ProxyManager", proxyChain=proxyToConnect, timeout=120)
        else:
            rpcClient = RPCClient("Framework/ProxyManager", timeout=120)
        if token:
            retVal = rpcClient.getProxyWithToken(userDN, userGroup, req.dumpRequest()['Value'],
                                                 int(cacheTime + requiredTimeLeft), token)
        else:
            retVal = rpcClient.getProxy(userDN, userGroup, req.dumpRequest()['Value'],
                                        int(cacheTime + requiredTimeLeft))
        if not retVal['OK']:
            return retVal
        chain = X509Chain(keyObj=req.getPKey())
        retVal = chain.loadChainFromString(retVal['Value'])
        if not retVal['OK']:
            return retVal
        self.__proxiesCache.add(cacheKey, chain.getRemainingSecs()['Value'], chain)
        return S_OK(chain)

    def downloadProxyToFile(self, userDN, userGroup, limited=False, requiredTimeLeft=1200,
                            cacheTime=14400, filePath=None, proxyToConnect=None, token=None):
        """ Get a proxy chain from the proxy management and write it to a file

            :param str userDN: user DN
            :param str userGroup: user group
            :param bool limited: whether a limited proxy is needed
            :param int requiredTimeLeft: required remaining proxy validity in seconds
            :param int cacheTime: how long to keep the proxy in the cache, in seconds
            :param str filePath: path where to save the proxy
            :param X509Chain proxyToConnect: proxy to use for connecting to the service
            :param str token: valid token to get a proxy

            :return: S_OK(X509Chain)/S_ERROR()
        """
        retVal = self.downloadProxy(userDN, userGroup, limited, requiredTimeLeft,
                                    cacheTime, proxyToConnect, token)
        if not retVal['OK']:
            return retVal
        chain = retVal['Value']
        retVal = self.dumpProxyToFile(chain, filePath)
        if not retVal['OK']:
            return retVal
        retVal['chain'] = chain
        return retVal

    @gVOMSProxiesSync
    def downloadVOMSProxy(self, userDN, userGroup, limited=False, requiredTimeLeft=1200,
                          cacheTime=14400, requiredVOMSAttribute=None,
                          proxyToConnect=None, token=None):
        """ Download a proxy if needed and transform it into a VOMS one

            :param str userDN: user DN
            :param str userGroup: user group
            :param bool limited: whether a limited proxy is needed
            :param int requiredTimeLeft: required remaining proxy validity in seconds
            :param int cacheTime: how long to keep the proxy in the cache, in seconds
            :param str requiredVOMSAttribute: VOMS attribute to add to the proxy
            :param X509Chain proxyToConnect: proxy to use for connecting to the service
            :param str token: valid token to get a proxy

            :return: S_OK(X509Chain)/S_ERROR()
        """
        cacheKey = (userDN, userGroup, requiredVOMSAttribute, limited)
        if self.__vomsProxiesCache.exists(cacheKey, requiredTimeLeft):
            return S_OK(self.__vomsProxiesCache.get(cacheKey))
        req = X509Request()
        req.generateProxyRequest(limited=limited)
        if proxyToConnect:
            rpcClient = RPCClient("Framework/ProxyManager", proxyChain=proxyToConnect, timeout=120)
        else:
            rpcClient = RPCClient("Framework/ProxyManager", timeout=120)
        if token:
            retVal = rpcClient.getVOMSProxyWithToken(userDN, userGroup, req.dumpRequest()['Value'],
                                                     int(cacheTime + requiredTimeLeft), token,
                                                     requiredVOMSAttribute)
        else:
            retVal = rpcClient.getVOMSProxy(userDN, userGroup, req.dumpRequest()['Value'],
                                            int(cacheTime + requiredTimeLeft),
                                            requiredVOMSAttribute)
        if not retVal['OK']:
            return retVal
        chain = X509Chain(keyObj=req.getPKey())
        retVal = chain.loadChainFromString(retVal['Value'])
        if not retVal['OK']:
            return retVal
        self.__vomsProxiesCache.add(cacheKey, chain.getRemainingSecs()['Value'], chain)
        return S_OK(chain)

    def downloadVOMSProxyToFile(self, userDN, userGroup, limited=False, requiredTimeLeft=1200,
                                cacheTime=14400, requiredVOMSAttribute=None, filePath=None,
                                proxyToConnect=None, token=None):
        """ Download a proxy if needed, transform it into a VOMS one and write it to a file

            :param str userDN: user DN
            :param str userGroup: user group
            :param bool limited: whether a limited proxy is needed
            :param int requiredTimeLeft: required remaining proxy validity in seconds
            :param int cacheTime: how long to keep the proxy in the cache, in seconds
            :param str requiredVOMSAttribute: VOMS attribute to add to the proxy
            :param str filePath: path where to save the proxy
            :param X509Chain proxyToConnect: proxy to use for connecting to the service
            :param str token: valid token to get a proxy

            :return: S_OK(X509Chain)/S_ERROR()
        """
        retVal = self.downloadVOMSProxy(userDN, userGroup, limited, requiredTimeLeft,
                                        cacheTime, requiredVOMSAttribute, proxyToConnect, token)
        if not retVal['OK']:
            return retVal
        chain = retVal['Value']
        retVal = self.dumpProxyToFile(chain, filePath)
        if not retVal['OK']:
            return retVal
        retVal['chain'] = chain
        return retVal

    def getPilotProxyFromDIRACGroup(self, userDN, userGroup, requiredTimeLeft=43200,
                                    proxyToConnect=None):
        """ Download a pilot proxy with VOMS extensions depending on the group

            :param str userDN: user DN
            :param str userGroup: user group
            :param int requiredTimeLeft: required remaining proxy validity in seconds
            :param X509Chain proxyToConnect: proxy to use for connecting to the service

            :return: S_OK(X509Chain)/S_ERROR()
        """
        # Assign the VOMS attribute
        vomsAttr = Registry.getVOMSAttributeForGroup(userGroup)
        if not vomsAttr:
            gLogger.warn("No VOMS attribute assigned to group %s when requesting pilot proxy" % userGroup)
            return self.downloadProxy(userDN, userGroup, limited=False,
                                      requiredTimeLeft=requiredTimeLeft,
                                      proxyToConnect=proxyToConnect)
        return self.downloadVOMSProxy(userDN, userGroup, limited=False,
                                      requiredTimeLeft=requiredTimeLeft,
                                      requiredVOMSAttribute=vomsAttr,
                                      proxyToConnect=proxyToConnect)

    def getPilotProxyFromVOMSGroup(self, userDN, vomsAttr, requiredTimeLeft=43200,
                                   proxyToConnect=None):
        """ Download a pilot proxy with VOMS extensions depending on the VOMS attribute

            :param str userDN: user DN
            :param str vomsAttr: VOMS attribute
            :param int requiredTimeLeft: required remaining proxy validity in seconds
            :param X509Chain proxyToConnect: proxy to use for connecting to the service

            :return: S_OK(X509Chain)/S_ERROR()
        """
        groups = Registry.getGroupsWithVOMSAttribute(vomsAttr)
        if not groups:
            return S_ERROR("No group found that has %s as VOMS attribute" % vomsAttr)
        for userGroup in groups:
            result = self.downloadVOMSProxy(userDN, userGroup, limited=False,
                                            requiredTimeLeft=requiredTimeLeft,
                                            requiredVOMSAttribute=vomsAttr,
                                            proxyToConnect=proxyToConnect)
            if result['OK']:
                return result
        return result

    def getPayloadProxyFromDIRACGroup(self, userDN, userGroup, requiredTimeLeft,
                                      token=None, proxyToConnect=None):
        """ Download a payload proxy with VOMS extensions depending on the group

            :param str userDN: user DN
            :param str userGroup: user group
            :param int requiredTimeLeft: required remaining proxy validity in seconds
            :param str token: valid token to get a proxy
            :param X509Chain proxyToConnect: proxy to use for connecting to the service

            :return: S_OK(X509Chain)/S_ERROR()
        """
        # Assign the VOMS attribute
        vomsAttr = Registry.getVOMSAttributeForGroup(userGroup)
        if not vomsAttr:
            gLogger.verbose("No VOMS attribute assigned to group %s when requesting payload proxy" % userGroup)
            return self.downloadProxy(userDN, userGroup, limited=True,
                                      requiredTimeLeft=requiredTimeLeft,
                                      proxyToConnect=proxyToConnect, token=token)
        return self.downloadVOMSProxy(userDN, userGroup, limited=True,
                                      requiredTimeLeft=requiredTimeLeft,
                                      requiredVOMSAttribute=vomsAttr,
                                      proxyToConnect=proxyToConnect, token=token)

    def getPayloadProxyFromVOMSGroup(self, userDN, vomsAttr, token, requiredTimeLeft,
                                     proxyToConnect=None):
        """ Download a payload proxy with VOMS extensions depending on the VOMS attribute

            :param str userDN: user DN
            :param str vomsAttr: VOMS attribute
            :param str token: valid token to get a proxy
            :param int requiredTimeLeft: required remaining proxy validity in seconds
            :param X509Chain proxyToConnect: proxy to use for connecting to the service

            :return: S_OK(X509Chain)/S_ERROR()
        """
        groups = Registry.getGroupsWithVOMSAttribute(vomsAttr)
        if not groups:
            return S_ERROR("No group found that has %s as VOMS attribute" % vomsAttr)
        userGroup = groups[0]
        return self.downloadVOMSProxy(userDN, userGroup, limited=True,
                                      requiredTimeLeft=requiredTimeLeft,
                                      requiredVOMSAttribute=vomsAttr,
                                      proxyToConnect=proxyToConnect, token=token)

    def dumpProxyToFile(self, chain, destinationFile=None, requiredTimeLeft=600):
        """ Dump a proxy to a file. It's cached so multiple calls won't generate extra files

            :param X509Chain chain: proxy as a chain
            :param str destinationFile: path where to store the proxy
            :param int requiredTimeLeft: required remaining proxy validity in seconds

            :return: S_OK(str)/S_ERROR()
        """
        result = chain.hash()
        if not result['OK']:
            return result
        cHash = result['Value']
        if self.__filesCache.exists(cHash, requiredTimeLeft):
            filepath = self.__filesCache.get(cHash)
            if filepath and os.path.isfile(filepath):
                return S_OK(filepath)
            self.__filesCache.delete(cHash)
        retVal = chain.dumpAllToFile(destinationFile)
        if not retVal['OK']:
            return retVal
        filename = retVal['Value']
        self.__filesCache.add(cHash, chain.getRemainingSecs()['Value'], filename)
        return S_OK(filename)

    def deleteGeneratedProxyFile(self, chain):
        """ Delete a file generated by a dump

            :param X509Chain chain: proxy as a chain

            :return: S_OK()
        """
        self.__filesCache.delete(chain)
        return S_OK()

    def deleteProxyBundle(self, idList):
        """ Delete a list of proxies by id

            :param list,tuple idList: list of identity numbers

            :return: S_OK(int)/S_ERROR()
        """
        rpcClient = RPCClient("Framework/ProxyManager", timeout=120)
        return rpcClient.deleteProxyBundle(idList)

    def requestToken(self, requesterDN, requesterGroup, numUses=1):
        """ Request a token with a given number of uses

            :param str requesterDN: user DN
            :param str requesterGroup: user group
            :param int numUses: number of uses the token must have

            :return: S_OK(tuple)/S_ERROR() -- the tuple contains the token and its number of uses
        """
        rpcClient = RPCClient("Framework/ProxyManager", timeout=120)
        return rpcClient.generateToken(requesterDN, requesterGroup, numUses)

    def renewProxy(self, proxyToBeRenewed=None, minLifeTime=3600, newProxyLifeTime=43200,
                   proxyToConnect=None):
        """ Renew a proxy using the ProxyManager

            :param X509Chain proxyToBeRenewed: proxy to renew
            :param int minLifeTime: if the proxy lifetime is less than this, renew; skip otherwise
            :param int newProxyLifeTime: lifetime of the new proxy
            :param X509Chain proxyToConnect: proxy to use for connecting to the service

            :return: S_OK(X509Chain)/S_ERROR()
        """
        retVal = multiProxyArgument(proxyToBeRenewed)
        if not retVal['OK']:
            return retVal
        proxyToRenewDict = retVal['Value']

        secs = proxyToRenewDict['chain'].getRemainingSecs()['Value']
        if secs > minLifeTime:
            deleteMultiProxy(proxyToRenewDict)
            return S_OK()

        if not proxyToConnect:
            proxyToConnectDict = {'chain': False, 'tempFile': False}
        else:
            retVal = multiProxyArgument(proxyToConnect)
            if not retVal['OK']:
                deleteMultiProxy(proxyToRenewDict)
                return retVal
            proxyToConnectDict = retVal['Value']

        userDN = proxyToRenewDict['chain'].getIssuerCert()['Value'].getSubjectDN()['Value']
        retVal = proxyToRenewDict['chain'].getDIRACGroup()
        if not retVal['OK']:
            deleteMultiProxy(proxyToRenewDict)
            deleteMultiProxy(proxyToConnectDict)
            return retVal
        userGroup = retVal['Value']
        limited = proxyToRenewDict['chain'].isLimitedProxy()['Value']

        voms = VOMS()
        retVal = voms.getVOMSAttributes(proxyToRenewDict['chain'])
        if not retVal['OK']:
            deleteMultiProxy(proxyToRenewDict)
            deleteMultiProxy(proxyToConnectDict)
            return retVal
        vomsAttrs = retVal['Value']

        if vomsAttrs:
            retVal = self.downloadVOMSProxy(userDN, userGroup, limited=limited,
                                            requiredTimeLeft=newProxyLifeTime,
                                            requiredVOMSAttribute=vomsAttrs[0],
                                            proxyToConnect=proxyToConnectDict['chain'])
        else:
            retVal = self.downloadProxy(userDN, userGroup, limited=limited,
                                        requiredTimeLeft=newProxyLifeTime,
                                        proxyToConnect=proxyToConnectDict['chain'])

        deleteMultiProxy(proxyToRenewDict)
        deleteMultiProxy(proxyToConnectDict)

        if not retVal['OK']:
            return retVal
        chain = retVal['Value']
        if not proxyToRenewDict['tempFile']:
            return chain.dumpAllToFile(proxyToRenewDict['file'])
        return S_OK(chain)

    def getDBContents(self, condDict=None, sorting=None, start=0, limit=0):
        """ Get the contents of the DB

            :param dict condDict: search conditions

            :return: S_OK(dict)/S_ERROR() -- the dict contains the fields, the records
                     and the total number of records
        """
        if condDict is None:
            condDict = {}
        if sorting is None:
            sorting = [['UserDN', 'DESC']]
        rpcClient = RPCClient("Framework/ProxyManager", timeout=120)
        return rpcClient.getContents(condDict, sorting, start, limit)

    def getVOMSAttributes(self, chain):
        """ Get the VOMS attributes for a chain

            :param X509Chain chain: proxy as a chain

            :return: S_OK(str)/S_ERROR()
        """
        return VOMS().getVOMSAttributes(chain)

    def getUploadedProxyLifeTime(self, DN, group):
        """ Get the remaining seconds for an uploaded proxy

            :param str DN: user DN
            :param str group: group

            :return: S_OK(int)/S_ERROR()
        """
        result = self.getDBContents({'UserDN': [DN], 'UserGroup': [group]})
        if not result['OK']:
            return result
        data = result['Value']
        if len(data['Records']) == 0:
            return S_OK(0)
        pNames = list(data['ParameterNames'])
        dnPos = pNames.index('UserDN')
        groupPos = pNames.index('UserGroup')
        expiryPos = pNames.index('ExpirationTime')
        for row in data['Records']:
            if DN == row[dnPos] and group == row[groupPos]:
                td = row[expiryPos] - datetime.datetime.utcnow()
                secondsLeft = td.days * 86400 + td.seconds
                return S_OK(max(0, secondsLeft))
        return S_OK(0)

    def getUserProxiesInfo(self):
        """ Get info about the user's uploaded proxies

            :return: S_OK(dict)/S_ERROR()
        """
        result = RPCClient("Framework/ProxyManager", timeout=120).getUserProxiesInfo()
        if 'rpcStub' in result:
            result.pop('rpcStub')
        return result
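# The expiration arithmetic used by the caches above (a timedelta collapsed to
# whole seconds via days * 86400 + seconds) can be sketched standalone.
# `seconds_left` is a hypothetical helper for illustration, not a DIRAC API.

```python
import datetime


def seconds_left(expiration, utc=True):
    # Mirrors the __getSecondsLeftToExpiration computation: the timedelta is
    # reduced to whole seconds (microseconds are dropped); a past expiration
    # yields a negative value because td.days goes negative.
    now = datetime.datetime.utcnow() if utc else datetime.datetime.now()
    td = expiration - now
    return td.days * 86400 + td.seconds


exp = datetime.datetime.utcnow() + datetime.timedelta(hours=2)
print(seconds_left(exp))  # roughly 7200
```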
class ProxyManagerClient: __metaclass__ = DIRACSingleton.DIRACSingleton def __init__(self): self.__usersCache = DictCache() self.__proxiesCache = DictCache() self.__vomsProxiesCache = DictCache() self.__pilotProxiesCache = DictCache() self.__filesCache = DictCache(self.__deleteTemporalFile) def __deleteTemporalFile(self, filename): try: os.unlink(filename) except: pass def clearCaches(self): self.__usersCache.purgeAll() self.__proxiesCache.purgeAll() self.__vomsProxiesCache.purgeAll() self.__pilotProxiesCache.purgeAll() def __getSecondsLeftToExpiration(self, expiration, utc=True): if utc: td = expiration - datetime.datetime.utcnow() else: td = expiration - datetime.datetime.now() return td.days * 86400 + td.seconds def __refreshUserCache(self, validSeconds=0): rpcClient = RPCClient("Framework/ProxyManager", timeout=120) retVal = rpcClient.getRegisteredUsers(validSeconds) if not retVal['OK']: return retVal data = retVal['Value'] #Update the cache for record in data: cacheKey = (record['DN'], record['group']) self.__usersCache.add( cacheKey, self.__getSecondsLeftToExpiration(record['expirationtime']), record) return S_OK() @gUsersSync def userHasProxy(self, userDN, userGroup, validSeconds=0): """ Check if a user(DN-group) has a proxy in the proxy management - Updates internal cache if needed to minimize queries to the service """ cacheKey = (userDN, userGroup) if self.__usersCache.exists(cacheKey, validSeconds): return S_OK(True) #Get list of users from the DB with proxys at least 300 seconds gLogger.verbose("Updating list of users in proxy management") retVal = self.__refreshUserCache(validSeconds) if not retVal['OK']: return retVal return S_OK(self.__usersCache.exists(cacheKey, validSeconds)) @gUsersSync def getUserPersistence(self, userDN, userGroup, validSeconds=0): """ Check if a user(DN-group) has a proxy in the proxy management - Updates internal cache if needed to minimize queries to the service """ cacheKey = (userDN, userGroup) userData = 
self.__usersCache.get(cacheKey, validSeconds) if userData: if userData['persistent']: return S_OK(True) #Get list of users from the DB with proxys at least 300 seconds gLogger.verbose("Updating list of users in proxy management") retVal = self.__refreshUserCache(validSeconds) if not retVal['OK']: return retVal userData = self.__usersCache.get(cacheKey, validSeconds) if userData: return S_OK(userData['persistent']) return S_OK(False) def setPersistency(self, userDN, userGroup, persistent): """ Set the persistency for user/group """ #Hack to ensure bool in the rpc call persistentFlag = True if not persistent: persistentFlag = False rpcClient = RPCClient("Framework/ProxyManager", timeout=120) retVal = rpcClient.setPersistency(userDN, userGroup, persistentFlag) if not retVal['OK']: return retVal #Update internal persistency cache cacheKey = (userDN, userGroup) record = self.__usersCache.get(cacheKey, 0) if record: record['persistent'] = persistentFlag self.__usersCache.add( cacheKey, self.__getSecondsLeftToExpiration(record['expirationtime']), record) return retVal def uploadProxy(self, proxy=False, diracGroup=False, chainToConnect=False, restrictLifeTime=0): """ Upload a proxy to the proxy management service using delgation """ #Discover proxy location if type(proxy) == g_X509ChainType: chain = proxy proxyLocation = "" else: if not proxy: proxyLocation = Locations.getProxyLocation() if not proxyLocation: return S_ERROR("Can't find a valid proxy") elif type(proxy) in (types.StringType, types.UnicodeType): proxyLocation = proxy else: return S_ERROR("Can't find a valid proxy") chain = X509Chain() result = chain.loadProxyFromFile(proxyLocation) if not result['OK']: return S_ERROR("Can't load %s: %s " % (proxyLocation, result['Message'])) if not chainToConnect: chainToConnect = chain #Make sure it's valid if chain.hasExpired()['Value']: return S_ERROR("Proxy %s has expired" % proxyLocation) #rpcClient = RPCClient( "Framework/ProxyManager", proxyChain = chainToConnect ) 
rpcClient = RPCClient("Framework/ProxyManager", timeout=120) #Get a delegation request result = rpcClient.requestDelegationUpload( chain.getRemainingSecs()['Value'], diracGroup) if not result['OK']: return result #Check if the delegation has been granted if 'Value' not in result or not result['Value']: if 'proxies' in result: return S_OK(result['proxies']) else: return S_OK() reqDict = result['Value'] #Generate delegated chain chainLifeTime = chain.getRemainingSecs()['Value'] - 60 if restrictLifeTime and restrictLifeTime < chainLifeTime: chainLifeTime = restrictLifeTime retVal = chain.generateChainFromRequestString(reqDict['request'], lifetime=chainLifeTime, diracGroup=diracGroup) if not retVal['OK']: return retVal #Upload! result = rpcClient.completeDelegationUpload(reqDict['id'], retVal['Value']) if not result['OK']: return result if 'proxies' in result: return S_OK(result['proxies']) return S_OK() @gProxiesSync def downloadProxy(self, userDN, userGroup, limited=False, requiredTimeLeft=1200, cacheTime=43200, proxyToConnect=False, token=False): """ Get a proxy Chain from the proxy management """ cacheKey = (userDN, userGroup) if self.__proxiesCache.exists(cacheKey, requiredTimeLeft): return S_OK(self.__proxiesCache.get(cacheKey)) req = X509Request() req.generateProxyRequest(limited=limited) if proxyToConnect: rpcClient = RPCClient("Framework/ProxyManager", proxyChain=proxyToConnect, timeout=120) else: rpcClient = RPCClient("Framework/ProxyManager", timeout=120) if token: retVal = rpcClient.getProxyWithToken( userDN, userGroup, req.dumpRequest()['Value'], long(cacheTime + requiredTimeLeft), token) else: retVal = rpcClient.getProxy(userDN, userGroup, req.dumpRequest()['Value'], long(cacheTime + requiredTimeLeft)) if not retVal['OK']: return retVal chain = X509Chain(keyObj=req.getPKey()) retVal = chain.loadChainFromString(retVal['Value']) if not retVal['OK']: return retVal self.__proxiesCache.add(cacheKey, chain.getRemainingSecs()['Value'], chain) return S_OK(chain) 
def downloadProxyToFile(self, userDN, userGroup, limited=False, requiredTimeLeft=1200, cacheTime=43200, filePath=False, proxyToConnect=False, token=False): """ Get a proxy Chain from the proxy management and write it to file """ retVal = self.downloadProxy(userDN, userGroup, limited, requiredTimeLeft, cacheTime, proxyToConnect, token) if not retVal['OK']: return retVal chain = retVal['Value'] retVal = self.dumpProxyToFile(chain, filePath) if not retVal['OK']: return retVal retVal['chain'] = chain return retVal @gVOMSProxiesSync def downloadVOMSProxy(self, userDN, userGroup, limited=False, requiredTimeLeft=1200, cacheTime=43200, requiredVOMSAttribute=False, proxyToConnect=False, token=False): """ Download a proxy if needed and transform it into a VOMS one """ cacheKey = (userDN, userGroup, requiredVOMSAttribute, limited) if self.__vomsProxiesCache.exists(cacheKey, requiredTimeLeft): return S_OK(self.__vomsProxiesCache.get(cacheKey)) req = X509Request() req.generateProxyRequest(limited=limited) if proxyToConnect: rpcClient = RPCClient("Framework/ProxyManager", proxyChain=proxyToConnect, timeout=120) else: rpcClient = RPCClient("Framework/ProxyManager", timeout=120) if token: retVal = rpcClient.getVOMSProxyWithToken( userDN, userGroup, req.dumpRequest()['Value'], long(cacheTime + requiredTimeLeft), token, requiredVOMSAttribute) else: retVal = rpcClient.getVOMSProxy(userDN, userGroup, req.dumpRequest()['Value'], long(cacheTime + requiredTimeLeft), requiredVOMSAttribute) if not retVal['OK']: return retVal chain = X509Chain(keyObj=req.getPKey()) retVal = chain.loadChainFromString(retVal['Value']) if not retVal['OK']: return retVal self.__vomsProxiesCache.add(cacheKey, chain.getRemainingSecs()['Value'], chain) return S_OK(chain) def downloadVOMSProxyToFile(self, userDN, userGroup, limited=False, requiredTimeLeft=1200, cacheTime=43200, requiredVOMSAttribute=False, filePath=False, proxyToConnect=False, token=False): """ Download a proxy if needed, transform it into a VOMS 
one and write it to file """ retVal = self.downloadVOMSProxy(userDN, userGroup, limited, requiredTimeLeft, cacheTime, requiredVOMSAttribute, proxyToConnect, token) if not retVal['OK']: return retVal chain = retVal['Value'] retVal = self.dumpProxyToFile(chain, filePath) if not retVal['OK']: return retVal retVal['chain'] = chain return retVal def getPilotProxyFromDIRACGroup(self, userDN, userGroup, requiredTimeLeft=43200, proxyToConnect=False): """ Download a pilot proxy with VOMS extensions depending on the group """ #Assign VOMS attribute vomsAttr = CS.getVOMSAttributeForGroup(userGroup) if not vomsAttr: gLogger.verbose( "No voms attribute assigned to group %s when requested pilot proxy" % userGroup) return self.downloadProxy(userDN, userGroup, limited=False, requiredTimeLeft=requiredTimeLeft, proxyToConnect=proxyToConnect) else: return self.downloadVOMSProxy(userDN, userGroup, limited=False, requiredTimeLeft=requiredTimeLeft, requiredVOMSAttribute=vomsAttr, proxyToConnect=proxyToConnect) def getPilotProxyFromVOMSGroup(self, userDN, vomsAttr, requiredTimeLeft=43200, proxyToConnect=False): """ Download a pilot proxy with VOMS extensions depending on the group """ groups = CS.getGroupsWithVOMSAttribute(vomsAttr) if not groups: return S_ERROR("No group found that has %s as voms attrs" % vomsAttr) for userGroup in groups: result = self.downloadVOMSProxy(userDN, userGroup, limited=False, requiredTimeLeft=requiredTimeLeft, requiredVOMSAttribute=vomsAttr, proxyToConnect=proxyToConnect) if result['OK']: return result return result def getPayloadProxyFromDIRACGroup(self, userDN, userGroup, requiredTimeLeft, token=False, proxyToConnect=False): """ Download a payload proxy with VOMS extensions depending on the group """ #Assign VOMS attribute vomsAttr = CS.getVOMSAttributeForGroup(userGroup) if not vomsAttr: gLogger.verbose( "No voms attribute assigned to group %s when requested payload proxy" % userGroup) return self.downloadProxy(userDN, userGroup, limited=True, 
requiredTimeLeft=requiredTimeLeft, proxyToConnect=proxyToConnect, token=token) else: return self.downloadVOMSProxy(userDN, userGroup, limited=True, requiredTimeLeft=requiredTimeLeft, requiredVOMSAttribute=vomsAttr, proxyToConnect=proxyToConnect, token=token) def getPayloadProxyFromVOMSGroup(self, userDN, vomsAttr, token, requiredTimeLeft, proxyToConnect=False): """ Download a payload proxy with VOMS extensions depending on the VOMS attr """ groups = CS.getGroupsWithVOMSAttribute(vomsAttr) if not groups: return S_ERROR("No group found that has %s as voms attrs" % vomsAttr) userGroup = groups[0] return self.downloadVOMSProxy(userDN, userGroup, limited=True, requiredTimeLeft=requiredTimeLeft, requiredVOMSAttribute=vomsAttr, proxyToConnect=proxyToConnect, token=token) def dumpProxyToFile(self, chain, destinationFile=False, requiredTimeLeft=600): """ Dump a proxy to a file. It's cached so multiple calls won't generate extra files """ result = chain.hash() if not result['OK']: return result hash = result['Value'] if self.__filesCache.exists(hash, requiredTimeLeft): filepath = self.__filesCache.get(hash) if os.path.isfile(filepath): return S_OK(filepath) self.__filesCache.delete(hash) retVal = chain.dumpAllToFile(destinationFile) if not retVal['OK']: return retVal filename = retVal['Value'] self.__filesCache.add(hash, chain.getRemainingSecs()['Value'], filename) return S_OK(filename) def deleteGeneratedProxyFile(self, chain): """ Delete a file generated by a dump """ self.__filesCache.delete(chain) return S_OK() def requestToken(self, requesterDN, requesterGroup, numUses=1): """ Request a number of tokens. 
usesList must be a list of integers and each integer is the number of uses a token must have """ rpcClient = RPCClient("Framework/ProxyManager", timeout=120) return rpcClient.generateToken(requesterDN, requesterGroup, numUses) def renewProxy(self, proxyToBeRenewed=False, minLifeTime=3600, newProxyLifeTime=43200, proxyToConnect=False): """ Renew a proxy using the ProxyManager Arguments: proxyToBeRenewed : proxy to renew minLifeTime : if proxy life time is less than this, renew. Skip otherwise newProxyLifeTime : life time of new proxy proxyToConnect : proxy to use for connecting to the service """ retVal = File.multiProxyArgument(proxyToBeRenewed) if not retVal['Value']: return retVal proxyToRenewDict = retVal['Value'] secs = proxyToRenewDict['chain'].getRemainingSecs()['Value'] if secs > minLifeTime: File.deleteMultiProxy(proxyToRenewDict) return S_OK() if not proxyToConnect: proxyToConnectDict = {'chain': False, 'tempFile': False} else: retVal = File.multiProxyArgument(proxyToConnect) if not retVal['Value']: File.deleteMultiProxy(proxyToRenewDict) return retVal proxyToConnectDict = retVal['Value'] userDN = proxyToRenewDict['chain'].getIssuerCert( )['Value'].getSubjectDN()['Value'] retVal = proxyToRenewDict['chain'].getDIRACGroup() if not retVal['OK']: File.deleteMultiProxy(proxyToRenewDict) File.deleteMultiProxy(proxyToConnectDict) return retVal userGroup = retVal['Value'] limited = proxyToRenewDict['chain'].isLimitedProxy()['Value'] voms = VOMS() retVal = voms.getVOMSAttributes(proxyToRenewDict['chain']) if not retVal['OK']: File.deleteMultiProxy(proxyToRenewDict) File.deleteMultiProxy(proxyToConnectDict) return retVal vomsAttrs = retVal['Value'] if vomsAttrs: retVal = self.downloadVOMSProxy( userDN, userGroup, limited=limited, requiredTimeLeft=newProxyLifeTime, requiredVOMSAttribute=vomsAttrs[0], proxyToConnect=proxyToConnectDict['chain']) else: retVal = self.downloadProxy( userDN, userGroup, limited=limited, requiredTimeLeft=newProxyLifeTime, 
proxyToConnect=proxyToConnectDict['chain']) File.deleteMultiProxy(proxyToRenewDict) File.deleteMultiProxy(proxyToConnectDict) if not retVal['OK']: return retVal chain = retVal['Value'] if not proxyToRenewDict['tempFile']: return chain.dumpAllToFile(proxyToRenewDict['file']) return S_OK(chain) def getDBContents(self, condDict={}): """ Get the contents of the db """ rpcClient = RPCClient("Framework/ProxyManager", timeout=120) return rpcClient.getContents(condDict, [['UserDN', 'DESC']], 0, 0) def getVOMSAttributes(self, chain): """ Get the voms attributes for a chain """ return VOMS().getVOMSAttributes(chain) def getUploadedProxyLifeTime(self, DN, group): """ Get the remaining seconds for an uploaded proxy """ result = self.getDBContents({'UserDN': [DN], 'UserGroup': [group]}) if not result['OK']: return result data = result['Value'] if len(data['Records']) == 0: return S_OK(0) pNames = list(data['ParameterNames']) dnPos = pNames.index('UserDN') groupPos = pNames.index('UserGroup') expiryPos = pNames.index('ExpirationTime') for row in data['Records']: if DN == row[dnPos] and group == row[groupPos]: td = row[expiryPos] - datetime.datetime.utcnow() secondsLeft = td.days * 86400 + td.seconds return S_OK(max(0, secondsLeft)) return S_OK(0) def getUserProxiesInfo(self): """ Get the user proxies uploaded info """ result = RPCClient("Framework/ProxyManager", timeout=120).getUserProxiesInfo() if 'rpcStub' in result: result.pop('rpcStub') return result
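Every method above returns a result dictionary rather than raising, and the caches above compare expiry timestamps by hand. The following is a runnable sketch of both conventions; `S_OK`/`S_ERROR` here are simplified stand-ins for the real DIRAC helpers, and `secondsLeftToExpiration` mirrors the arithmetic in `__getSecondsLeftToExpiration`.

```python
import datetime

# Simplified stand-ins for DIRAC's S_OK/S_ERROR result-dict helpers.
def S_OK(value=None):
    return {'OK': True, 'Value': value}

def S_ERROR(message=''):
    return {'OK': False, 'Message': message}

def secondsLeftToExpiration(expiration, utc=True):
    # Same arithmetic as __getSecondsLeftToExpiration: a timedelta only
    # stores days/seconds/microseconds, so days are folded in explicitly.
    now = datetime.datetime.utcnow() if utc else datetime.datetime.now()
    td = expiration - now
    return td.days * 86400 + td.seconds

expiry = datetime.datetime.utcnow() + datetime.timedelta(days=1, seconds=30)
result = S_OK(secondsLeftToExpiration(expiry))
```

Callers always check `result['OK']` before touching `result['Value']`, which is why the `if not retVal['OK']: return retVal` guards recur throughout the class.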
class GridPilotDirector(PilotDirector): """ Base Grid PilotDirector class Derived classes must declare: * self.Middleware: It must correspond to the string before "PilotDirector". (For proper naming of the logger) * self.ResourceBrokers: list of Brokers used by the Director. (For proper error reporting) """ def __init__(self, submitPool): """ Define some defaults and call parent __init__ """ self.gridEnv = GRIDENV self.cpuPowerRef = CPU_POWER_REF self.requirements = REQUIREMENTS self.rank = RANK self.fuzzyRank = FUZZY_RANK self.__failingWMSCache = DictCache() self.__ticketsWMSCache = DictCache() self.__listMatchWMSCache = DictCache() PilotDirector.__init__(self, submitPool) def configure(self, csSection, submitPool): """ Here goes common configuration for all Grid PilotDirectors """ PilotDirector.configure(self, csSection, submitPool) self.reloadConfiguration(csSection, submitPool) self.__failingWMSCache.purgeExpired() self.__ticketsWMSCache.purgeExpired() for rb in self.__failingWMSCache.getKeys(): if rb in self.resourceBrokers: try: self.resourceBrokers.remove(rb) except: pass self.resourceBrokers = List.randomize(self.resourceBrokers) if self.gridEnv: self.log.info(' GridEnv: ', self.gridEnv) if self.resourceBrokers: self.log.info(' ResourceBrokers:', ', '.join(self.resourceBrokers)) def configureFromSection(self, mySection): """ reload from CS """ PilotDirector.configureFromSection(self, mySection) self.gridEnv = gConfig.getValue(mySection + '/GridEnv', self.gridEnv) if not self.gridEnv: # No specific option found, try a general one setup = gConfig.getValue('/DIRAC/Setup', '') if setup: instance = gConfig.getValue( '/DIRAC/Setups/%s/WorkloadManagement' % setup, '') if instance: self.gridEnv = gConfig.getValue( '/Systems/WorkloadManagement/%s/GridEnv' % instance, '') self.resourceBrokers = gConfig.getValue(mySection + '/ResourceBrokers', self.resourceBrokers) self.cpuPowerRef = gConfig.getValue(mySection + '/CPUPowerRef', self.cpuPowerRef) self.requirements = 
gConfig.getValue(mySection + '/Requirements', self.requirements) self.rank = gConfig.getValue(mySection + '/Rank', self.rank) self.fuzzyRank = gConfig.getValue(mySection + '/FuzzyRank', self.fuzzyRank) def _submitPilots(self, workDir, taskQueueDict, pilotOptions, pilotsToSubmit, ceMask, submitPrivatePilot, privateTQ, proxy, pilotsPerJob): """ This method does the actual pilot submission to the Grid RB The logic is as follows: - If there are no available RBs, it returns an error - If there is no VOMS extension in the proxy, it returns an error - It creates a temp directory - It prepares a JDL; part of it is common to gLite and LCG (the payload description) and part is specific to each middleware """ taskQueueID = taskQueueDict['TaskQueueID'] # ownerDN = taskQueueDict['OwnerDN'] credDict = proxy.getCredentials()['Value'] ownerDN = credDict['identity'] ownerGroup = credDict['group'] if not self.resourceBrokers: # Since we can exclude RBs from the list, it may become empty return S_ERROR(ERROR_RB) # Need to get VOMS extension for the later interactions with WMS ret = gProxyManager.getVOMSAttributes(proxy) if not ret['OK']: self.log.error(ERROR_VOMS, ret['Message']) return S_ERROR(ERROR_VOMS) if not ret['Value']: return S_ERROR(ERROR_VOMS) workingDirectory = tempfile.mkdtemp(prefix='TQ_%s_' % taskQueueID, dir=workDir) self.log.verbose('Using working Directory:', workingDirectory) # Write JDL retDict = self._prepareJDL(taskQueueDict, workingDirectory, pilotOptions, pilotsPerJob, ceMask, submitPrivatePilot, privateTQ) jdl = retDict['JDL'] pilotRequirements = retDict['Requirements'] rb = retDict['RB'] if not jdl: try: shutil.rmtree(workingDirectory) except: pass return S_ERROR(ERROR_JDL) # Check that there are available queues for the Job: if self.enableListMatch: availableCEs = [] now = Time.dateTime() availableCEs = self.listMatchCache.get(pilotRequirements) if availableCEs is None: availableCEs = self._listMatch(proxy, jdl, taskQueueID, rb) if availableCEs != False: 
self.log.verbose('LastListMatch', now) self.log.verbose('AvailableCEs ', availableCEs) self.listMatchCache.add( pilotRequirements, self.listMatchDelay * 60, value=availableCEs) # it is given in minutes if not availableCEs: try: shutil.rmtree(workingDirectory) except: pass return S_ERROR(ERROR_CE + ' TQ: %d' % taskQueueID) # Now we are ready for the actual submission, so self.log.verbose('Submitting Pilots for TaskQueue', taskQueueID) # FIXME: what is this?? If it goes on the super class, it is doomed submitRet = self._submitPilot(proxy, pilotsPerJob, jdl, taskQueueID, rb) try: shutil.rmtree(workingDirectory) except: pass if not submitRet: return S_ERROR('Pilot Submission Failed for TQ %d ' % taskQueueID) # pilotReference, resourceBroker = submitRet submittedPilots = 0 if pilotsPerJob != 1 and len(submitRet) != pilotsPerJob: # Parametric jobs are used for pilotReference, resourceBroker in submitRet: pilotReference = self._getChildrenReferences( proxy, pilotReference, taskQueueID) submittedPilots += len(pilotReference) pilotAgentsDB.addPilotTQReference(pilotReference, taskQueueID, ownerDN, ownerGroup, resourceBroker, self.gridMiddleware, pilotRequirements) else: for pilotReference, resourceBroker in submitRet: pilotReference = [pilotReference] submittedPilots += len(pilotReference) pilotAgentsDB.addPilotTQReference(pilotReference, taskQueueID, ownerDN, ownerGroup, resourceBroker, self.gridMiddleware, pilotRequirements) # add some sleep here time.sleep(0.1 * submittedPilots) if pilotsToSubmit > pilotsPerJob: # Additional submissions are necessary, need to get a new token and iterate. 
pilotsToSubmit -= pilotsPerJob result = gProxyManager.requestToken( ownerDN, ownerGroup, max(pilotsToSubmit, self.maxJobsInFillMode)) if not result['OK']: self.log.error(ERROR_TOKEN, result['Message']) result = S_ERROR(ERROR_TOKEN) result['Value'] = submittedPilots return result (token, numberOfUses) = result['Value'] pilotOptions = [option for option in pilotOptions if not option.startswith('-o /Security/ProxyToken=')] pilotOptions.append('-o /Security/ProxyToken=%s' % token) pilotsPerJob = max( 1, min(pilotsPerJob, int(numberOfUses / self.maxJobsInFillMode))) result = self._submitPilots(workDir, taskQueueDict, pilotOptions, pilotsToSubmit, ceMask, submitPrivatePilot, privateTQ, proxy, pilotsPerJob) if not result['OK']: if 'Value' not in result: result['Value'] = 0 result['Value'] += submittedPilots return result submittedPilots += result['Value'] return S_OK(submittedPilots) def _prepareJDL(self, taskQueueDict, workingDirectory, pilotOptions, pilotsToSubmit, ceMask, submitPrivatePilot, privateTQ): """ This method should be overridden in a subclass """ self.log.error( '_prepareJDL() method should be implemented in a subclass') sys.exit() def _JobJDL(self, taskQueueDict, pilotOptions, ceMask): """ The Job JDL is the same for LCG and GLite """ pilotJDL = 'Executable = "%s";\n' % os.path.basename(self.pilot) executable = self.pilot pilotJDL += 'Arguments = "%s";\n' % ' '.join(pilotOptions) pilotJDL += 'CPUTimeRef = %s;\n' % taskQueueDict['CPUTime'] pilotJDL += 'CPUPowerRef = %s;\n' % self.cpuPowerRef pilotJDL += """CPUWorkRef = real( CPUTimeRef * CPUPowerRef ); Lookup = "CPUScalingReferenceSI00=*"; cap = isList( other.GlueCECapability ) ? other.GlueCECapability : { "dummy" }; i0 = regexp( Lookup, cap[0] ) ? 0 : undefined; i1 = isString( cap[1] ) && regexp( Lookup, cap[1] ) ? 1 : i0; i2 = isString( cap[2] ) && regexp( Lookup, cap[2] ) ? 2 : i1; i3 = isString( cap[3] ) && regexp( Lookup, cap[3] ) ? 3 : i2; i4 = isString( cap[4] ) && regexp( Lookup, cap[4] ) ? 
4 : i3; i5 = isString( cap[5] ) && regexp( Lookup, cap[5] ) ? 5 : i4; index = isString( cap[6] ) && regexp( Lookup, cap[6] ) ? 6 : i5; i = isUndefined( index ) ? 0 : index; QueuePowerRef = real( ! isUndefined( index ) ? int( substr( cap[i], size( Lookup ) - 1 ) ) : other.GlueHostBenchmarkSI00 ); QueueTimeRef = real( other.GlueCEPolicyMaxCPUTime * 60 ); QueueWorkRef = QueuePowerRef * QueueTimeRef; """ requirements = list(self.requirements) if 'GridCEs' in taskQueueDict and taskQueueDict['GridCEs']: # if there an explicit Grig CE requested by the TQ, remove the Ranking requirement for req in self.requirements: if req.strip().lower()[:6] == 'rank >': requirements.remove(req) requirements.append('QueueWorkRef > CPUWorkRef') siteRequirements = '\n || '.join( ['other.GlueCEInfoHostName == "%s"' % s for s in ceMask]) requirements.append("( %s\n )" % siteRequirements) pilotRequirements = '\n && '.join(requirements) pilotJDL += 'pilotRequirements = %s;\n' % pilotRequirements pilotJDL += 'Rank = %s;\n' % self.rank pilotJDL += 'FuzzyRank = %s;\n' % self.fuzzyRank pilotJDL += 'StdOutput = "%s";\n' % outputSandboxFiles[0] pilotJDL += 'StdError = "%s";\n' % outputSandboxFiles[1] pilotJDL += 'InputSandbox = { "%s" };\n' % '", "'.join( [self.install, executable] + self.extraModules) pilotJDL += 'OutputSandbox = { %s };\n' % ', '.join( ['"%s"' % f for f in outputSandboxFiles]) self.log.verbose(pilotJDL) return (pilotJDL, pilotRequirements) def parseListMatchStdout(self, proxy, cmd, taskQueueID, rb): """ Parse List Match stdout to return list of matched CE's """ self.log.verbose('Executing List Match for TaskQueue', taskQueueID) start = time.time() ret = executeGridCommand(proxy, cmd, self.gridEnv) if not ret['OK']: self.log.error('Failed to execute List Match:', ret['Message']) self.__sendErrorMail(rb, 'List Match', cmd, ret, proxy) return False if ret['Value'][0] != 0: self.log.error('Error executing List Match:', str(ret['Value'][0]) + '\n'.join(ret['Value'][1:3])) 
self.__sendErrorMail(rb, 'List Match', cmd, ret, proxy) return False self.log.info('List Match Execution Time: %.2f for TaskQueue %d' % ((time.time() - start), taskQueueID)) stdout = ret['Value'][1] stderr = ret['Value'][2] availableCEs = [] # Parse std.out for line in List.fromChar(stdout, '\n'): if re.search('/jobmanager-', line) or re.search('/cream-', line): # TODO: the line has to be stripped of extra info availableCEs.append(line) if not availableCEs: self.log.info('List-Match failed to find CEs for TaskQueue', taskQueueID) self.log.info(stdout) self.log.info(stderr) else: self.log.debug('List-Match returns:', str(ret['Value'][0]) + '\n'.join(ret['Value'][1:3])) self.log.info( 'List-Match found %s CEs for TaskQueue' % len(availableCEs), taskQueueID) self.log.verbose(', '.join(availableCEs)) return availableCEs def parseJobSubmitStdout(self, proxy, cmd, taskQueueID, rb): """ Parse Job Submit stdout to return pilot reference """ start = time.time() self.log.verbose('Executing Job Submit for TaskQueue', taskQueueID) ret = executeGridCommand(proxy, cmd, self.gridEnv) if not ret['OK']: self.log.error('Failed to execute Job Submit:', ret['Message']) self.__sendErrorMail(rb, 'Job Submit', cmd, ret, proxy) return False if ret['Value'][0] != 0: self.log.error('Error executing Job Submit:', str(ret['Value'][0]) + '\n'.join(ret['Value'][1:3])) self.__sendErrorMail(rb, 'Job Submit', cmd, ret, proxy) return False self.log.info('Job Submit Execution Time: %.2f for TaskQueue %d' % ((time.time() - start), taskQueueID)) stdout = ret['Value'][1] failed = 1 rb = '' for line in List.fromChar(stdout, '\n'): m = re.search(r"(https:\S+)", line) if m: glite_id = m.group(1) if not rb: m = re.search(r"https://(.+):.+", glite_id) rb = m.group(1) failed = 0 if failed: self.log.error('Job Submit returns no Reference:', str(ret['Value'][0]) + '\n'.join(ret['Value'][1:3])) return False self.log.info('Reference %s for TaskQueue %s' % (glite_id, taskQueueID)) return glite_id, rb def 
_writeJDL(self, filename, jdlList): try: f = open(filename, 'w') f.write('\n'.join(jdlList)) f.close() except Exception as x: self.log.exception(x) return '' return filename def __sendErrorMail(self, rb, name, command, result, proxy): """ In case or error with RB/WM: - check if RB/WMS still in use - remove RB/WMS from current list - check if RB/WMS not in cache - add RB/WMS to cache - send Error mail """ if rb in self.resourceBrokers: try: self.resourceBrokers.remove(rb) self.log.info('Removed RB from list', rb) except: pass if not self.__failingWMSCache.exists(rb): self.__failingWMSCache.add( rb, self.errorClearTime) # disable for 30 minutes mailAddress = self.errorMailAddress msg = '' if not result['OK']: subject = "%s: timeout executing %s" % (rb, name) msg += '\n%s' % result['Message'] elif result['Value'][0] != 0: if re.search('the server is temporarily drained', ' '.join(result['Value'][1:3])): return if re.search('System load is too high:', ' '.join(result['Value'][1:3])): return subject = "%s: error executing %s" % (rb, name) else: return msg += ' '.join(command) msg += '\nreturns: %s\n' % str(result['Value'][0]) + '\n'.join( result['Value'][1:3]) msg += '\nUsing Proxy:\n' + getProxyInfoAsString( proxy)['Value'] #msg += '\nUsing Proxy:\n' + gProxyManager. ticketTime = self.errorClearTime + self.errorTicketTime if self.__ticketsWMSCache.exists(rb): mailAddress = self.alarmMailAddress # the RB was already detected failing a short time ago msg = 'Submit GGUS Ticket for this error if not already opened\n' + \ 'It has been failing at least for %s hours\n' % ( ticketTime / 60 / 60 ) + msg else: self.__ticketsWMSCache.add(rb, ticketTime) if mailAddress: result = NotificationClient().sendMail( mailAddress, subject, msg, fromAddress=self.mailFromAddress) if not result['OK']: self.log.error("Mail could not be sent") return
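The director above quarantines misbehaving brokers: `__sendErrorMail` adds a failing RB to a `DictCache` with a TTL (`errorClearTime`), and `configure` then drops any cached RB from the submission list until the entry expires. Below is a runnable sketch of that pattern; `TTLCache` is a simplified stand-in for DIRAC's `DictCache`, and the broker host names are hypothetical.

```python
import time

class TTLCache:
    """Simplified stand-in for DIRAC's DictCache: entries expire after a TTL."""
    def __init__(self):
        self._data = {}  # key -> (expiry timestamp, value)

    def add(self, key, ttl, value=None):
        self._data[key] = (time.time() + ttl, value)

    def exists(self, key, validSeconds=0):
        # True only if the entry is still valid for at least validSeconds.
        entry = self._data.get(key)
        return bool(entry) and entry[0] - time.time() >= validSeconds

    def purgeExpired(self):
        now = time.time()
        self._data = {k: v for k, v in self._data.items() if v[0] > now}

    def getKeys(self):
        return list(self._data)

# Failing-broker pattern from configure()/__sendErrorMail():
failingWMSCache = TTLCache()
resourceBrokers = ['rb1.example.org', 'rb2.example.org']  # hypothetical hosts

failingWMSCache.add('rb1.example.org', 1800)  # disable for 30 minutes
failingWMSCache.purgeExpired()
for rb in failingWMSCache.getKeys():
    if rb in resourceBrokers:
        resourceBrokers.remove(rb)
```

The second-level `__ticketsWMSCache` follows the same shape with a longer TTL, which is how a broker that keeps failing past `errorClearTime + errorTicketTime` escalates from a notification to an alarm mail.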
class ProxyManagerClient: __metaclass__ = DIRACSingleton.DIRACSingleton def __init__( self ): self.__usersCache = DictCache() self.__proxiesCache = DictCache() self.__vomsProxiesCache = DictCache() self.__pilotProxiesCache = DictCache() self.__filesCache = DictCache( self.__deleteTemporalFile ) def __deleteTemporalFile( self, filename ): try: os.unlink( filename ) except: pass def clearCaches( self ): self.__usersCache.purgeAll() self.__proxiesCache.purgeAll() self.__vomsProxiesCache.purgeAll() self.__pilotProxiesCache.purgeAll() def __getSecondsLeftToExpiration( self, expiration, utc = True ): if utc: td = expiration - datetime.datetime.utcnow() else: td = expiration - datetime.datetime.now() return td.days * 86400 + td.seconds def __refreshUserCache( self, validSeconds = 0 ): rpcClient = RPCClient( "Framework/ProxyManager", timeout = 120 ) retVal = rpcClient.getRegisteredUsers( validSeconds ) if not retVal[ 'OK' ]: return retVal data = retVal[ 'Value' ] #Update the cache for record in data: cacheKey = ( record[ 'DN' ], record[ 'group' ] ) self.__usersCache.add( cacheKey, self.__getSecondsLeftToExpiration( record[ 'expirationtime' ] ), record ) return S_OK() @gUsersSync def userHasProxy( self, userDN, userGroup, validSeconds = 0 ): """ Check if a user(DN-group) has a proxy in the proxy management - Updates internal cache if needed to minimize queries to the service """ cacheKey = ( userDN, userGroup ) if self.__usersCache.exists( cacheKey, validSeconds ): return S_OK( True ) #Get list of users from the DB with proxys at least 300 seconds gLogger.verbose( "Updating list of users in proxy management" ) retVal = self.__refreshUserCache( validSeconds ) if not retVal[ 'OK' ]: return retVal return S_OK( self.__usersCache.exists( cacheKey, validSeconds ) ) @gUsersSync def getUserPersistence( self, userDN, userGroup, validSeconds = 0 ): """ Check if a user(DN-group) has a proxy in the proxy management - Updates internal cache if needed to minimize queries to the service 
""" cacheKey = ( userDN, userGroup ) userData = self.__usersCache.get( cacheKey, validSeconds ) if userData: if userData[ 'persistent' ]: return S_OK( True ) #Get list of users from the DB with proxys at least 300 seconds gLogger.verbose( "Updating list of users in proxy management" ) retVal = self.__refreshUserCache( validSeconds ) if not retVal[ 'OK' ]: return retVal userData = self.__usersCache.get( cacheKey, validSeconds ) if userData: return S_OK( userData[ 'persistent' ] ) return S_OK( False ) def setPersistency( self, userDN, userGroup, persistent ): """ Set the persistency for user/group """ #Hack to ensure bool in the rpc call persistentFlag = True if not persistent: persistentFlag = False rpcClient = RPCClient( "Framework/ProxyManager", timeout = 120 ) retVal = rpcClient.setPersistency( userDN, userGroup, persistentFlag ) if not retVal[ 'OK' ]: return retVal #Update internal persistency cache cacheKey = ( userDN, userGroup ) record = self.__usersCache.get( cacheKey, 0 ) if record: record[ 'persistent' ] = persistentFlag self.__usersCache.add( cacheKey, self.__getSecondsLeftToExpiration( record[ 'expirationtime' ] ), record ) return retVal def uploadProxy( self, proxy = False, diracGroup = False, chainToConnect = False, restrictLifeTime = 0, rfcIfPossible = False ): """ Upload a proxy to the proxy management service using delegation """ #Discover proxy location if type( proxy ) == g_X509ChainType: chain = proxy proxyLocation = "" else: if not proxy: proxyLocation = Locations.getProxyLocation() if not proxyLocation: return S_ERROR( "Can't find a valid proxy" ) elif isinstance( proxy, basestring ): proxyLocation = proxy else: return S_ERROR( "Can't find a valid proxy" ) chain = X509Chain() result = chain.loadProxyFromFile( proxyLocation ) if not result[ 'OK' ]: return S_ERROR( "Can't load %s: %s " % ( proxyLocation, result[ 'Message' ] ) ) if not chainToConnect: chainToConnect = chain #Make sure it's valid if chain.hasExpired()[ 'Value' ]: return S_ERROR( 
"Proxy %s has expired" % proxyLocation ) #rpcClient = RPCClient( "Framework/ProxyManager", proxyChain = chainToConnect ) rpcClient = RPCClient( "Framework/ProxyManager", timeout = 120 ) #Get a delegation request result = rpcClient.requestDelegationUpload( chain.getRemainingSecs()['Value'], diracGroup ) if not result[ 'OK' ]: return result #Check if the delegation has been granted if 'Value' not in result or not result[ 'Value' ]: if 'proxies' in result: return S_OK( result[ 'proxies' ] ) else: return S_OK() reqDict = result[ 'Value' ] #Generate delegated chain chainLifeTime = chain.getRemainingSecs()[ 'Value' ] - 60 if restrictLifeTime and restrictLifeTime < chainLifeTime: chainLifeTime = restrictLifeTime retVal = chain.generateChainFromRequestString( reqDict[ 'request' ], lifetime = chainLifeTime, diracGroup = diracGroup, rfc = rfcIfPossible) if not retVal[ 'OK' ]: return retVal #Upload! result = rpcClient.completeDelegationUpload( reqDict[ 'id' ], retVal[ 'Value' ] ) if not result[ 'OK' ]: return result if 'proxies' in result: return S_OK( result[ 'proxies' ] ) return S_OK() @gProxiesSync def downloadProxy( self, userDN, userGroup, limited = False, requiredTimeLeft = 1200, cacheTime = 43200, proxyToConnect = False, token = False ): """ Get a proxy Chain from the proxy management """ cacheKey = ( userDN, userGroup ) if self.__proxiesCache.exists( cacheKey, requiredTimeLeft ): return S_OK( self.__proxiesCache.get( cacheKey ) ) req = X509Request() req.generateProxyRequest( limited = limited ) if proxyToConnect: rpcClient = RPCClient( "Framework/ProxyManager", proxyChain = proxyToConnect, timeout = 120 ) else: rpcClient = RPCClient( "Framework/ProxyManager", timeout = 120 ) if token: retVal = rpcClient.getProxyWithToken( userDN, userGroup, req.dumpRequest()['Value'], long( cacheTime + requiredTimeLeft ), token ) else: retVal = rpcClient.getProxy( userDN, userGroup, req.dumpRequest()['Value'], long( cacheTime + requiredTimeLeft ) ) if not retVal[ 'OK' ]: return retVal 
chain = X509Chain( keyObj = req.getPKey() ) retVal = chain.loadChainFromString( retVal[ 'Value' ] ) if not retVal[ 'OK' ]: return retVal self.__proxiesCache.add( cacheKey, chain.getRemainingSecs()['Value'], chain ) return S_OK( chain ) def downloadProxyToFile( self, userDN, userGroup, limited = False, requiredTimeLeft = 1200, cacheTime = 43200, filePath = False, proxyToConnect = False, token = False ): """ Get a proxy Chain from the proxy management and write it to file """ retVal = self.downloadProxy( userDN, userGroup, limited, requiredTimeLeft, cacheTime, proxyToConnect, token ) if not retVal[ 'OK' ]: return retVal chain = retVal[ 'Value' ] retVal = self.dumpProxyToFile( chain, filePath ) if not retVal[ 'OK' ]: return retVal retVal[ 'chain' ] = chain return retVal @gVOMSProxiesSync def downloadVOMSProxy( self, userDN, userGroup, limited = False, requiredTimeLeft = 1200, cacheTime = 43200, requiredVOMSAttribute = False, proxyToConnect = False, token = False ): """ Download a proxy if needed and transform it into a VOMS one """ cacheKey = ( userDN, userGroup, requiredVOMSAttribute, limited ) if self.__vomsProxiesCache.exists( cacheKey, requiredTimeLeft ): return S_OK( self.__vomsProxiesCache.get( cacheKey ) ) req = X509Request() req.generateProxyRequest( limited = limited ) if proxyToConnect: rpcClient = RPCClient( "Framework/ProxyManager", proxyChain = proxyToConnect, timeout = 120 ) else: rpcClient = RPCClient( "Framework/ProxyManager", timeout = 120 ) if token: retVal = rpcClient.getVOMSProxyWithToken( userDN, userGroup, req.dumpRequest()['Value'], long( cacheTime + requiredTimeLeft ), token, requiredVOMSAttribute ) else: retVal = rpcClient.getVOMSProxy( userDN, userGroup, req.dumpRequest()['Value'], long( cacheTime + requiredTimeLeft ), requiredVOMSAttribute ) if not retVal[ 'OK' ]: return retVal chain = X509Chain( keyObj = req.getPKey() ) retVal = chain.loadChainFromString( retVal[ 'Value' ] ) if not retVal[ 'OK' ]: return retVal self.__vomsProxiesCache.add( 
cacheKey, chain.getRemainingSecs()['Value'], chain ) return S_OK( chain ) def downloadVOMSProxyToFile( self, userDN, userGroup, limited = False, requiredTimeLeft = 1200, cacheTime = 43200, requiredVOMSAttribute = False, filePath = False, proxyToConnect = False, token = False ): """ Download a proxy if needed, transform it into a VOMS one and write it to file """ retVal = self.downloadVOMSProxy( userDN, userGroup, limited, requiredTimeLeft, cacheTime, requiredVOMSAttribute, proxyToConnect, token ) if not retVal[ 'OK' ]: return retVal chain = retVal[ 'Value' ] retVal = self.dumpProxyToFile( chain, filePath ) if not retVal[ 'OK' ]: return retVal retVal[ 'chain' ] = chain return retVal def getPilotProxyFromDIRACGroup( self, userDN, userGroup, requiredTimeLeft = 43200, proxyToConnect = False ): """ Download a pilot proxy with VOMS extensions depending on the group """ #Assign VOMS attribute vomsAttr = CS.getVOMSAttributeForGroup( userGroup ) if not vomsAttr: gLogger.verbose( "No voms attribute assigned to group %s when requested pilot proxy" % userGroup ) return self.downloadProxy( userDN, userGroup, limited = False, requiredTimeLeft = requiredTimeLeft, proxyToConnect = proxyToConnect ) else: return self.downloadVOMSProxy( userDN, userGroup, limited = False, requiredTimeLeft = requiredTimeLeft, requiredVOMSAttribute = vomsAttr, proxyToConnect = proxyToConnect ) def getPilotProxyFromVOMSGroup( self, userDN, vomsAttr, requiredTimeLeft = 43200, proxyToConnect = False ): """ Download a pilot proxy with VOMS extensions depending on the group """ groups = CS.getGroupsWithVOMSAttribute( vomsAttr ) if not groups: return S_ERROR( "No group found that has %s as voms attrs" % vomsAttr ) for userGroup in groups: result = self.downloadVOMSProxy( userDN, userGroup, limited = False, requiredTimeLeft = requiredTimeLeft, requiredVOMSAttribute = vomsAttr, proxyToConnect = proxyToConnect ) if result['OK']: return result return result def getPayloadProxyFromDIRACGroup( self, userDN, 
userGroup, requiredTimeLeft, token = False, proxyToConnect = False ): """ Download a payload proxy with VOMS extensions depending on the group """ #Assign VOMS attribute vomsAttr = CS.getVOMSAttributeForGroup( userGroup ) if not vomsAttr: gLogger.verbose( "No voms attribute assigned to group %s when requested payload proxy" % userGroup ) return self.downloadProxy( userDN, userGroup, limited = True, requiredTimeLeft = requiredTimeLeft, proxyToConnect = proxyToConnect, token = token ) else: return self.downloadVOMSProxy( userDN, userGroup, limited = True, requiredTimeLeft = requiredTimeLeft, requiredVOMSAttribute = vomsAttr, proxyToConnect = proxyToConnect, token = token ) def getPayloadProxyFromVOMSGroup( self, userDN, vomsAttr, token, requiredTimeLeft, proxyToConnect = False ): """ Download a payload proxy with VOMS extensions depending on the VOMS attr """ groups = CS.getGroupsWithVOMSAttribute( vomsAttr ) if not groups: return S_ERROR( "No group found that has %s as voms attrs" % vomsAttr ) userGroup = groups[0] return self.downloadVOMSProxy( userDN, userGroup, limited = True, requiredTimeLeft = requiredTimeLeft, requiredVOMSAttribute = vomsAttr, proxyToConnect = proxyToConnect, token = token ) def dumpProxyToFile( self, chain, destinationFile = False, requiredTimeLeft = 600 ): """ Dump a proxy to a file. 
It's cached so multiple calls won't generate extra files """ result = chain.hash() if not result[ 'OK' ]: return result cHash = result[ 'Value' ] if self.__filesCache.exists( cHash, requiredTimeLeft ): filepath = self.__filesCache.get( cHash ) if filepath and os.path.isfile( filepath ): return S_OK( filepath ) self.__filesCache.delete( cHash ) retVal = chain.dumpAllToFile( destinationFile ) if not retVal[ 'OK' ]: return retVal filename = retVal[ 'Value' ] self.__filesCache.add( cHash, chain.getRemainingSecs()['Value'], filename ) return S_OK( filename ) def deleteGeneratedProxyFile( self, chain ): """ Delete a file generated by a dump """ self.__filesCache.delete( chain ) return S_OK() def requestToken( self, requesterDN, requesterGroup, numUses = 1 ): """ Request a number of tokens. usesList must be a list of integers and each integer is the number of uses a token must have """ rpcClient = RPCClient( "Framework/ProxyManager", timeout = 120 ) return rpcClient.generateToken( requesterDN, requesterGroup, numUses ) def renewProxy( self, proxyToBeRenewed = False, minLifeTime = 3600, newProxyLifeTime = 43200, proxyToConnect = False ): """ Renew a proxy using the ProxyManager Arguments: proxyToBeRenewed : proxy to renew minLifeTime : if proxy life time is less than this, renew. 
Skip otherwise newProxyLifeTime : life time of new proxy proxyToConnect : proxy to use for connecting to the service """ retVal = multiProxyArgument( proxyToBeRenewed ) if not retVal[ 'OK' ]: return retVal proxyToRenewDict = retVal[ 'Value' ] secs = proxyToRenewDict[ 'chain' ].getRemainingSecs()[ 'Value' ] if secs > minLifeTime: deleteMultiProxy( proxyToRenewDict ) return S_OK() if not proxyToConnect: proxyToConnectDict = { 'chain': False, 'tempFile': False } else: retVal = multiProxyArgument( proxyToConnect ) if not retVal[ 'OK' ]: deleteMultiProxy( proxyToRenewDict ) return retVal proxyToConnectDict = retVal[ 'Value' ] userDN = proxyToRenewDict[ 'chain' ].getIssuerCert()[ 'Value' ].getSubjectDN()[ 'Value' ] retVal = proxyToRenewDict[ 'chain' ].getDIRACGroup() if not retVal[ 'OK' ]: deleteMultiProxy( proxyToRenewDict ) deleteMultiProxy( proxyToConnectDict ) return retVal userGroup = retVal[ 'Value' ] limited = proxyToRenewDict[ 'chain' ].isLimitedProxy()[ 'Value' ] voms = VOMS() retVal = voms.getVOMSAttributes( proxyToRenewDict[ 'chain' ] ) if not retVal[ 'OK' ]: deleteMultiProxy( proxyToRenewDict ) deleteMultiProxy( proxyToConnectDict ) return retVal vomsAttrs = retVal[ 'Value' ] if vomsAttrs: retVal = self.downloadVOMSProxy( userDN, userGroup, limited = limited, requiredTimeLeft = newProxyLifeTime, requiredVOMSAttribute = vomsAttrs[0], proxyToConnect = proxyToConnectDict[ 'chain' ] ) else: retVal = self.downloadProxy( userDN, userGroup, limited = limited, requiredTimeLeft = newProxyLifeTime, proxyToConnect = proxyToConnectDict[ 'chain' ] ) deleteMultiProxy( proxyToRenewDict ) deleteMultiProxy( proxyToConnectDict ) if not retVal[ 'OK' ]: return retVal chain = retVal['Value'] if not proxyToRenewDict[ 'tempFile' ]: return chain.dumpAllToFile( proxyToRenewDict[ 'file' ] ) return S_OK( chain ) def getDBContents( self, condDict = {} ): """ Get the contents of the db """ rpcClient = RPCClient( "Framework/ProxyManager", timeout = 120 ) return 
rpcClient.getContents( condDict, [ [ 'UserDN', 'DESC' ] ], 0, 0 ) def getVOMSAttributes( self, chain ): """ Get the voms attributes for a chain """ return VOMS().getVOMSAttributes( chain ) def getUploadedProxyLifeTime( self, DN, group ): """ Get the remaining seconds for an uploaded proxy """ result = self.getDBContents( { 'UserDN' : [ DN ], 'UserGroup' : [ group ] } ) if not result[ 'OK' ]: return result data = result[ 'Value' ] if len( data[ 'Records' ] ) == 0: return S_OK( 0 ) pNames = list( data[ 'ParameterNames' ] ) dnPos = pNames.index( 'UserDN' ) groupPos = pNames.index( 'UserGroup' ) expiryPos = pNames.index( 'ExpirationTime' ) for row in data[ 'Records' ]: if DN == row[ dnPos ] and group == row[ groupPos ]: td = row[ expiryPos ] - datetime.datetime.utcnow() secondsLeft = td.days * 86400 + td.seconds return S_OK( max( 0, secondsLeft ) ) return S_OK( 0 ) def getUserProxiesInfo( self ): """ Get the user proxies uploaded info """ result = RPCClient( "Framework/ProxyManager", timeout = 120 ).getUserProxiesInfo() if 'rpcStub' in result: result.pop( 'rpcStub' ) return result
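`dumpProxyToFile` above avoids re-dumping the same chain by keying the generated file on a hash of the chain and reusing the cached path while the file still exists on disk. A runnable sketch of that idea, with a plain dict in place of `DictCache` (TTL handling omitted) and PEM text standing in for an `X509Chain`:

```python
import hashlib
import os
import tempfile

# hash -> filename; stand-in for self.__filesCache (no TTL for brevity).
_filesCache = {}

def dumpToFile(pemText):
    # Key the dumped file by a content hash, like chain.hash() in the source.
    cHash = hashlib.sha256(pemText.encode()).hexdigest()
    cached = _filesCache.get(cHash)
    if cached and os.path.isfile(cached):
        return cached  # reuse the previously dumped file
    fd, filename = tempfile.mkstemp(suffix='.pem')
    with os.fdopen(fd, 'w') as f:
        f.write(pemText)
    _filesCache[cHash] = filename
    return filename

first = dumpToFile('-----BEGIN CERTIFICATE-----\nexample\n')
second = dumpToFile('-----BEGIN CERTIFICATE-----\nexample\n')
```

The `os.path.isfile` check matters: the cache entry can outlive the file (someone may delete the temp file), so a stale entry is dropped and the proxy is dumped again, exactly as the `self.__filesCache.delete(cHash)` branch does.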
class GridPilotDirector(PilotDirector):
    """ Base Grid PilotDirector class.

        Derived classes must declare:

        - self.Middleware: must correspond to the string before "PilotDirector"
          (for proper naming of the logger)
        - self.ResourceBrokers: list of Brokers used by the Director
          (for proper error reporting)
    """

    def __init__(self, submitPool):
        """ Define some defaults and call parent __init__
        """
        self.gridEnv = GRIDENV
        self.cpuPowerRef = CPU_POWER_REF
        self.requirements = REQUIREMENTS
        self.rank = RANK
        self.fuzzyRank = FUZZY_RANK

        self.__failingWMSCache = DictCache()
        self.__ticketsWMSCache = DictCache()
        self.__listMatchWMSCache = DictCache()

        PilotDirector.__init__(self, submitPool)

    def configure(self, csSection, submitPool):
        """ Here goes the common configuration for all Grid PilotDirectors
        """
        PilotDirector.configure(self, csSection, submitPool)
        self.reloadConfiguration(csSection, submitPool)

        self.__failingWMSCache.purgeExpired()
        self.__ticketsWMSCache.purgeExpired()
        for rb in self.__failingWMSCache.getKeys():
            if rb in self.resourceBrokers:
                try:
                    self.resourceBrokers.remove(rb)
                except Exception:
                    pass

        self.resourceBrokers = List.randomize(self.resourceBrokers)

        if self.gridEnv:
            self.log.info(' GridEnv: ', self.gridEnv)
        if self.resourceBrokers:
            self.log.info(' ResourceBrokers:', ', '.join(self.resourceBrokers))

    def configureFromSection(self, mySection):
        """ Reload from the CS
        """
        PilotDirector.configureFromSection(self, mySection)

        self.gridEnv = gConfig.getValue(mySection + '/GridEnv', self.gridEnv)
        if not self.gridEnv:
            # No specific option found, try a general one
            setup = gConfig.getValue('/DIRAC/Setup', '')
            if setup:
                instance = gConfig.getValue('/DIRAC/Setups/%s/WorkloadManagement' % setup, '')
                if instance:
                    self.gridEnv = gConfig.getValue('/Systems/WorkloadManagement/%s/GridEnv' % instance, '')

        self.resourceBrokers = gConfig.getValue(mySection + '/ResourceBrokers', self.resourceBrokers)
        self.cpuPowerRef = gConfig.getValue(mySection + '/CPUPowerRef', self.cpuPowerRef)
        self.requirements = gConfig.getValue(mySection + '/Requirements', self.requirements)
        self.rank = gConfig.getValue(mySection + '/Rank', self.rank)
        self.fuzzyRank = gConfig.getValue(mySection + '/FuzzyRank', self.fuzzyRank)

    def _submitPilots(self, workDir, taskQueueDict, pilotOptions, pilotsToSubmit,
                      ceMask, submitPrivatePilot, privateTQ, proxy, pilotsPerJob):
        """ This method does the actual pilot submission to the Grid RB.

            The logic is as follows:

            - if there is no available RB, return an error
            - if there is no VOMS extension in the proxy, return an error
            - create a temporary directory
            - prepare a JDL: it has a part common to gLite and LCG
              (the payload description) and a part specific to each middleware
        """
        taskQueueID = taskQueueDict['TaskQueueID']
        # ownerDN = taskQueueDict['OwnerDN']
        credDict = proxy.getCredentials()['Value']
        ownerDN = credDict['identity']
        ownerGroup = credDict['group']

        if not self.resourceBrokers:
            # Since we can exclude RBs from the list, it may become empty
            return S_ERROR(ERROR_RB)

        # Need to get the VOMS extension for the later interactions with the WMS
        ret = gProxyManager.getVOMSAttributes(proxy)
        if not ret['OK']:
            self.log.error(ERROR_VOMS, ret['Message'])
            return S_ERROR(ERROR_VOMS)
        if not ret['Value']:
            return S_ERROR(ERROR_VOMS)

        workingDirectory = tempfile.mkdtemp(prefix='TQ_%s_' % taskQueueID, dir=workDir)
        self.log.verbose('Using working Directory:', workingDirectory)

        # Write JDL
        retDict = self._prepareJDL(taskQueueDict, workingDirectory, pilotOptions,
                                   pilotsPerJob, ceMask, submitPrivatePilot, privateTQ)
        jdl = retDict['JDL']
        pilotRequirements = retDict['Requirements']
        rb = retDict['RB']
        if not jdl:
            try:
                shutil.rmtree(workingDirectory)
            except Exception:
                pass
            return S_ERROR(ERROR_JDL)

        # Check that there are available queues for the job
        if self.enableListMatch:
            availableCEs = []
            now = Time.dateTime()
            availableCEs = self.listMatchCache.get(pilotRequirements)
            if availableCEs is None:
                availableCEs = self._listMatch(proxy, jdl, taskQueueID, rb)
                # False means the listMatch failed; an empty list is a valid (cacheable) result
                if availableCEs != False:
                    self.log.verbose('LastListMatch', now)
                    self.log.verbose('AvailableCEs ', availableCEs)
                    # listMatchDelay is given in minutes
                    self.listMatchCache.add(pilotRequirements, self.listMatchDelay * 60,
                                            value=availableCEs)
            if not availableCEs:
                try:
                    shutil.rmtree(workingDirectory)
                except Exception:
                    pass
                return S_ERROR(ERROR_CE + ' TQ: %d' % taskQueueID)

        # Now we are ready for the actual submission
        self.log.verbose('Submitting Pilots for TaskQueue', taskQueueID)

        # FIXME: what is this?? If it goes on the super class, it is doomed
        submitRet = self._submitPilot(proxy, pilotsPerJob, jdl, taskQueueID, rb)
        try:
            shutil.rmtree(workingDirectory)
        except Exception:
            pass
        if not submitRet:
            return S_ERROR('Pilot Submission Failed for TQ %d ' % taskQueueID)
        # pilotReference, resourceBroker = submitRet

        submittedPilots = 0
        if pilotsPerJob != 1 and len(submitRet) != pilotsPerJob:
            # Parametric jobs are used
            for pilotReference, resourceBroker in submitRet:
                pilotReference = self._getChildrenReferences(proxy, pilotReference, taskQueueID)
                submittedPilots += len(pilotReference)
                pilotAgentsDB.addPilotTQReference(pilotReference, taskQueueID,
                                                  ownerDN, ownerGroup, resourceBroker,
                                                  self.gridMiddleware, pilotRequirements)
        else:
            for pilotReference, resourceBroker in submitRet:
                pilotReference = [pilotReference]
                submittedPilots += len(pilotReference)
                pilotAgentsDB.addPilotTQReference(pilotReference, taskQueueID,
                                                  ownerDN, ownerGroup, resourceBroker,
                                                  self.gridMiddleware, pilotRequirements)

        # add some sleep here
        time.sleep(0.1 * submittedPilots)

        if pilotsToSubmit > pilotsPerJob:
            # Additional submissions are necessary, need to get a new token and iterate
            pilotsToSubmit -= pilotsPerJob
            result = gProxyManager.requestToken(ownerDN, ownerGroup,
                                                max(pilotsToSubmit, self.maxJobsInFillMode))
            if not result['OK']:
                self.log.error(ERROR_TOKEN, result['Message'])
                result = S_ERROR(ERROR_TOKEN)
                result['Value'] = submittedPilots
                return result
            (token, numberOfUses) = result['Value']
            for option in pilotOptions:
                if option.find('-o /Security/ProxyToken=') == 0:
                    pilotOptions.remove(option)
            pilotOptions.append('-o /Security/ProxyToken=%s' % token)
            pilotsPerJob = max(1, min(pilotsPerJob, int(numberOfUses / self.maxJobsInFillMode)))
            result = self._submitPilots(workDir, taskQueueDict, pilotOptions,
                                        pilotsToSubmit, ceMask, submitPrivatePilot,
                                        privateTQ, proxy, pilotsPerJob)
            if not result['OK']:
                if 'Value' not in result:
                    result['Value'] = 0
                result['Value'] += submittedPilots
                return result
            submittedPilots += result['Value']

        return S_OK(submittedPilots)

    def _prepareJDL(self, taskQueueDict, workingDirectory, pilotOptions, pilotsToSubmit,
                    ceMask, submitPrivatePilot, privateTQ):
        """ This method should be overridden in a subclass
        """
        self.log.error('_prepareJDL() method should be implemented in a subclass')
        sys.exit()

    def _JobJDL(self, taskQueueDict, pilotOptions, ceMask):
        """ The Job JDL is the same for LCG and gLite
        """
        pilotJDL = 'Executable = "%s";\n' % os.path.basename(self.pilot)
        executable = self.pilot

        pilotJDL += 'Arguments = "%s";\n' % ' '.join(pilotOptions)
        pilotJDL += 'CPUTimeRef = %s;\n' % taskQueueDict['CPUTime']
        pilotJDL += 'CPUPowerRef = %s;\n' % self.cpuPowerRef

        pilotJDL += """CPUWorkRef = real( CPUTimeRef * CPUPowerRef );
Lookup = "CPUScalingReferenceSI00=*";
cap = isList( other.GlueCECapability ) ? other.GlueCECapability : { "dummy" };
i0 = regexp( Lookup, cap[0] ) ? 0 : undefined;
i1 = isString( cap[1] ) && regexp( Lookup, cap[1] ) ? 1 : i0;
i2 = isString( cap[2] ) && regexp( Lookup, cap[2] ) ? 2 : i1;
i3 = isString( cap[3] ) && regexp( Lookup, cap[3] ) ? 3 : i2;
i4 = isString( cap[4] ) && regexp( Lookup, cap[4] ) ? 4 : i3;
i5 = isString( cap[5] ) && regexp( Lookup, cap[5] ) ? 5 : i4;
index = isString( cap[6] ) && regexp( Lookup, cap[6] ) ? 6 : i5;
i = isUndefined( index ) ? 0 : index;
QueuePowerRef = real( ! isUndefined( index ) ? int( substr( cap[i], size( Lookup ) - 1 ) ) : other.GlueHostBenchmarkSI00 );
QueueTimeRef = real( other.GlueCEPolicyMaxCPUTime * 60 );
QueueWorkRef = QueuePowerRef * QueueTimeRef;
"""

        requirements = list(self.requirements)
        if 'GridCEs' in taskQueueDict and taskQueueDict['GridCEs']:
            # if there is an explicit Grid CE requested by the TQ, remove the ranking requirement
            for req in self.requirements:
                if req.strip().lower()[:6] == 'rank >':
                    requirements.remove(req)

        requirements.append('QueueWorkRef > CPUWorkRef')

        siteRequirements = '\n || '.join(['other.GlueCEInfoHostName == "%s"' % s for s in ceMask])
        requirements.append("( %s\n )" % siteRequirements)

        pilotRequirements = '\n && '.join(requirements)

        pilotJDL += 'pilotRequirements = %s;\n' % pilotRequirements
        pilotJDL += 'Rank = %s;\n' % self.rank
        pilotJDL += 'FuzzyRank = %s;\n' % self.fuzzyRank
        pilotJDL += 'StdOutput = "%s";\n' % outputSandboxFiles[0]
        pilotJDL += 'StdError = "%s";\n' % outputSandboxFiles[1]
        pilotJDL += 'InputSandbox = { "%s" };\n' % '", "'.join([self.install, executable] + self.extraModules)
        pilotJDL += 'OutputSandbox = { %s };\n' % ', '.join(['"%s"' % f for f in outputSandboxFiles])

        self.log.verbose(pilotJDL)

        return (pilotJDL, pilotRequirements)

    def parseListMatchStdout(self, proxy, cmd, taskQueueID, rb):
        """ Parse List Match stdout to return the list of matched CEs
        """
        self.log.verbose('Executing List Match for TaskQueue', taskQueueID)

        start = time.time()
        ret = executeGridCommand(proxy, cmd, self.gridEnv)
        if not ret['OK']:
            self.log.error('Failed to execute List Match:', ret['Message'])
            self.__sendErrorMail(rb, 'List Match', cmd, ret, proxy)
            return False
        if ret['Value'][0] != 0:
            self.log.error('Error executing List Match:',
                           str(ret['Value'][0]) + '\n'.join(ret['Value'][1:3]))
            self.__sendErrorMail(rb, 'List Match', cmd, ret, proxy)
            return False
        self.log.info('List Match Execution Time: %.2f for TaskQueue %d' %
                      ((time.time() - start), taskQueueID))

        stdout = ret['Value'][1]
        stderr = ret['Value'][2]
        availableCEs = []
        # Parse stdout
        for line in List.fromChar(stdout, '\n'):
            if re.search('/jobmanager-', line) or re.search('/cream-', line):
                # TODO: the line has to be stripped from extra info
                availableCEs.append(line)

        if not availableCEs:
            self.log.info('List-Match failed to find CEs for TaskQueue', taskQueueID)
            self.log.info(stdout)
            self.log.info(stderr)
        else:
            self.log.debug('List-Match returns:',
                           str(ret['Value'][0]) + '\n'.join(ret['Value'][1:3]))
            self.log.info('List-Match found %s CEs for TaskQueue' % len(availableCEs), taskQueueID)
            self.log.verbose(', '.join(availableCEs))

        return availableCEs

    def parseJobSubmitStdout(self, proxy, cmd, taskQueueID, rb):
        """ Parse Job Submit stdout to return the pilot reference
        """
        start = time.time()
        self.log.verbose('Executing Job Submit for TaskQueue', taskQueueID)

        ret = executeGridCommand(proxy, cmd, self.gridEnv)
        if not ret['OK']:
            self.log.error('Failed to execute Job Submit:', ret['Message'])
            self.__sendErrorMail(rb, 'Job Submit', cmd, ret, proxy)
            return False
        if ret['Value'][0] != 0:
            self.log.error('Error executing Job Submit:',
                           str(ret['Value'][0]) + '\n'.join(ret['Value'][1:3]))
            self.__sendErrorMail(rb, 'Job Submit', cmd, ret, proxy)
            return False
        self.log.info('Job Submit Execution Time: %.2f for TaskQueue %d' %
                      ((time.time() - start), taskQueueID))

        stdout = ret['Value'][1]

        failed = 1
        rb = ''
        for line in List.fromChar(stdout, '\n'):
            m = re.search(r"(https:\S+)", line)
            if m:
                glite_id = m.group(1)
                if not rb:
                    m = re.search(r"https://(.+):.+", glite_id)
                    rb = m.group(1)
                failed = 0
        if failed:
            self.log.error('Job Submit returns no Reference:',
                           str(ret['Value'][0]) + '\n'.join(ret['Value'][1:3]))
            return False

        self.log.info('Reference %s for TaskQueue %s' % (glite_id, taskQueueID))

        return glite_id, rb

    def _writeJDL(self, filename, jdlList):
        try:
            with open(filename, 'w') as f:
                f.write('\n'.join(jdlList))
        except Exception as x:
            self.log.exception(x)
            return ''
        return filename

    def __sendErrorMail(self, rb, name, command, result, proxy):
        """ In case of an error with the RB/WMS:

            - check if the RB/WMS is still in use
            - remove the RB/WMS from the current list
            - check if the RB/WMS is not in the cache
            - add the RB/WMS to the cache
            - send an error mail
        """
        if rb in self.resourceBrokers:
            try:
                self.resourceBrokers.remove(rb)
                self.log.info('Removed RB from list', rb)
            except Exception:
                pass
        if not self.__failingWMSCache.exists(rb):
            self.__failingWMSCache.add(rb, self.errorClearTime)  # disable for 30 minutes
            mailAddress = self.errorMailAddress
            msg = ''
            if not result['OK']:
                subject = "%s: timeout executing %s" % (rb, name)
                msg += '\n%s' % result['Message']
            elif result['Value'][0] != 0:
                if re.search('the server is temporarily drained', ' '.join(result['Value'][1:3])):
                    return
                if re.search('System load is too high:', ' '.join(result['Value'][1:3])):
                    return
                subject = "%s: error executing %s" % (rb, name)
            else:
                return
            msg += ' '.join(command)
            msg += '\nreturns: %s\n' % str(result['Value'][0]) + '\n'.join(result['Value'][1:3])
            msg += '\nUsing Proxy:\n' + getProxyInfoAsString(proxy)['Value']
            # msg += '\nUsing Proxy:\n' + gProxyManager.
            ticketTime = self.errorClearTime + self.errorTicketTime
            if self.__ticketsWMSCache.exists(rb):
                # the RB was already detected failing a short time ago
                mailAddress = self.alarmMailAddress
                msg = 'Submit GGUS Ticket for this error if not already opened\n' + \
                      'It has been failing at least for %s hours\n' % (ticketTime / 60 / 60) + msg
            else:
                self.__ticketsWMSCache.add(rb, ticketTime)

            if mailAddress:
                result = NotificationClient().sendMail(mailAddress, subject, msg,
                                                       fromAddress=self.mailFromAddress)
                if not result['OK']:
                    self.log.error("Mail could not be sent")
        return