def getGlue2CEInfo(vo, host):
    """ call ldap for GLUE2 and get information

    :param str vo: Virtual Organisation
    :param str host: host to query for information
    :returns: result structure with result['Value'][siteID]['CEs'][ceID]['Queues'][queueName].
              For each siteID, ceID, queueName all the GLUE2 parameters are retrieved
    """
    # get all Policies allowing given VO
    filt = "(&(objectClass=GLUE2Policy)(|(GLUE2PolicyRule=VO:%s)(GLUE2PolicyRule=vo:%s)))" % (vo, vo)
    polRes = __ldapsearchBDII(filt=filt, attr=None, host=host, base="o=glue", selectionString="GLUE2")

    if not polRes['OK']:
        return S_ERROR("Failed to get policies for this VO")
    polRes = polRes['Value']

    sLog.notice("Found %s policies for this VO %s" % (len(polRes), vo))
    # get all shares for this policy:
    # create an or'ed list of all the shares and then call the search
    listOfSitesWithPolicies = set()
    shareFilter = ''
    for policyValues in polRes:
        # skip entries without GLUE2DomainID in the DN because we cannot associate them to a site
        if 'GLUE2DomainID' not in policyValues['attr']['dn']:
            continue
        shareID = policyValues['attr'].get('GLUE2MappingPolicyShareForeignKey', None)
        policyID = policyValues['attr']['GLUE2PolicyID']
        siteName = policyValues['attr']['dn'].split('GLUE2DomainID=')[1].split(',', 1)[0]
        listOfSitesWithPolicies.add(siteName)
        if shareID is None:  # policy not pointing to ComputingInformation
            sLog.debug("Policy %s does not point to computing information" % (policyID,))
            continue
        sLog.verbose("%s policy %s pointing to %s " % (siteName, policyID, shareID))
        sLog.debug("Policy values:\n%s" % pformat(policyValues))
        shareFilter += '(GLUE2ShareID=%s)' % shareID

    filt = '(&(objectClass=GLUE2Share)(|%s))' % shareFilter
    shareRes = __ldapsearchBDII(filt=filt, attr=None, host=host, base="o=glue", selectionString="GLUE2")
    if not shareRes['OK']:
        sLog.error("Could not get share information", shareRes['Message'])
        return shareRes
    shareInfoLists = {}
    for shareInfo in shareRes['Value']:
        if 'GLUE2DomainID' not in shareInfo['attr']['dn']:
            continue
        if 'GLUE2ComputingShare' not in shareInfo['objectClass']:
            sLog.debug('Share %r is not a ComputingShare: \n%s' % (shareID, pformat(shareInfo)))
            continue
        sLog.debug("Found computing share:\n%s" % pformat(shareInfo))
        siteName = shareInfo['attr']['dn'].split('GLUE2DomainID=')[1].split(',', 1)[0]
        shareInfoLists.setdefault(siteName, []).append(shareInfo['attr'])

    siteInfo = __getGlue2ShareInfo(host, shareInfoLists)
    if not siteInfo['OK']:
        sLog.error("Could not get CE info for", "%s: %s" % (shareID, siteInfo['Message']))
        return siteInfo
    siteDict = siteInfo['Value']
    sLog.debug("Found Sites:\n%s" % pformat(siteDict))
    sitesWithoutShares = set(siteDict) - listOfSitesWithPolicies
    if sitesWithoutShares:
        sLog.error("Found some sites without any shares", pformat(sitesWithoutShares))
    else:
        sLog.notice("Found information for all known sites")

    # remap siteDict to assign CEs to known sites, in case their names differ
    # from the "gocdb name" in the CS.
    newSiteDict = {}
    ceSiteMapping = getCESiteMapping().get('Value', {})
    # FIXME: pylint thinks siteDict is a tuple, so we cast
    for siteName, infoDict in dict(siteDict).items():
        for ce, ceInfo in infoDict.get('CEs', {}).items():
            ceSiteName = ceSiteMapping.get(ce, siteName)
            gocSiteName = getGOCSiteName(ceSiteName).get('Value', siteName)
            newSiteDict.setdefault(gocSiteName, {}).setdefault('CEs', {})[ce] = ceInfo

    return S_OK(newSiteDict)
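# The site name is parsed out of the GLUE2 DN by splitting on 'GLUE2DomainID='.
# A minimal standalone sketch of that parsing (hypothetical helper, not part of DIRAC):

```python
def site_from_dn(dn):
    """Extract the GLUE2 site (domain) name from a DN containing 'GLUE2DomainID=<site>,...'."""
    return dn.split('GLUE2DomainID=')[1].split(',', 1)[0]
```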
def ldapsearchBDII(filt=None, attr=None, host=None, base=None):
    """ Python wrapper for ldapsearch at a BDII host.

    :param filt: Filter used to search ldap, default = '', means select all
    :param attr: Attributes returned by ldapsearch, default = '*', means return all
    :param host: Host used for ldapsearch, default = 'lcg-bdii.cern.ch:2170', can be changed by $LCG_GFAL_INFOSYS
    :return: standard DIRAC answer with Value equal to the list of ldapsearch responses

    Each element of the list is a dictionary with keys:
      'dn': Distinguished name of ldapsearch response
      'objectClass': List of classes in response
      'attr': Dictionary of attributes
    """
    if filt is None:
        filt = ''
    if attr is None:
        attr = ''
    if host is None:
        host = 'lcg-bdii.cern.ch:2170'
    if base is None:
        base = 'Mds-Vo-name=local,o=grid'

    if isinstance(attr, list):
        attr = ' '.join(attr)

    cmd = 'ldapsearch -x -LLL -o ldif-wrap=no -h %s -b %s "%s" %s' % (host, base, filt, attr)
    result = shellCall(0, cmd)

    response = []

    if not result['OK']:
        return result

    status = result['Value'][0]
    stdout = result['Value'][1]
    stderr = result['Value'][2]

    if status != 0:
        return S_ERROR(stderr)

    # Join LDIF continuation lines (which begin with a space) onto the previous line
    lines = []
    for line in stdout.split("\n"):
        if line.startswith(" "):
            lines[-1] += line.strip()
        else:
            lines.append(line.strip())

    record = None
    for line in lines:
        if line.startswith('dn:'):
            record = {
                'dn': line.replace('dn:', '').strip(),
                'objectClass': [],
                'attr': {'dn': line.replace('dn:', '').strip()}
            }
            response.append(record)
            continue
        if record:
            if line.startswith('objectClass:'):
                record['objectClass'].append(line.replace('objectClass:', '').strip())
                continue
            if line.startswith('Glue'):
                index = line.find(':')
                if index > 0:
                    attr = line[:index]
                    value = line[index + 1:].strip()
                    if attr in record['attr']:
                        if isinstance(record['attr'][attr], list):
                            record['attr'][attr].append(value)
                        else:
                            record['attr'][attr] = [record['attr'][attr], value]
                    else:
                        record['attr'][attr] = value

    return S_OK(response)
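# The LDIF continuation-line handling above can be exercised in isolation.
# A minimal sketch of that unwrapping step (hypothetical helper, not DIRAC API):

```python
def unwrap_ldif(text):
    """Join LDIF continuation lines (lines starting with a space) onto the previous line."""
    lines = []
    for line in text.split("\n"):
        if line.startswith(" ") and lines:
            lines[-1] += line.strip()
        else:
            lines.append(line.strip())
    return lines
```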
def submit(self, context=None, ftsServer=None, ucert=None, pinTime=36000, protocols=None):
    """ submit the job to the FTS server

    Some attributes are expected to be defined for the submission to work:
      * type (set by FTS3Operation)
      * sourceSE (only for Transfer jobs)
      * targetSE
      * activity (optional)
      * priority (optional)
      * username
      * userGroup
      * filesToSubmit
      * operationID (optional, used as metadata for the job)

    We also expect the FTSFiles to have an ID defined, as it is given as transfer metadata

    :param pinTime: Time the file should be pinned on disk (used for transfers and staging)
                    Used only if the source SE is a tape storage
    :param context: fts3 context. If not given, it is created (see ftsServer & ucert param)
    :param ftsServer: the address of the fts server to submit to. Used only if context is
                      not given. If not given either, use the ftsServer object attribute
    :param ucert: path to the user certificate/proxy. Might be inferred by the fts cli (see its doc)
    :param protocols: list of protocols from which we should choose the protocol to use

    :returns: S_OK([FTSFiles ids of files submitted])
    """
    log = gLogger.getSubLogger("submit/%s/%s_%s" % (self.operationID, self.sourceSE, self.targetSE), True)

    if not context:
        if not ftsServer:
            ftsServer = self.ftsServer
        context = fts3.Context(endpoint=ftsServer, ucert=ucert, request_class=ftsSSLRequest, verify=False)

    # Construct the target SURL
    res = self.__fetchSpaceToken(self.targetSE, self.vo)
    if not res['OK']:
        return res
    target_spacetoken = res['Value']

    allLFNs = [ftsFile.lfn for ftsFile in self.filesToSubmit]

    if self.type == 'Transfer':
        res = self._constructTransferJob(pinTime, allLFNs, target_spacetoken, protocols=protocols)
    elif self.type == 'Staging':
        res = self._constructStagingJob(pinTime, allLFNs, target_spacetoken)
    # elif self.type == 'Removal':
    #     res = self._constructRemovalJob(context, allLFNs, failedLFNs, target_spacetoken)

    if not res['OK']:
        return res

    job, fileIDsInTheJob = res['Value']
    setFileIdsInTheJob = set(fileIDsInTheJob)

    try:
        self.ftsGUID = fts3.submit(context, job)
        log.info("Got GUID %s" % self.ftsGUID)

        # Only increase the attempt counter if we succeeded in submitting
        for ftsFile in self.filesToSubmit:
            ftsFile.attempt += 1
            # This should never happen because a file should be "released"
            # first by the previous job, but we print a warning just in case
            if ftsFile.ftsGUID is not None:
                log.warn("FTSFile has a non NULL ftsGUID at job submission time",
                         "FileID: %s existing ftsGUID: %s" % (ftsFile.fileID, ftsFile.ftsGUID))

            # `assign` the file to this job
            ftsFile.ftsGUID = self.ftsGUID
            if ftsFile.fileID in setFileIdsInTheJob:
                ftsFile.status = 'Submitted'

        now = datetime.datetime.utcnow().replace(microsecond=0)
        self.submitTime = now
        self.lastUpdate = now
        self.lastMonitor = now

    except FTS3ClientException as e:
        log.exception("Error at submission", repr(e))
        return S_ERROR("Error at submission: %s" % e)

    return S_OK(fileIDsInTheJob)
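# The post-submission bookkeeping (assign the job GUID, flag files actually in the
# job as Submitted) can be sketched with a minimal stand-in object. `FakeFTSFile`
# and `mark_submitted` are hypothetical names for illustration only:

```python
class FakeFTSFile(object):
    """Minimal stand-in for an FTSFile: just the fields touched at submission time."""

    def __init__(self, fileID):
        self.fileID = fileID
        self.ftsGUID = None
        self.status = 'Ready'
        self.attempt = 0


def mark_submitted(files, guid, submitted_ids):
    """Assign the job GUID to every file; flag those actually in the job as Submitted."""
    submitted = set(submitted_ids)
    for f in files:
        f.attempt += 1
        f.ftsGUID = guid
        if f.fileID in submitted:
            f.status = 'Submitted'
```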
def getConfigurationTree(self, root='', *filters):
    """ Create a dictionary with all sections, subsections and options
    starting from given root. Result can be filtered.

    :param str root: Starting point in the configuration tree.
    :param filters: Select results that contain given substrings (check full path, i.e. with option name)
    :type filters: str or python:list[str]
    :return: S_OK(dict)/S_ERROR() -- dictionary where keys are paths taken from
             the configuration (e.g. /Systems/Configuration/...).
             Value is "None" when the path points to a section,
             and not "None" when the path points to an option.
    """
    # check if root is an option (special case)
    option = self.getOption(root)
    if option['OK']:
        result = {root: option['Value']}

    else:
        result = {root: None}
        for substr in filters:
            if substr not in root:
                result = {}
                break

        # remove slashes at the end
        root = root.rstrip('/')

        # get options of current root
        options = self.getOptionsDict(root)
        if not options['OK']:
            return S_ERROR("getOptionsDict() failed with message: %s" % options['Message'])

        for key, value in options['Value'].items():
            path = cfgPath(root, key)
            addOption = True
            for substr in filters:
                if substr not in path:
                    addOption = False
                    break

            if addOption:
                result[path] = value

        # get subsections of the root
        sections = self.getSections(root)
        if not sections['OK']:
            return S_ERROR("getSections() failed with message: %s" % sections['Message'])

        # recursively go through subsections and get their subsections
        for section in sections['Value']:
            subtree = self.getConfigurationTree("%s/%s" % (root, section), *filters)
            if not subtree['OK']:
                return S_ERROR("getConfigurationTree() failed with message: %s" % subtree['Message'])
            result.update(subtree['Value'])

    return S_OK(result)
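# The filtering rule above keeps a path only when every filter substring occurs in it.
# A minimal sketch of that rule over a flat path->value dict (`filter_tree` is a
# hypothetical helper, not part of the DIRAC API):

```python
def filter_tree(tree, *filters):
    """Keep only paths that contain every filter substring."""
    return {path: value for path, value in tree.items()
            if all(substr in path for substr in filters)}
```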
                return S_ERROR("Read limit exceeded (%s chars)" % maxBufferSize)
            # Data is here! take it out of the bytestream, decode and return
            data = self.byteStream[:size]
            self.byteStream = self.byteStream[size:]
            try:
                data = DEncode.decode(data)[0]
            except Exception as e:
                return S_ERROR("Could not decode received data: %s" % str(e))
            if idleReceive:
                self.receivedMessages.append(data)
                return S_OK()
            return data
        except Exception as e:
            gLogger.exception("Network error while receiving data")
            return S_ERROR("Network error while receiving data: %s" % str(e))

    def __processKeepAlive(self, maxBufferSize, blockAfterKeepAlive=True):
        gLogger.debug("Received Keep Alive")
        # Next message down the stream will be the ka data
        result = self.receiveData(maxBufferSize, blockAfterKeepAlive=False)
        if not result['OK']:
            gLogger.debug("Error while receiving keep alive: %s" % result['Message'])
            return result
        # Is it a valid ka?
        kaData = result['Value']
        for reqField in ('id', 'kaping'):
            if reqField not in kaData:
                errMsg = "Invalid keep alive, missing %s" % reqField
                gLogger.debug(errMsg)
def transfer_toClient(self, fileId, token, fileHelper):  # pylint: disable=unused-argument
    return S_ERROR("This server does not allow sending files")
def transfer_bulkToClient(self, bulkId, token, fileHelper):  # pylint: disable=unused-argument
    return S_ERROR("This server does not allow bulk sending")
def __executeMethod(self, lfn, *args, **kwargs):
    """ Forward the call to each storage in turn until one works.
    The method to be executed is stored in self.methodName

    :param lfn: string, list or dictionary
    :param args: variable amount of non-keyword arguments. SHOULD BE EMPTY
    :param kwargs: keyword arguments

    :returns: S_OK( { 'Failed': {lfn : reason} , 'Successful': {lfn : value} } )
              The Failed dict contains the lfn only if the operation failed on all the storages
              The Successful dict contains the value returned by the successful storages.
    """
    removedArgs = {}
    log = self.log.getSubLogger('__executeMethod')
    log.verbose("preparing the execution of %s" % (self.methodName))

    # args should normally be empty to avoid problems...
    if len(args):
        log.verbose("args should be empty! %s" % args)
        # because there is normally only one kw argument, we can move it from args to kwargs
        methDefaultArgs = StorageElementItem.__defaultsArguments.get(self.methodName, {}).keys()
        if len(methDefaultArgs):
            kwargs[methDefaultArgs[0]] = args[0]
            args = args[1:]
        log.verbose("put it in kwargs, but dirty and might be dangerous! args %s kwargs %s" % (args, kwargs))

    # We check the deprecated arguments
    for depArg in StorageElementItem.__deprecatedArguments:
        if depArg in kwargs:
            log.verbose("%s is not an allowed argument anymore. Please change your code!" % depArg)
            removedArgs[depArg] = kwargs[depArg]
            del kwargs[depArg]

    # Set default argument if any
    methDefaultArgs = StorageElementItem.__defaultsArguments.get(self.methodName, {})
    for argName in methDefaultArgs:
        if argName not in kwargs:
            log.debug("default argument %s for %s not present. Setting value %s." %
                      (argName, self.methodName, methDefaultArgs[argName]))
            kwargs[argName] = methDefaultArgs[argName]

    res = checkArgumentFormat(lfn)
    if not res['OK']:
        errStr = "Supplied lfns must be string, list of strings or a dictionary."
        log.debug(errStr)
        return S_ERROR(errStr)
    lfnDict = res['Value']

    log.verbose("Attempting to perform '%s' operation with %s lfns." % (self.methodName, len(lfnDict)))

    res = self.isValid(operation=self.methodName)
    if not res['OK']:
        return res
    if not self.valid:
        return S_ERROR(self.errorReason)

    successful = {}
    failed = {}
    localSE = self.__isLocalSE()['Value']

    # Try all of the storages one by one
    for storage in self.storages:
        # Determine whether to use this storage object
        storageParameters = storage.getParameters()
        if not storageParameters:
            log.debug("Failed to get storage parameters.", "%s %s" % (self.name, res['Message']))
            continue
        pluginName = storageParameters['PluginName']
        if not lfnDict:
            log.debug("No lfns to be attempted for %s protocol." % pluginName)
            continue
        if pluginName not in self.remotePlugins and not localSE and storage.pluginName != "Proxy":
            # If the SE is not local then we can't use local protocols
            log.debug("Local protocol not appropriate for remote use: %s." % pluginName)
            continue

        log.verbose("Generating %s protocol URLs for %s." % (len(lfnDict), pluginName))
        replicaDict = kwargs.pop('replicaDict', {})
        if storage.pluginName != "Proxy":
            res = self.__generateURLDict(lfnDict, storage, replicaDict=replicaDict)
            urlDict = res['Value']['Successful']  # url : lfn
            failed.update(res['Value']['Failed'])
        else:
            urlDict = dict([(lfn, lfn) for lfn in lfnDict])

        if not len(urlDict):
            log.verbose("__executeMethod No urls generated for protocol %s." % pluginName)
        else:
            log.verbose("Attempting to perform '%s' for %s physical files" % (self.methodName, len(urlDict)))
            fcn = None
            if hasattr(storage, self.methodName) and callable(getattr(storage, self.methodName)):
                fcn = getattr(storage, self.methodName)
            if not fcn:
                return S_ERROR("SE.__executeMethod: unable to invoke %s, it isn't a member function of storage" %
                               self.methodName)
            urlsToUse = {}  # url : the value of the lfn dictionary for the lfn of this url
            for url in urlDict:
                urlsToUse[url] = lfnDict[urlDict[url]]

            res = fcn(urlsToUse, *args, **kwargs)
            if not res['OK']:
                errStr = "Completely failed to perform %s." % self.methodName
                log.debug(errStr, 'with plugin %s: %s' % (pluginName, res['Message']))
                for lfn in urlDict.values():
                    if lfn not in failed:
                        failed[lfn] = ''
                    failed[lfn] = "%s %s" % (failed[lfn], res['Message']) if failed[lfn] else res['Message']
            else:
                for url, lfn in urlDict.items():
                    if url not in res['Value']['Successful']:
                        if lfn not in failed:
                            failed[lfn] = ''
                        if url in res['Value']['Failed']:
                            failed[lfn] = "%s %s" % (failed[lfn], res['Value']['Failed'][url]) \
                                if failed[lfn] else res['Value']['Failed'][url]
                        else:
                            errStr = 'No error returned from plug-in'
                            failed[lfn] = "%s %s" % (failed[lfn], errStr) if failed[lfn] else errStr
                    else:
                        successful[lfn] = res['Value']['Successful'][url]
                        if lfn in failed:
                            failed.pop(lfn)
                        lfnDict.pop(lfn)

    return S_OK({'Failed': failed, 'Successful': successful})
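# The per-LFN failure bookkeeping above appends each new reason to any previous one,
# space-separated. A minimal sketch of that accumulation pattern (`append_reason`
# is a hypothetical helper, for illustration only):

```python
def append_reason(failed, lfn, reason):
    """Accumulate failure reasons per LFN, space-separated, as __executeMethod does."""
    failed[lfn] = "%s %s" % (failed[lfn], reason) if failed.get(lfn) else reason
    return failed
```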
def isValid(self, operation=''):
    """ check CS/RSS statuses for :operation:

    :param str operation: operation name
    """
    log = self.log.getSubLogger('isValid', True)
    log.verbose("Determining if the StorageElement %s is valid for VO %s" % (self.name, self.vo))

    if not self.valid:
        log.debug("Failed to create StorageElement plugins.", self.errorReason)
        return S_ERROR("SE.isValid: Failed to create StorageElement plugins: %s" % self.errorReason)

    # Check if the Storage Element is eligible for the user's VO
    if 'VO' in self.options and self.vo not in self.options['VO']:
        log.debug("StorageElement is not allowed for VO", self.vo)
        return S_ERROR("SE.isValid: StorageElement is not allowed for VO.")

    log.verbose("Determining if the StorageElement %s is valid for %s" % (self.name, operation))
    if (not operation) or (operation in self.okMethods):
        return S_OK()

    # Determine whether the StorageElement is valid for checking, reading, writing
    res = self.getStatus()
    if not res['OK']:
        log.debug("Could not call getStatus", res['Message'])
        return S_ERROR("SE.isValid could not call the getStatus method")
    checking = res['Value']['Check']
    reading = res['Value']['Read']
    writing = res['Value']['Write']
    removing = res['Value']['Remove']

    # Determine whether the requested operation can be fulfilled
    if (not operation) and (not reading) and (not writing) and (not checking):
        log.debug("Read, write and check access not permitted.")
        return S_ERROR("SE.isValid: Read, write and check access not permitted.")

    # The supplied operation can be 'Read', 'Write' or any of the possible StorageElement methods.
    if (operation in self.readMethods) or (operation.lower() in ('read', 'readaccess')):
        operation = 'ReadAccess'
    elif operation in self.writeMethods or (operation.lower() in ('write', 'writeaccess')):
        operation = 'WriteAccess'
    elif operation in self.removeMethods or (operation.lower() in ('remove', 'removeaccess')):
        operation = 'RemoveAccess'
    elif operation in self.checkMethods or (operation.lower() in ('check', 'checkaccess')):
        operation = 'CheckAccess'
    else:
        log.debug("The supplied operation is not known.", operation)
        return S_ERROR("SE.isValid: The supplied operation is not known.")

    log.debug("check the operation: %s " % operation)

    # Check if the operation is valid
    if operation == 'CheckAccess':
        if not reading:
            if not checking:
                log.debug("Check access not currently permitted.")
                return S_ERROR("SE.isValid: Check access not currently permitted.")
    if operation == 'ReadAccess':
        if not reading:
            log.debug("Read access not currently permitted.")
            return S_ERROR("SE.isValid: Read access not currently permitted.")
    if operation == 'WriteAccess':
        if not writing:
            log.debug("Write access not currently permitted.")
            return S_ERROR("SE.isValid: Write access not currently permitted.")
    if operation == 'RemoveAccess':
        if not removing:
            log.debug("Remove access not currently permitted.")
            return S_ERROR("SE.isValid: Remove access not currently permitted.")
    return S_OK()
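# The operation-name normalization above can be sketched as a standalone function.
# `normalize_operation` and its keyword parameters are hypothetical names that mirror
# the branching logic, not the DIRAC API:

```python
def normalize_operation(operation, read_methods=(), write_methods=(),
                        remove_methods=(), check_methods=()):
    """Map an operation name (or a method name) to its access class, or None if unknown."""
    if operation in read_methods or operation.lower() in ('read', 'readaccess'):
        return 'ReadAccess'
    if operation in write_methods or operation.lower() in ('write', 'writeaccess'):
        return 'WriteAccess'
    if operation in remove_methods or operation.lower() in ('remove', 'removeaccess'):
        return 'RemoveAccess'
    if operation in check_methods or operation.lower() in ('check', 'checkaccess'):
        return 'CheckAccess'
    return None
```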
def __doFileTransfer(self, sDirection):
    """ Execute a file transfer action

    :type sDirection: string
    :param sDirection: Direction of the transfer
    :return: S_OK/S_ERROR
    """
    retVal = self.__trPool.receive(self.__trid)
    if not retVal['OK']:
        raise RequestHandler.ConnectionError(
            "Error while receiving file description %s %s" %
            (self.srv_getFormattedRemoteCredentials(), retVal['Message']))
    fileInfo = retVal['Value']
    sDirection = "%s%s" % (sDirection[0].lower(), sDirection[1:])
    if "transfer_%s" % sDirection not in dir(self):
        self.__trPool.send(self.__trid, S_ERROR("Service can't transfer files %s" % sDirection))
        return
    retVal = self.__trPool.send(self.__trid, S_OK("Accepted"))
    if not retVal['OK']:
        return retVal
    self.__logRemoteQuery("FileTransfer/%s" % sDirection, fileInfo)

    self.__lockManager.lock("FileTransfer/%s" % sDirection)
    try:
        try:
            fileHelper = FileHelper(self.__trPool.get(self.__trid))
            if sDirection == "fromClient":
                fileHelper.setDirection("fromClient")
                uRetVal = self.transfer_fromClient(fileInfo[0], fileInfo[1], fileInfo[2], fileHelper)
            elif sDirection == "toClient":
                fileHelper.setDirection("toClient")
                uRetVal = self.transfer_toClient(fileInfo[0], fileInfo[1], fileHelper)
            elif sDirection == "bulkFromClient":
                fileHelper.setDirection("fromClient")
                uRetVal = self.transfer_bulkFromClient(fileInfo[0], fileInfo[1], fileInfo[2], fileHelper)
            elif sDirection == "bulkToClient":
                fileHelper.setDirection("toClient")
                uRetVal = self.transfer_bulkToClient(fileInfo[0], fileInfo[1], fileHelper)
            elif sDirection == "listBulk":
                fileHelper.setDirection("toClient")
                uRetVal = self.transfer_listBulk(fileInfo[0], fileInfo[1], fileHelper)
            else:
                return S_ERROR("Direction %s does not exist!!!" % sDirection)
            if uRetVal['OK'] and not fileHelper.finishedTransmission():
                gLogger.error("You haven't finished receiving/sending the file", str(fileInfo))
                return S_ERROR("Incomplete transfer")
            del fileHelper
            return uRetVal
        finally:
            self.__lockManager.unlock("FileTransfer/%s" % sDirection)
    except Exception as e:  # pylint: disable=broad-except
        gLogger.exception("Uncaught exception when serving Transfer", "%s" % sDirection, lException=e)
        return S_ERROR("Server error while serving %s: %s" % (sDirection, repr(e)))
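# The direction name is normalized by lower-casing only its first character before
# being used to build the "transfer_<direction>" handler name. A minimal sketch of
# that step (`lower_first` is a hypothetical helper, for illustration only):

```python
def lower_first(s):
    """Lower-case only the first character of a string, as done for the transfer direction."""
    return "%s%s" % (s[0].lower(), s[1:]) if s else s
```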
class RequestHandler(object):

    class ConnectionError(Exception):

        def __init__(self, msg):
            self.__msg = msg

        def __str__(self):
            return "ConnectionError: %s" % self.__msg

    def __init__(self, handlerInitDict, trid):
        """ Constructor

        :type handlerInitDict: dictionary
        :param handlerInitDict: Information vars for the service
        :type trid: object
        :param trid: Transport to use
        """
        # Initially serviceInfoDict is the base one for the RequestHandler,
        # the one created in _rh_initializeClass
        handlerInitDict.update(self.__srvInfoDict)
        self.serviceInfoDict = handlerInitDict
        self.__trid = trid

    def initialize(self):
        """Initialize this instance of the handler (to be overwritten)
        """
        pass

    @classmethod
    def _rh__initializeClass(cls, serviceInfoDict, lockManager, msgBroker, monitor):
        """ Class initialization (not to be called by hand or overwritten!!)

        :type serviceInfoDict: dictionary
        :param serviceInfoDict: Information vars for the service
        :type msgBroker: object
        :param msgBroker: Message delivery
        :type lockManager: object
        :param lockManager: Lock manager to use
        """
        cls.__srvInfoDict = serviceInfoDict
        cls.__svcName = cls.__srvInfoDict['serviceName']
        cls.__lockManager = lockManager
        cls.__msgBroker = msgBroker
        cls.__trPool = msgBroker.getTransportPool()
        cls.__monitor = monitor
        cls.log = gLogger

    def getRemoteAddress(self):
        """ Get the address of the remote peer.

        :return: Address of remote peer.
        """
        return self.__trPool.get(self.__trid).getRemoteAddress()

    def getRemoteCredentials(self):
        """ Get the credentials of the remote peer.

        :return: Credentials dictionary of remote peer.
        """
        return self.__trPool.get(self.__trid).getConnectingCredentials()

    @classmethod
    def getCSOption(cls, optionName, defaultValue=False):
        """ Get an option from the CS section of the services

        :return: Value for serviceSection/optionName in the CS, defaultValue being the default
        """
        return cls.srv_getCSOption(optionName, defaultValue)

    def _rh_executeAction(self, proposalTuple):
        """ Execute an action.

        :type proposalTuple: tuple
        :param proposalTuple: Type of action to execute. First position of the tuple must be the
                              type of action to execute. The second position is the action itself.
        """
        actionTuple = proposalTuple[1]
        gLogger.debug("Executing %s:%s action" % actionTuple)
        startTime = time.time()
        actionType = actionTuple[0]
        self.serviceInfoDict['actionTuple'] = actionTuple
        try:
            if actionType == "RPC":
                retVal = self.__doRPC(actionTuple[1])
            elif actionType == "FileTransfer":
                retVal = self.__doFileTransfer(actionTuple[1])
            elif actionType == "Connection":
                retVal = self.__doConnection(actionTuple[1])
            else:
                return S_ERROR("Unknown action %s" % actionType)
        except RequestHandler.ConnectionError as excp:
            gLogger.error("ConnectionError", str(excp))
            return S_ERROR(excp)
        if not isReturnStructure(retVal):
            message = "Method %s for action %s does not return a S_OK/S_ERROR!" % (actionTuple[1], actionTuple[0])
            gLogger.error(message)
            retVal = S_ERROR(message)
        self.__logRemoteQueryResponse(retVal, time.time() - startTime)
        return self.__trPool.send(self.__trid, retVal)
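# _rh_executeAction rejects any handler result that does not follow the S_OK/S_ERROR
# convention (checked via isReturnStructure). A minimal sketch of that convention with
# hand-rolled stand-ins (s_ok/s_error/is_return_structure are illustrative names, not
# the DIRAC imports):

```python
def s_ok(value=None):
    """Minimal stand-in for DIRAC's S_OK."""
    return {'OK': True, 'Value': value}


def s_error(message):
    """Minimal stand-in for DIRAC's S_ERROR."""
    return {'OK': False, 'Message': message}


def is_return_structure(obj):
    """Check the S_OK/S_ERROR convention, as _rh_executeAction does with isReturnStructure."""
    if not isinstance(obj, dict) or 'OK' not in obj:
        return False
    return 'Value' in obj if obj['OK'] else 'Message' in obj
```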
def transfer_fromClient(self, fileId, token, fileSize, fileHelper):  # pylint: disable=unused-argument
    return S_ERROR("This server does not allow receiving files")
def __findServiceURL(self):
    """ Discovers the URL of a service, taking into account gateways, multiple URLs, banned URLs

    If the site on which we run is configured to use gateways (/DIRAC/Gateways/<siteName>),
    these URLs will be used. To ignore the gateway, it is possible to set KW_IGNORE_GATEWAYS
    to False in kwargs.

    If self._destinationSrv (given as constructor attribute) is a properly formed URL,
    we just return this one. If we have to use a gateway, we just replace the server
    name in the url.

    The list of URLs defined in the CS (<System>/URLs/<Component>) is randomized

    This method also sets some attributes:
      * self.__nbOfUrls = number of URLs
      * self.__nbOfRetry = 2 if we have more than 2 urls, otherwise 3
      * self.__bannedUrls is reinitialized if all the URLs are banned

    :return: S_OK(str)/S_ERROR() -- the selected URL
    """
    if not self.__initStatus['OK']:
        return self.__initStatus

    # Load the Gateways URLs for the current site Name
    gatewayURL = False
    if not self.kwargs.get(self.KW_IGNORE_GATEWAYS):
        dRetVal = gConfig.getOption("/DIRAC/Gateways/%s" % DIRAC.siteName())
        if dRetVal['OK']:
            rawGatewayURL = List.randomize(List.fromChar(dRetVal['Value'], ","))[0]
            gatewayURL = "/".join(rawGatewayURL.split("/")[:3])

    # If what was given as constructor attribute is a properly formed URL,
    # we just return this one.
    # If we have to use a gateway, we just replace the server name in it
    for protocol in gProtocolDict:
        if self._destinationSrv.find("%s://" % protocol) == 0:
            gLogger.debug("Already given a valid url", self._destinationSrv)
            if not gatewayURL:
                return S_OK(self._destinationSrv)
            gLogger.debug("Reconstructing given URL to pass through gateway")
            path = "/".join(self._destinationSrv.split("/")[3:])
            finalURL = "%s/%s" % (gatewayURL, path)
            gLogger.debug("Gateway URL conversion:\n %s -> %s" % (self._destinationSrv, finalURL))
            return S_OK(finalURL)

    if gatewayURL:
        gLogger.debug("Using gateway", gatewayURL)
        return S_OK("%s/%s" % (gatewayURL, self._destinationSrv))

    # We extract the list of URLs from the CS (System/URLs/Component)
    try:
        urls = getServiceURL(self._destinationSrv, setup=self.setup)
    except Exception as e:
        return S_ERROR("Cannot get URL for %s in setup %s: %s" % (self._destinationSrv, self.setup, repr(e)))
    if not urls:
        return S_ERROR("URL for service %s not found" % self._destinationSrv)

    failoverUrls = []
    # Check if there are some failover URLs to use as last resort
    try:
        failoverUrlsStr = getServiceFailoverURL(self._destinationSrv, setup=self.setup)
        if failoverUrlsStr:
            failoverUrls = failoverUrlsStr.split(',')
    except Exception:
        pass

    # We randomize the list, and add at the end the failover URLs (System/FailoverURLs/Component)
    urlsList = List.randomize(List.fromChar(urls, ",")) + failoverUrls

    self.__nbOfUrls = len(urlsList)
    # we retry 2 times all services, if we run more than 2 services
    self.__nbOfRetry = 2 if self.__nbOfUrls > 2 else 3
    if self.__nbOfUrls == len(self.__bannedUrls):
        self.__bannedUrls = []  # retry all urls
        gLogger.debug("Retrying again all URLs")

    if len(self.__bannedUrls) > 0 and len(urlsList) > 1:
        # we have a host which is not accessible. We remove that host from the list.
        # We only remove if we have more than one instance
        for i in self.__bannedUrls:
            gLogger.debug("Removing banned URL", "%s" % i)
            urlsList.remove(i)

    # Take the first URL from the list
    sURL = urlsList[0]

    # If we have banned URLs, and several URLs at our disposal, we make sure that the selected sURL
    # is not on a host which is banned. If it is, we take the next one in the list using __selectUrl
    if len(self.__bannedUrls) > 0 and self.__nbOfUrls > 2:
        # when we have multiple services then we can have a situation when two services
        # are running on the same machine with different ports...
        retVal = Network.splitURL(sURL)
        nexturl = None
        if retVal['OK']:
            nexturl = retVal['Value']
            found = False
            for i in self.__bannedUrls:
                retVal = Network.splitURL(i)
                if retVal['OK']:
                    bannedurl = retVal['Value']
                else:
                    break
                # We found a banned URL on the same host as the one we are running on
                if nexturl[1] == bannedurl[1]:
                    found = True
                    break
            if found:
                nexturl = self.__selectUrl(nexturl, urlsList[1:])
                if nexturl:  # a url was found which is on a different host
                    sURL = nexturl

    gLogger.debug("Discovering URL for service", "%s -> %s" % (self._destinationSrv, sURL))
    return S_OK(sURL)
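# The banned-URL avoidance above boils down to: skip candidate URLs whose host:port
# matches any banned URL's host:port. A minimal standalone sketch of that selection
# (host_of/pick_url are hypothetical helpers, not the DIRAC Network utilities):

```python
def host_of(url):
    """Return the host:port part of a scheme://host:port/path URL."""
    return url.split("://", 1)[1].split("/", 1)[0]


def pick_url(urls, banned):
    """Return the first URL not hosted on a banned URL's host, or None if all are banned."""
    banned_hosts = set(host_of(u) for u in banned)
    for u in urls:
        if host_of(u) not in banned_hosts:
            return u
    return None
```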
def transfer_bulkFromClient(self, bulkId, token, bulkSize, fileHelper):  # pylint: disable=unused-argument
    return S_ERROR("This server does not allow bulk receiving")
def _connect(self):
    """ Establish the connection.
    It uses the URL discovered in __discoverURL.
    In case the connection cannot be established, __discoverURL
    is called again, and _connect calls itself.
    We stop after trying self.__nbOfRetry * self.__nbOfUrls

    :return: S_OK()/S_ERROR()
    """
    # Check if the useServerCertificate configuration changed
    # Note: it is not entirely clear that this block makes sense,
    # since all these variables are evaluated in __discoverCredentialsToUse
    if gConfig.useServerCertificate() != self.__useCertificates:
        if self.__forceUseCertificates is None:
            self.__useCertificates = gConfig.useServerCertificate()
            self.kwargs[self.KW_USE_CERTIFICATES] = self.__useCertificates
            # The server certificate use context changed, rechecking the transport sanity
            result = self.__checkTransportSanity()
            if not result['OK']:
                return result

    # Take all the extra credentials
    self.__discoverExtraCredentials()
    if not self.__initStatus['OK']:
        return self.__initStatus
    if self.__enableThreadCheck:
        self.__checkThreadID()

    gLogger.debug("Trying to connect to: %s" % self.serviceURL)
    try:
        # Calls the transport method of the appropriate protocol.
        # self.__URLTuple[1:3] = [server name, port, System/Component]
        transport = gProtocolDict[self.__URLTuple[0]]['transport'](self.__URLTuple[1:3], **self.kwargs)
        # the socket timeout is the default value which is 1.
        # later we increase it to 5
        retVal = transport.initAsClient()
        # We try each URL at most __nbOfRetry times
        if not retVal['OK']:
            gLogger.warn("Issue getting socket:",
                         "%s : %s : %s" % (transport, self.__URLTuple, retVal['Message']))
            if self.__retry < self.__nbOfRetry * self.__nbOfUrls - 1:
                # Recompose the URL (why not using self.serviceURL ?)
                url = "%s://%s:%d/%s" % (self.__URLTuple[0], self.__URLTuple[1],
                                         int(self.__URLTuple[2]), self.__URLTuple[3])
                # Add the url to the list of banned URLs if it is not already there
                if url not in self.__bannedUrls:
                    gLogger.warn("Non-responding URL temporarily banned", "%s" % url)
                    self.__bannedUrls += [url]
                # Increment the retry counter
                self.__retry += 1
                # 16.07.20 CHRIS: I guess this setSocketTimeout does not behave as expected.
                # If the initAsClient did not work, we anyway re-enter the whole method,
                # so a new transport object is created.
                # However, it might be that this timeout value was propagated down to the
                # SocketInfoFactory singleton, and thus used, but that means that the timeout
                # specified in parameter was then void.
                # If it is our last attempt for each URL, we increase the timeout
                if self.__retryCounter == self.__nbOfRetry - 1:
                    # we increase the socket timeout in case the network is not good
                    transport.setSocketTimeout(5)
                gLogger.info("Retry connection", ": %d to %s" % (self.__retry, self.serviceURL))
                # If we tried all the URLs, we increase the global counter (__retryCounter) and sleep
                if len(self.__bannedUrls) == self.__nbOfUrls:
                    self.__retryCounter += 1
                    # if we run only one service, we increase the retry delay
                    self.__retryDelay = 3. / self.__nbOfUrls if self.__nbOfUrls > 1 else 2
                    gLogger.info("Waiting %f seconds before retry all service(s)" % self.__retryDelay)
                    time.sleep(self.__retryDelay)
                # rediscover the URL
                self.__discoverURL()
                # try to reconnect
                return self._connect()
            else:
                return retVal
    except Exception as e:
        gLogger.exception(lException=True, lExcInfo=True)
        return S_ERROR("Can't connect to %s: %s" % (self.serviceURL, repr(e)))

    # We add the connection to the transport pool
    gLogger.debug("Connected to: %s" % self.serviceURL)
    trid = getGlobalTransportPool().add(transport)
    return S_OK((trid, transport))
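# The retry arithmetic used by __findServiceURL and _connect can be isolated for
# inspection: 2 retries per URL when more than 2 URLs are configured, 3 otherwise;
# the inter-round delay is 3/nUrls seconds for multiple URLs, 2 seconds for a single
# one. Hypothetical helpers mirroring those two formulas:

```python
def n_retries(n_urls):
    """Retries per URL: 2 when more than 2 URLs are configured, otherwise 3."""
    return 2 if n_urls > 2 else 3


def retry_delay(n_urls):
    """Delay (seconds) before retrying all services after a full round of failures."""
    return 3. / n_urls if n_urls > 1 else 2
```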
def transfer_listBulk(self, bulkId, token, fileHelper):  # pylint: disable=unused-argument
    return S_ERROR("This server does not allow bulk listing")
def generateParametricJobs(jobClassAd):
    """Generate a series of ClassAd job descriptions expanding job parameters

    :param jobClassAd: ClassAd job description object
    :return: list of ClassAd job description objects
    """
    if not jobClassAd.lookupAttribute('Parameters'):
        return S_OK([jobClassAd.asJDL()])

    result = getParameterVectorLength(jobClassAd)
    if not result['OK']:
        return result
    nParValues = result['Value']
    if nParValues is None:
        return S_ERROR(EWMSJDL, 'Can not determine the number of job parameters')

    parameterDict = {}
    attributes = jobClassAd.getAttributes()
    for attribute in attributes:
        for key in ['Parameters', 'ParameterStart', 'ParameterStep', 'ParameterFactor']:
            if attribute.startswith(key):
                seqID = '0' if '.' not in attribute else attribute.split('.')[1]
                parameterDict.setdefault(seqID, {})
                if key == 'Parameters':
                    if jobClassAd.isAttributeList(attribute):
                        parList = jobClassAd.getListFromExpression(attribute)
                        if len(parList) != nParValues:
                            return S_ERROR(EWMSJDL, 'Inconsistent parametric job description')
                        parameterDict[seqID]['ParameterList'] = parList
                    else:
                        if attribute != "Parameters":
                            return S_ERROR(EWMSJDL, 'Inconsistent parametric job description')
                        nPar = jobClassAd.getAttributeInt(attribute)
                        if nPar is None:
                            value = jobClassAd.get_expression(attribute)
                            return S_ERROR(EWMSJDL,
                                           'Inconsistent parametric job description: %s=%s' % (attribute, value))
                        parameterDict[seqID]['Parameters'] = nPar
                else:
                    value = jobClassAd.getAttributeInt(attribute)
                    if value is None:
                        value = jobClassAd.getAttributeFloat(attribute)
                        if value is None:
                            value = jobClassAd.get_expression(attribute)
                            return S_ERROR('Illegal value for %s JDL field: %s' % (attribute, value))
                    parameterDict[seqID][key] = value

    if '0' in parameterDict and not parameterDict.get('0'):
        parameterDict.pop('0')

    parameterLists = {}
    for seqID in parameterDict:
        parList = __getParameterSequence(nParValues,
                                         parList=parameterDict[seqID].get('ParameterList', []),
                                         parStart=parameterDict[seqID].get('ParameterStart', 1),
                                         parStep=parameterDict[seqID].get('ParameterStep', 0),
                                         parFactor=parameterDict[seqID].get('ParameterFactor', 1))
        if not parList:
            return S_ERROR(EWMSJDL, 'Inconsistent parametric job description')
        parameterLists[seqID] = parList

    jobDescList = []
    jobDesc = jobClassAd.asJDL()
    # Width of the sequential parameter number
    zLength = len(str(nParValues - 1))
    for n in range(nParValues):
        newJobDesc = jobDesc.replace('%n', str(n).zfill(zLength))
        newClassAd = ClassAd(newJobDesc)
        for seqID in parameterLists:
            parameter = parameterLists[seqID][n]
            for attribute in newClassAd.getAttributes():
                __updateAttribute(newClassAd, attribute, seqID, str(parameter))
        for seqID in parameterLists:
            for attribute in ['Parameters', 'ParameterStart', 'ParameterStep', 'ParameterFactor']:
                if seqID == '0':
                    newClassAd.deleteAttribute(attribute)
                else:
                    newClassAd.deleteAttribute('%s.%s' % (attribute, seqID))
            parameter = parameterLists[seqID][n]
            if seqID == '0':
                attribute = 'Parameter'
            else:
                attribute = 'Parameter.%s' % seqID
            if isinstance(parameter, six.string_types) and parameter.startswith('{'):
                # A value starting with '{' is a ClassAd list expression: insert it raw
                newClassAd.insertAttributeInt(attribute, str(parameter))
            else:
                newClassAd.insertAttributeString(attribute, str(parameter))
        newClassAd.insertAttributeInt('ParameterNumber', n)
        newJDL = newClassAd.asJDL()
        jobDescList.append(newJDL)

    return S_OK(jobDescList)
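The helper `__getParameterSequence` is not shown here; a minimal sketch of the expansion it presumably performs, assuming an explicit parameter list takes precedence and numeric parameters follow the recurrence `x[i+1] = x[i] * factor + step` (the function name and signature below are hypothetical):

```python
def parameter_sequence(n_values, par_start=1, par_step=0, par_factor=1, par_list=None):
    """Expand a parametric-job sequence.

    An explicit list wins; otherwise apply the recurrence
    x[i+1] = x[i] * factor + step, starting at par_start.
    """
    if par_list:
        return list(par_list)
    values = []
    value = par_start
    for _ in range(n_values):
        values.append(value)
        value = value * par_factor + par_step
    return values

# start=1, step=2, factor=2 gives 1, 1*2+2=4, 4*2+2=10, 10*2+2=22
print(parameter_sequence(4, par_start=1, par_step=2, par_factor=2))
```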
def receiveData(self, maxBufferSize=0, blockAfterKeepAlive=True, idleReceive=False):
    self.__updateLastActionTimestamp()
    if self.receivedMessages:
        return self.receivedMessages.pop(0)
    # Buffer size can't be less than 0
    maxBufferSize = max(maxBufferSize, 0)
    try:
        # Look for either the message length or the keep-alive magic string
        iSeparatorPosition = self.byteStream.find(":", 0, 10)
        keepAliveMagicLen = len(BaseTransport.keepAliveMagic)
        isKeepAlive = self.byteStream.find(BaseTransport.keepAliveMagic, 0, keepAliveMagicLen) == 0
        # While neither the message length nor the keep-alive is found, keep receiving
        while iSeparatorPosition == -1 and not isKeepAlive:
            retVal = self._read(16384)
            # If error, return
            if not retVal['OK']:
                return retVal
            # If closed, return error
            if not retVal['Value']:
                return S_ERROR("Peer closed connection")
            # New data!
            self.byteStream += retVal['Value']
            # Look again for either the message length or the keep-alive magic string
            iSeparatorPosition = self.byteStream.find(":", 0, 10)
            isKeepAlive = self.byteStream.find(BaseTransport.keepAliveMagic, 0, keepAliveMagicLen) == 0
            # Over the limit?
            if maxBufferSize and len(self.byteStream) > maxBufferSize and iSeparatorPosition == -1:
                return S_ERROR("Read limit exceeded (%s chars)" % maxBufferSize)
        # Keep-alive magic!
        if isKeepAlive:
            gLogger.debug("Received keep alive header")
            # Remove the keep-alive magic from the buffer and process the keep-alive
            self.byteStream = self.byteStream[keepAliveMagicLen:]
            return self.__processKeepAlive(maxBufferSize, blockAfterKeepAlive)
        # From here on it must be a real message!
        # Process the size and remove the message length from the byte stream
        pkgSize = int(self.byteStream[:iSeparatorPosition])
        pkgData = self.byteStream[iSeparatorPosition + 1:]
        readSize = len(pkgData)
        if readSize >= pkgSize:
            # We already have all the data we need
            data = pkgData[:pkgSize]
            self.byteStream = pkgData[pkgSize:]
        else:
            # We still need to read more
            pkgMem = cStringIO.StringIO()
            pkgMem.write(pkgData)
            # Receive while there is still data to be received
            while readSize < pkgSize:
                retVal = self._read(pkgSize - readSize, skipReadyCheck=True)
                if not retVal['OK']:
                    return retVal
                if not retVal['Value']:
                    return S_ERROR("Peer closed connection")
                rcvData = retVal['Value']
                readSize += len(rcvData)
                pkgMem.write(rcvData)
                if maxBufferSize and readSize > maxBufferSize:
                    return S_ERROR("Read limit exceeded (%s chars)" % maxBufferSize)
            # Data is here! Take it out of the byte stream, decode and return
            if readSize == pkgSize:
                data = pkgMem.getvalue()
                self.byteStream = ""
            else:  # readSize > pkgSize
                pkgMem.seek(0, 0)
                data = pkgMem.read(pkgSize)
                self.byteStream = pkgMem.read()
        try:
            data = DEncode.decode(data)[0]
        except Exception as e:
            return S_ERROR("Could not decode received data: %s" % str(e))
        if idleReceive:
            self.receivedMessages.append(data)
            return S_OK()
        return data
    except Exception as e:
        gLogger.exception("Network error while receiving data")
        return S_ERROR("Network error while receiving data: %s" % str(e))
def run(self):
    """Task execution

    Reads a ProcessTask out of the pending queue, executes it, and then
    pushes it to the results queue for callback execution.

    :param self: self reference
    """
    # Start watchdog thread
    self.__watchdogThread = threading.Thread(target=self.__watchdog)
    self.__watchdogThread.daemon = True
    self.__watchdogThread.start()
    if LockRing:
        # Reset all locks
        lr = LockRing()
        lr._openAll()
        lr._setAllEvents()
    # Zero processed task counter
    taskCounter = 0
    # Zero idle loop counter
    idleLoopCount = 0
    # Main loop
    while True:
        # Draining: stopEvent is set, exiting
        if self.__stopEvent.is_set():
            return
        # Clear task
        self.task = None
        # Read from queue
        try:
            task = self.__pendingQueue.get(block=True, timeout=10)
        except Queue.Empty:
            # Idle loop?
            idleLoopCount += 1
            # 10th idle loop - exit, nothing to do
            if idleLoopCount == 10:
                return
            continue
        # Toggle __working flag
        self.__working.value = 1
        # Save task
        self.task = task
        # Reset idle loop counter
        idleLoopCount = 0
        # Process task in a separate thread
        self.__processThread = threading.Thread(target=self.__processTask)
        self.__processThread.start()
        # Join processThread, with or without timeout
        if self.task.getTimeOut():
            self.__processThread.join(self.task.getTimeOut() + 10)
        else:
            self.__processThread.join()
        # processThread is still alive? Stop it!
        if self.__processThread.is_alive():
            self.__processThread._Thread__stop()
        # Check for results and callbacks, put task to results queue
        if self.task.hasCallback() or self.task.hasPoolCallback():
            if not self.task.taskResults() and not self.task.taskException():
                self.task.setResult(S_ERROR("Timed out"))
            self.__resultsQueue.put(task)
        # Increase task counter
        taskCounter += 1
        self.__taskCounter = taskCounter
        # Toggle __working flag
        self.__working.value = 0
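The idle-exit pattern in the loop above (leave the worker after ten consecutive empty polls of the pending queue) can be sketched in isolation. This is a simplified, hypothetical stand-in, not the ProcessPool worker itself: the task here is just a number that gets doubled, and the poll timeout is shortened so the example runs quickly.

```python
import queue
import threading


def worker(pending, results, idle_polls=3, poll_timeout=0.01):
    """Drain tasks from `pending`; exit after `idle_polls` consecutive empty polls."""
    idle = 0
    while True:
        try:
            task = pending.get(block=True, timeout=poll_timeout)
        except queue.Empty:
            idle += 1
            if idle == idle_polls:
                return  # nothing to do for a while, let the worker die
            continue
        idle = 0  # got work, reset the idle counter
        results.put(task * 2)  # stand-in for real task execution


pending = queue.Queue()
results = queue.Queue()
for n in (1, 2, 3):
    pending.put(n)

t = threading.Thread(target=worker, args=(pending, results))
t.start()
t.join()  # returns once the worker has drained the queue and idled out
```

The same reset-on-work / exit-on-N-idle-polls shape is what keeps pool workers from lingering forever when the pool drains.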
def receiveData(self, maxBufferSize=0, blockAfterKeepAlive=True, idleReceive=False):
    self.__updateLastActionTimestamp()
    if self.receivedMessages:
        return self.receivedMessages.pop(0)
    # Buffer size can't be less than 0
    maxBufferSize = max(maxBufferSize, 0)
    try:
        # Look for either the message length or the keep-alive magic string
        iSeparatorPosition = self.byteStream.find(":", 0, 10)
        keepAliveMagicLen = len(BaseTransport.keepAliveMagic)
        isKeepAlive = self.byteStream.find(BaseTransport.keepAliveMagic, 0, keepAliveMagicLen) == 0
        # While neither the message length nor the keep-alive is found, keep receiving
        while iSeparatorPosition == -1 and not isKeepAlive:
            retVal = self._read(1024)
            # If error, return
            if not retVal['OK']:
                return retVal
            # If closed, return error
            if not retVal['Value']:
                return S_ERROR("Peer closed connection")
            # New data!
            self.byteStream += retVal['Value']
            # Look again for either the message length or the keep-alive magic string
            iSeparatorPosition = self.byteStream.find(":", 0, 10)
            isKeepAlive = self.byteStream.find(BaseTransport.keepAliveMagic, 0, keepAliveMagicLen) == 0
            # Over the limit?
            if maxBufferSize and len(self.byteStream) > maxBufferSize and iSeparatorPosition == -1:
                return S_ERROR("Read limit exceeded (%s chars)" % maxBufferSize)
        # Keep-alive magic!
        if isKeepAlive:
            gLogger.debug("Received keep alive header")
            # Remove the keep-alive magic from the buffer and process the keep-alive
            self.byteStream = self.byteStream[keepAliveMagicLen:]
            return self.__processKeepAlive(maxBufferSize, blockAfterKeepAlive)
        # From here on it must be a real message!
        # Process the size and remove the message length from the byte stream
        size = int(self.byteStream[:iSeparatorPosition])
        self.byteStream = self.byteStream[iSeparatorPosition + 1:]
        # Receive while there is still data to be received
        while len(self.byteStream) < size:
            retVal = self._read(size - len(self.byteStream), skipReadyCheck=True)
            if not retVal['OK']:
                return retVal
            if not retVal['Value']:
                return S_ERROR("Peer closed connection")
            self.byteStream += retVal['Value']
            if maxBufferSize and len(self.byteStream) > maxBufferSize:
                return S_ERROR("Read limit exceeded (%s chars)" % maxBufferSize)
        # Data is here! Take it out of the byte stream, decode and return
        data = self.byteStream[:size]
        self.byteStream = self.byteStream[size:]
        try:
            data = DEncode.decode(data)[0]
        except Exception as e:
            return S_ERROR("Could not decode received data: %s" % str(e))
        if idleReceive:
            self.receivedMessages.append(data)
            return S_OK()
        return data
    except Exception as e:
        gLogger.exception("Network error while receiving data")
        return S_ERROR("Network error while receiving data: %s" % str(e))
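Both `receiveData` variants parse the same netstring-like wire format: an ASCII decimal length (expected within the first 10 bytes), a `:` separator, then the payload. A minimal standalone sketch of that framing, with hypothetical helper names:

```python
def frame(payload):
    """Frame a message as b'<len>:<payload>'."""
    return str(len(payload)).encode() + b":" + payload


def deframe(stream):
    """Split one framed message off the front of a buffer.

    Returns (payload, rest) if a complete frame is present,
    else (None, stream) to signal that more bytes are needed.
    """
    sep = stream.find(b":", 0, 10)  # length must appear in the first 10 bytes
    if sep == -1:
        return None, stream
    size = int(stream[:sep])
    body = stream[sep + 1:]
    if len(body) < size:
        return None, stream  # incomplete: caller should read more
    return body[:size], body[size:]


print(frame(b"hello"))  # b'5:hello'
```

The leftover `rest` corresponds to `self.byteStream` being trimmed after each message, which is what lets several back-to-back messages arrive in a single read.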
def __getCAStore(self):
    SocketInfo.__cachedCAsCRLsLoadLock.acquire()
    try:
        if not SocketInfo.__cachedCAsCRLs or time.time() - SocketInfo.__cachedCAsCRLsLastLoaded > 900:
            # Need to generate the CA Store
            casDict = {}
            crlsDict = {}
            casPath = Locations.getCAsLocation()
            if not casPath:
                return S_ERROR("No valid CAs location found")
            gLogger.debug("CAs location is %s" % casPath)
            casFound = 0
            crlsFound = 0
            SocketInfo.__caStore = GSI.crypto.X509Store()
            for fileName in os.listdir(casPath):
                filePath = os.path.join(casPath, fileName)
                if not os.path.isfile(filePath):
                    continue
                with open(filePath, "rb") as fObj:
                    pemData = fObj.read()
                # Try to load CA cert
                try:
                    caCert = GSI.crypto.load_certificate(GSI.crypto.FILETYPE_PEM, pemData)
                    if caCert.has_expired():
                        continue
                    caID = (caCert.get_subject().one_line(), caCert.get_issuer().one_line())
                    caNotAfter = caCert.get_not_after()
                    if caID not in casDict:
                        casDict[caID] = (caNotAfter, caCert)
                        casFound += 1
                    elif casDict[caID][0] < caNotAfter:
                        casDict[caID] = (caNotAfter, caCert)
                    continue
                except BaseException:
                    if fileName.endswith(".0"):
                        gLogger.exception("LOADING %s" % filePath)
                if not self.infoDict.get('IgnoreCRLs'):
                    # Try to load CRL
                    try:
                        crl = GSI.crypto.load_crl(GSI.crypto.FILETYPE_PEM, pemData)
                        if crl.has_expired():
                            continue
                        crlID = crl.get_issuer().one_line()
                        crlsDict[crlID] = crl
                        crlsFound += 1
                        continue
                    except Exception as e:
                        if fileName.endswith(".r0"):
                            gLogger.exception("LOADING %s, Exception: %s" % (filePath, str(e)))
            gLogger.debug("Loaded %s CAs [%s CRLs]" % (casFound, crlsFound))
            SocketInfo.__cachedCAsCRLs = ([casDict[k][1] for k in casDict],
                                          [crlsDict[k] for k in crlsDict])
            SocketInfo.__cachedCAsCRLsLastLoaded = time.time()
    except BaseException:
        gLogger.exception("Failed to init CA store")
    finally:
        SocketInfo.__cachedCAsCRLsLoadLock.release()
    # Generate CA Store
    caStore = GSI.crypto.X509Store()
    caList = SocketInfo.__cachedCAsCRLs[0]
    for caCert in caList:
        caStore.add_cert(caCert)
    crlList = SocketInfo.__cachedCAsCRLs[1]
    for crl in crlList:
        caStore.add_crl(crl)
    return S_OK(caStore)
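`__getCAStore` caches the parsed CAs/CRLs and only re-reads the directory when the cache is older than 900 seconds. The TTL-cache pattern it uses can be sketched on its own; the names below (`get_cached`, `loader`) are illustrative, and the clock is injectable so the behaviour is testable without waiting.

```python
import time

_cache = {"value": None, "loaded": 0.0}
TTL = 900  # seconds, the same refresh window as the CA store above


def get_cached(loader, now=None):
    """Return the cached value, calling `loader` again only when the
    cache is empty or older than TTL."""
    now = time.time() if now is None else now
    if _cache["value"] is None or now - _cache["loaded"] > TTL:
        _cache["value"] = loader()
        _cache["loaded"] = now
    return _cache["value"]
```

In the real method the load is additionally wrapped in a lock (`__cachedCAsCRLsLoadLock`) so that concurrent handshakes do not all re-scan the CA directory at once.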
def _constructTransferJob(self, pinTime, allLFNs, target_spacetoken, protocols=None):
    """Build a job for transfer

    Some attributes of the job are expected to be set
      * sourceSE
      * targetSE
      * activity (optional)
      * priority (optional)
      * filesToSubmit
      * operationID (optional, used as metadata for the job)

    :param pinTime: pinning time in case staging is needed
    :param allLFNs: list of LFNs to transfer
    :param target_spacetoken: the space token of the target
    :param protocols: list of protocols to restrict the protocol choice for the transfer

    :return: S_OK( (job object, list of ftsFileIDs in the job) )
    """
    log = gLogger.getSubLogger(
        "constructTransferJob/%s/%s_%s" % (self.operationID, self.sourceSE, self.targetSE), True)

    res = self.__fetchSpaceToken(self.sourceSE, self.vo)
    if not res['OK']:
        return res
    source_spacetoken = res['Value']

    failedLFNs = set()
    dstSE = StorageElement(self.targetSE, vo=self.vo)
    srcSE = StorageElement(self.sourceSE, vo=self.vo)

    # If the source is not a tape SE, we should set the
    # copy_pin_lifetime and bring_online params to None,
    # otherwise they will do an extra useless queue in FTS
    sourceIsTape = self.__isTapeSE(self.sourceSE, self.vo)
    copy_pin_lifetime = pinTime if sourceIsTape else None
    bring_online = BRING_ONLINE_TIMEOUT if sourceIsTape else None

    # Get all the (source, dest) SURLs
    res = dstSE.generateTransferURLsBetweenSEs(allLFNs, srcSE, protocols=protocols)
    if not res['OK']:
        return res

    for lfn, reason in res['Value']['Failed'].items():
        failedLFNs.add(lfn)
        log.error("Could not get source SURL", "%s %s" % (lfn, reason))

    allSrcDstSURLs = res['Value']['Successful']

    # This contains the staging URLs if they are different from the transfer URLs
    # (CTA...)
    allStageURLs = dict()

    # In case we are transferring from a tape system, and the stage protocol
    # is not the same as the transfer protocol, we generate the staging URLs
    # to do a multihop transfer. See below.
    if sourceIsTape:
        srcProto, _destProto = res['Value']['Protocols']
        if srcProto not in srcSE.localStageProtocolList:
            # As of version 3.10, FTS can only handle one file per multihop
            # job. If we are here, that means that we need one, so make sure that
            # we only have a single file to transfer (this should have been checked
            # at the job construction step in FTS3Operation).
            # This test is important, because multiple files would result in the source
            # being deleted!
            if len(allLFNs) != 1:
                log.debug("Multihop job has %s files while only 1 allowed" % len(allLFNs))
                return S_ERROR(errno.E2BIG, "Trying multihop job with more than one file !")

            res = srcSE.getURL(allSrcDstSURLs, protocol=srcSE.localStageProtocolList)
            if not res['OK']:
                return res

            for lfn, reason in res['Value']['Failed'].items():
                failedLFNs.add(lfn)
                log.error("Could not get stage SURL", "%s %s" % (lfn, reason))
                allSrcDstSURLs.pop(lfn)

            allStageURLs = res['Value']['Successful']

    transfers = []
    fileIDsInTheJob = []
    for ftsFile in self.filesToSubmit:
        if ftsFile.lfn in failedLFNs:
            log.debug("Not preparing transfer for file %s" % ftsFile.lfn)
            continue

        sourceSURL, targetSURL = allSrcDstSURLs[ftsFile.lfn]
        stageURL = allStageURLs.get(ftsFile.lfn)

        if sourceSURL == targetSURL:
            log.error("sourceSURL equals targetSURL", "%s" % ftsFile.lfn)
            ftsFile.error = "sourceSURL equals to targetSURL"
            ftsFile.status = 'Defunct'
            continue

        ftsFileID = getattr(ftsFile, 'fileID')

        # Under normal circumstances, we simply submit an fts transfer as such:
        # * srcProto://myFile -> destProto://myFile
        #
        # Even in the case of the source storage being a tape system, it works fine.
        # However, if the staging and transfer protocols are different (which might
        # be the case for CTA), we use the multihop machinery to submit two
        # sequential fts transfers: one to stage, one to transfer.
        # It looks like this:
        # * stageProto://myFile -> stageProto://myFile
        # * srcProto://myFile -> destProto://myFile
        if stageURL:
            # We do not set a fileID in the metadata
            # such that we do not update the DB when monitoring
            stageTrans_metadata = {'desc': 'PreStage %s' % ftsFileID}
            stageTrans = fts3.new_transfer(stageURL,
                                           stageURL,
                                           checksum='ADLER32:%s' % ftsFile.checksum,
                                           filesize=ftsFile.size,
                                           metadata=stageTrans_metadata,
                                           activity=self.activity)
            transfers.append(stageTrans)

        trans_metadata = {'desc': 'Transfer %s' % ftsFileID, 'fileID': ftsFileID}
        trans = fts3.new_transfer(sourceSURL,
                                  targetSURL,
                                  checksum='ADLER32:%s' % ftsFile.checksum,
                                  filesize=ftsFile.size,
                                  metadata=trans_metadata,
                                  activity=self.activity)
        transfers.append(trans)
        fileIDsInTheJob.append(ftsFileID)

    if not transfers:
        log.error("No transfer possible!")
        return S_ERROR("No transfer possible")

    # We add a few metadata to the fts job so that we can reuse them later on
    # without querying our DB.
    # Source and target SE are just used for accounting purposes.
    job_metadata = {
        'operationID': self.operationID,
        'rmsReqID': self.rmsReqID,
        'sourceSE': self.sourceSE,
        'targetSE': self.targetSE
    }

    job = fts3.new_job(
        transfers=transfers,
        overwrite=True,
        source_spacetoken=source_spacetoken,
        spacetoken=target_spacetoken,
        bring_online=bring_online,
        copy_pin_lifetime=copy_pin_lifetime,
        retry=3,
        # Only check target vs specified, since we verify the source earlier
        verify_checksum='target',
        # If we have stage URLs, then we need multihop
        multihop=bool(allStageURLs),
        metadata=job_metadata,
        priority=self.priority)

    return S_OK((job, fileIDsInTheJob))
def clone(self):
    try:
        return S_OK(SocketInfo(dict(self.infoDict), self.sslContext))
    except Exception as e:
        return S_ERROR(str(e))
def monitor(self, context=None, ftsServer=None, ucert=None):
    """Query the fts server to monitor the job.

    The internal state of the object is updated depending on the monitoring result.
    In case the job is not found on the server, the status is set to 'Failed'.

    Within a job, only the transfers having a `fileID` metadata are considered.
    This is to allow for multihop jobs doing a staging.

    This method assumes that the attribute self.ftsGUID is set.

    :param context: fts3 context. If not given, it is created (see ftsServer & ucert param)
    :param ftsServer: the address of the fts server to submit to. Used only if context is
                      not given. If not given either, use the ftsServer object attribute
    :param ucert: path to the user certificate/proxy. Might be inferred by the fts cli (see its doc)

    :returns: {FileID: {status, error}}

    Possible error numbers
      * errno.ESRCH: if the job does not exist on the server
      * errno.EDEADLK: in case the job and file status are inconsistent (see comments inside the code)
    """
    if not self.ftsGUID:
        return S_ERROR("FTSGUID not set, FTS job not submitted?")

    if not context:
        if not ftsServer:
            ftsServer = self.ftsServer
        context = fts3.Context(endpoint=ftsServer,
                               ucert=ucert,
                               request_class=ftsSSLRequest,
                               verify=False)

    jobStatusDict = None
    try:
        jobStatusDict = fts3.get_job_status(context, self.ftsGUID, list_files=True)
    # The job is not found: set its status to Failed and return
    except NotFound:
        self.status = 'Failed'
        return S_ERROR(errno.ESRCH,
                       "FTSGUID %s not found on %s" % (self.ftsGUID, self.ftsServer))
    except FTS3ClientException as e:
        return S_ERROR("Error getting the job status %s" % e)

    now = datetime.datetime.utcnow().replace(microsecond=0)
    self.lastMonitor = now

    newStatus = jobStatusDict['job_state'].capitalize()
    if newStatus != self.status:
        self.status = newStatus
        self.lastUpdate = now
        self.error = jobStatusDict['reason']

    if newStatus in self.FINAL_STATES:
        self._fillAccountingDict(jobStatusDict)

    filesInfoList = jobStatusDict['files']
    filesStatus = {}
    statusSummary = {}

    # Make a copy, since we are potentially deleting objects
    for fileDict in list(filesInfoList):
        file_state = fileDict['file_state'].capitalize()
        file_metadata = fileDict['file_metadata']

        # Previous versions of the code did not have a dictionary as file_metadata
        if isinstance(file_metadata, dict):
            file_id = file_metadata.get('fileID')
        else:
            file_id = file_metadata

        # The transfer does not have a fileID attached to it,
        # so it does not correspond to a file in our DB: skip it
        # (typical of jobs with a different staging protocol, e.g. CTA).
        # We also remove it from filesInfoList, such that it is
        # not considered for accounting
        if not file_id:
            filesInfoList.remove(fileDict)
            continue

        file_error = fileDict['reason']
        filesStatus[file_id] = {'status': file_state, 'error': file_error}

        # If the state of the file is final for FTS, set the ftsGUID of the file to
        # None, such that it is "released" from this job and not updated anymore in
        # future monitoring calls
        if file_state in FTS3File.FTS_FINAL_STATES:
            filesStatus[file_id]['ftsGUID'] = None
        # If the file is not in a final state, but the job is, we return an error.
        # FTS can have inconsistencies where the FTS job is in a final state
        # but not all the files are.
        # The inconsistencies are cleaned every hour on the FTS side.
        # https://its.cern.ch/jira/browse/FTS-1482
        elif self.status in self.FINAL_STATES:
            return S_ERROR(
                errno.EDEADLK, "Job %s in a final state (%s) while File %s is not (%s)" %
                (self.ftsGUID, self.status, file_id, file_state))

        statusSummary[file_state] = statusSummary.get(file_state, 0) + 1

    # We have removed all the intermediate transfers that we are not interested in,
    # so we put this back into the monitoring data such that the accounting is done properly
    jobStatusDict['files'] = filesInfoList
    if newStatus in self.FINAL_STATES:
        self._fillAccountingDict(jobStatusDict)

    total = len(filesInfoList)
    completed = sum(statusSummary.get(state, 0) for state in FTS3File.FTS_FINAL_STATES)
    self.completeness = int(100 * completed / total)

    return S_OK(filesStatus)
def checkSanity(urlTuple, kwargs):
    """Check that all the ssl environment is ok"""
    useCerts = False
    certFile = ""
    if "useCertificates" in kwargs and kwargs["useCertificates"]:
        certTuple = Locations.getHostCertificateAndKeyLocation()
        if not certTuple:
            gLogger.error("No cert/key found!")
            return S_ERROR("No cert/key found!")
        certFile = certTuple[0]
        useCerts = True
    elif "proxyString" in kwargs:
        if not isinstance(kwargs["proxyString"], six.string_types if six.PY2 else bytes):
            gLogger.error("proxyString parameter is not a valid type",
                          str(type(kwargs["proxyString"])))
            return S_ERROR("proxyString parameter is not a valid type")
    else:
        if "proxyLocation" in kwargs:
            certFile = kwargs["proxyLocation"]
        else:
            certFile = Locations.getProxyLocation()
        if not certFile:
            gLogger.error("No proxy found")
            return S_ERROR("No proxy found")
        elif not os.path.isfile(certFile):
            gLogger.error("Proxy file does not exist", certFile)
            return S_ERROR("%s proxy file does not exist" % certFile)

    # For certs always check CAs. For clients, skipServerIdentityCheck
    if "skipCACheck" not in kwargs or not kwargs["skipCACheck"]:
        if not Locations.getCAsLocation():
            gLogger.error("No CAs found!")
            return S_ERROR("No CAs found!")

    if "proxyString" in kwargs:
        certObj = X509Chain()
        retVal = certObj.loadChainFromString(kwargs["proxyString"])
        if not retVal["OK"]:
            gLogger.error("Can't load proxy string")
            return S_ERROR("Can't load proxy string")
    else:
        if useCerts:
            certObj = X509Certificate()
            certObj.loadFromFile(certFile)
        else:
            certObj = X509Chain()
            certObj.loadChainFromFile(certFile)
        retVal = certObj.hasExpired()
        if not retVal["OK"]:
            gLogger.error("Can't verify proxy or certificate file",
                          "%s:%s" % (certFile, retVal["Message"]))
            return S_ERROR("Can't verify file %s:%s" % (certFile, retVal["Message"]))
        elif retVal["Value"]:
            notAfter = certObj.getNotAfterDate()
            if notAfter["OK"]:
                notAfter = notAfter["Value"]
            else:
                notAfter = "unknown"
            gLogger.error("PEM file has expired",
                          "%s is not valid after %s" % (certFile, notAfter))
            return S_ERROR("PEM file %s has expired, not valid after %s" % (certFile, notAfter))

    idDict = {}
    retVal = certObj.getDIRACGroup(ignoreDefault=True)
    if retVal["OK"] and retVal["Value"] is not False:
        idDict["group"] = retVal["Value"]
    if useCerts:
        idDict["DN"] = certObj.getSubjectDN()["Value"]
    else:
        idDict["DN"] = certObj.getIssuerCert()["Value"].getSubjectDN()["Value"]
    return S_OK(idDict)