class OperationHandlerBase( object ):
  """
  .. class:: OperationHandlerBase

  request operation handler base class
  """
  __metaclass__ = DynamicProps

  # # private data logging client
  # __dataLoggingClient = None
  # # private ResourceStatusClient
  __rssClient = None
  # # shifter list
  __shifterList = []

  def __init__( self, operation = None, csPath = None ):
    """c'tor

    :param Operation operation: Operation instance
    :param str csPath: config path in CS for this operation
    """
    # # placeholders for operation and request
    self.operation = None
    self.request = None
    self.dm = DataManager()
    self.fc = FileCatalog()
    self.csPath = csPath if csPath else ""
    # # get name
    name = self.__class__.__name__
    # # all options are r/o properties now
    csOptionsDict = gConfig.getOptionsDict( self.csPath )
    csOptionsDict = csOptionsDict.get( "Value", {} )
    for option, value in csOptionsDict.iteritems():
      # # hack to set proper types
      try:
        value = eval( value )
      except NameError:
        pass
      self.makeProperty( option, value, True )  # pylint: disable=no-member
    # # pre setup logger
    self.log = gLogger.getSubLogger( name, True )
    # # set log level
    logLevel = getattr( self, "LogLevel" ) if hasattr( self, "LogLevel" ) else "INFO"
    self.log.setLevel( logLevel )
    # # list properties
    for option in csOptionsDict:
      self.log.debug( "%s = %s" % ( option, getattr( self, option ) ) )
    # # setup operation
    if operation:
      self.setOperation( operation )
    # # initialize at least
    if hasattr( self, "initialize" ) and callable( getattr( self, "initialize" ) ):
      getattr( self, "initialize" )()

  def setOperation( self, operation ):
    """ operation and request setter

    :param ~DIRAC.RequestManagementSystem.Client.Operation.Operation operation: operation instance
    :raises TypeError: if `operation` is not an instance of
        :class:`~DIRAC.RequestManagementSystem.Client.Operation.Operation`
    """
    if not isinstance( operation, Operation ):
      raise TypeError( "expecting Operation instance" )
    self.operation = operation
    self.request = operation._parent
    self.log = gLogger.getSubLogger( "pid_%s/%s/%s/%s" % ( os.getpid(),
                                                           self.request.RequestName,
                                                           self.request.Order,
                                                           self.operation.Type ) )

  # @classmethod
  # def dataLoggingClient( cls ):
  #   """ DataLoggingClient getter """
  #   if not cls.__dataLoggingClient:
  #     from DIRAC.DataManagementSystem.Client.DataLoggingClient import DataLoggingClient
  #     cls.__dataLoggingClient = DataLoggingClient()
  #   return cls.__dataLoggingClient

  @classmethod
  def rssClient( cls ):
    """ ResourceStatusClient getter """
    if not cls.__rssClient:
      from DIRAC.ResourceStatusSystem.Client.ResourceStatus import ResourceStatus
      cls.__rssClient = ResourceStatus()
    return cls.__rssClient

  def getProxyForLFN( self, lfn ):
    """ get proxy for lfn

    :param str lfn: LFN
    :return: S_ERROR or S_OK( "/path/to/proxy/file" )
    """
    dirMeta = returnSingleResult( self.fc.getDirectoryMetadata( os.path.dirname( lfn ) ) )
    if not dirMeta["OK"]:
      return dirMeta
    dirMeta = dirMeta["Value"]

    ownerRole = "/%s" % dirMeta["OwnerRole"] if not dirMeta["OwnerRole"].startswith( "/" ) else dirMeta["OwnerRole"]
    ownerDN = dirMeta["OwnerDN"]

    ownerProxy = None
    for ownerGroup in getGroupsWithVOMSAttribute( ownerRole ):
      vomsProxy = gProxyManager.downloadVOMSProxy( ownerDN, ownerGroup, limited = True,
                                                   requiredVOMSAttribute = ownerRole )
      if not vomsProxy["OK"]:
        self.log.debug( "getProxyForLFN: failed to get VOMS proxy for %s role=%s: %s" % ( ownerDN,
                                                                                          ownerRole,
                                                                                          vomsProxy["Message"] ) )
        continue
      ownerProxy = vomsProxy["Value"]
      self.log.debug( "getProxyForLFN: got proxy for %s@%s [%s]" % ( ownerDN, ownerGroup, ownerRole ) )
      break

    if not ownerProxy:
      return S_ERROR( "Unable to get owner proxy" )

    dumpToFile = ownerProxy.dumpAllToFile()
    if not dumpToFile["OK"]:
      self.log.error( "getProxyForLFN: error dumping proxy to file: %s" % dumpToFile["Message"] )
    else:
      os.environ["X509_USER_PROXY"] = dumpToFile["Value"]
    return dumpToFile

  def getWaitingFilesList( self ):
    """ prepare waiting files list, update Attempt, filter out MaxAttempt """
    if not self.operation:
      self.log.warning( "getWaitingFilesList: operation not set, returning empty list" )
      return []
    waitingFiles = [ opFile for opFile in self.operation if opFile.Status == "Waiting" ]
    for opFile in waitingFiles:
      opFile.Attempt += 1
      maxAttempts = getattr( self, "MaxAttempts" ) if hasattr( self, "MaxAttempts" ) else 1024
      if opFile.Attempt > maxAttempts:
        opFile.Status = "Failed"
        if opFile.Error is None:
          opFile.Error = ''
        opFile.Error += " (Max attempts limit reached)"
    return [ opFile for opFile in self.operation if opFile.Status == "Waiting" ]

  def rssSEStatus( self, se, status, retries = 2 ):
    """ check SE :se: for status :status:

    :param str se: SE name
    :param str status: RSS status
    """
    # Allow a transient failure
    for _i in range( retries ):
      rssStatus = self.rssClient().getElementStatus( se, "StorageElement", status )
      # gLogger.always( rssStatus )
      if rssStatus["OK"]:
        return S_OK( rssStatus["Value"][se][status] != "Banned" )
    return S_ERROR( "%s status not found in RSS for SE %s" % ( status, se ) )

  @property
  def shifter( self ):
    return self.__shifterList

  @shifter.setter
  def shifter( self, shifterList ):
    self.__shifterList = shifterList

  def __call__( self ):
    """ this one should be implemented in the inherited class

    should return S_OK/S_ERROR
    """
    raise NotImplementedError( "Implement me please!" )
class StorageUsageAgent(AgentModule):
    ''' .. class:: StorageUsageAgent

    :param FileCatalog catalog: FileCatalog instance
    :param mixed storageUsage: StorageUsageDB instance or its rpc client
    :param int pollingTime: polling time
    :param int activePeriod: active period in weeks
    :param threading.Lock dataLock: data lock
    :param threading.Lock replicaListLock: replica list lock
    :param DictCache proxyCache: creds cache
    '''
    catalog = None
    storageUsage = None
    pollingTime = 43200
    activePeriod = 0
    dataLock = None  # threading.Lock()
    replicaListLock = None  # threading.Lock()
    proxyCache = None  # DictCache()
    enableStartupSleep = True  # Enable a random sleep so not all the user agents start together

    def __init__(self, *args, **kwargs):
        ''' c'tor '''
        AgentModule.__init__(self, *args, **kwargs)
        self.__baseDir = '/lhcb'
        self.__baseDirLabel = "_".join(List.fromChar(self.__baseDir, "/"))
        self.__ignoreDirsList = []
        self.__keepDirLevels = 4
        self.__startExecutionTime = long(time.time())
        self.__dirExplorer = DirectoryExplorer(reverse=True)
        self.__processedDirs = 0
        self.__directoryOwners = {}
        self.catalog = FileCatalog()
        self.__maxToPublish = self.am_getOption('MaxDirectories', 5000)
        if self.am_getOption('DirectDB', False):
            self.storageUsage = StorageUsageDB()
        else:
            # Set a timeout of 0.1 seconds per directory (factor 5 margin)
            self.storageUsage = RPCClient('DataManagement/StorageUsage',
                                          timeout=self.am_getOption('Timeout',
                                                                    int(self.__maxToPublish * 0.1)))
        self.activePeriod = self.am_getOption('ActivePeriod', self.activePeriod)
        self.dataLock = threading.Lock()
        self.replicaListLock = threading.Lock()
        self.proxyCache = DictCache(removeProxy)
        self.__noProxy = set()
        self.__catalogType = None
        self.__recalculateUsage = Operations().getValue('DataManagement/RecalculateDirSize', False)
        self.enableStartupSleep = self.am_getOption('EnableStartupSleep', self.enableStartupSleep)
        self.__publishDirQueue = {}
        self.__dirsToPublish = {}
        self.__replicaFilesUsed = set()
        self.__replicaListFilesDir = ""

    def initialize(self):
        ''' agent initialisation '''
        self.am_setOption("PollingTime", self.pollingTime)
        if self.enableStartupSleep:
            rndSleep = random.randint(1, self.pollingTime)
            self.log.info("Sleeping for %s seconds" % rndSleep)
            time.sleep(rndSleep)
        # This sets the Default Proxy to be used as that defined under
        # /Operations/Shifter/DataManager
        # the shifterProxy option in the Configuration can be used to change this default.
        self.am_setOption('shifterProxy', 'DataManager')
        return S_OK()

    def __writeReplicasListFiles(self, dirPathList):
        ''' dump replicas list to files '''
        self.replicaListLock.acquire()
        try:
            self.log.info("Dumping replicas for %s dirs" % len(dirPathList))
            result = self.catalog.getDirectoryReplicas(dirPathList)
            if not result['OK']:
                self.log.error("Could not get directory replicas",
                               "%s -> %s" % (dirPathList, result['Message']))
                return result
            resData = result['Value']
            filesOpened = {}
            for dirPath in dirPathList:
                if dirPath in result['Value']['Failed']:
                    self.log.error("Could not get directory replicas",
                                   "%s -> %s" % (dirPath, resData['Failed'][dirPath]))
                    continue
                dirData = resData['Successful'][dirPath]
                for lfn in dirData:
                    for seName in dirData[lfn]:
                        # Check if the SE file is opened and if not open it: truncate it
                        # the first time the SE is seen in this agent cycle, append afterwards
                        if seName not in filesOpened:
                            filePath = os.path.join(self.__replicaListFilesDir,
                                                    "replicas.%s.%s.filling" % (seName, self.__baseDirLabel))
                            if seName not in self.__replicaFilesUsed:
                                self.__replicaFilesUsed.add(seName)
                                filesOpened[seName] = open(filePath, "w")
                            else:
                                filesOpened[seName] = open(filePath, "a")
                        # seName file is opened. Write
                        filesOpened[seName].write("%s -> %s\n" % (lfn, dirData[lfn][seName]))
            # Close the files
            for seName in filesOpened:
                filesOpened[seName].close()
            return S_OK()
        finally:
            self.replicaListLock.release()

    def __resetReplicaListFiles(self):
        ''' prepare directories for replica list files '''
        self.__replicaFilesUsed = set()
        self.__replicaListFilesDir = os.path.join(self.am_getOption("WorkDirectory"), "replicaLists")
        mkDir(self.__replicaListFilesDir)
        self.log.info("Replica Lists directory is %s" % self.__replicaListFilesDir)

    def __replicaListFilesDone(self):
        ''' rotate replicas list files '''
        self.replicaListLock.acquire()
        try:
            old = re.compile(r"^replicas\.([a-zA-Z0-9\-_]*)\.%s\.old$" % self.__baseDirLabel)
            current = re.compile(r"^replicas\.([a-zA-Z0-9\-_]*)\.%s$" % self.__baseDirLabel)
            filling = re.compile(r"^replicas\.([a-zA-Z0-9\-_]*)\.%s\.filling$" % self.__baseDirLabel)
            # Delete old
            for fileName in os.listdir(self.__replicaListFilesDir):
                match = old.match(fileName)
                if match:
                    os.unlink(os.path.join(self.__replicaListFilesDir, fileName))
            # Current -> old
            for fileName in os.listdir(self.__replicaListFilesDir):
                match = current.match(fileName)
                if match:
                    newFileName = "replicas.%s.%s.old" % (match.group(1), self.__baseDirLabel)
                    self.log.info("Moving \n %s\n to \n %s" %
                                  (os.path.join(self.__replicaListFilesDir, fileName),
                                   os.path.join(self.__replicaListFilesDir, newFileName)))
                    os.rename(os.path.join(self.__replicaListFilesDir, fileName),
                              os.path.join(self.__replicaListFilesDir, newFileName))
            # Filling -> current
            for fileName in os.listdir(self.__replicaListFilesDir):
                match = filling.match(fileName)
                if match:
                    newFileName = "replicas.%s.%s" % (match.group(1), self.__baseDirLabel)
                    self.log.info("Moving \n %s\n to \n %s" %
                                  (os.path.join(self.__replicaListFilesDir, fileName),
                                   os.path.join(self.__replicaListFilesDir, newFileName)))
                    os.rename(os.path.join(self.__replicaListFilesDir, fileName),
                              os.path.join(self.__replicaListFilesDir, newFileName))
            return S_OK()
        finally:
            self.replicaListLock.release()

    def __printSummary(self):
        ''' pretty print summary '''
        res = self.storageUsage.getStorageSummary()
        if res['OK']:
            self.log.notice("Storage Usage Summary")
            self.log.notice("============================================================")
            self.log.notice("%-40s %20s %20s" % ('Storage Element', 'Number of files', 'Total size'))

            for se in sorted(res['Value']):
                site = se.split('_')[0].split('-')[0]
                gMonitor.registerActivity("%s-used" % se, "%s usage" % se,
                                          "StorageUsage/%s usage" % site, "",
                                          gMonitor.OP_MEAN, bucketLength=600)
                gMonitor.registerActivity("%s-files" % se, "%s files" % se,
                                          "StorageUsage/%s files" % site, "Files",
                                          gMonitor.OP_MEAN, bucketLength=600)

            time.sleep(2)

            for se in sorted(res['Value']):
                usage = res['Value'][se]['Size']
                files = res['Value'][se]['Files']
                self.log.notice("%-40s %20s %20s" % (se, str(files), str(usage)))
                gMonitor.addMark("%s-used" % se, usage)
                gMonitor.addMark("%s-files" % se, files)

    def execute(self):
        ''' execution in one cycle '''
        self.__publishDirQueue = {}
        self.__dirsToPublish = {}
        self.__baseDir = self.am_getOption('BaseDirectory', '/lhcb')
        self.__baseDirLabel = "_".join(List.fromChar(self.__baseDir, "/"))
        self.__ignoreDirsList = self.am_getOption('Ignore', [])
        self.__keepDirLevels = self.am_getOption("KeepDirLevels", 4)
        self.__startExecutionTime = long(time.time())
        self.__dirExplorer = DirectoryExplorer(reverse=True)
        self.__resetReplicaListFiles()
        self.__noProxy = set()
        self.__processedDirs = 0
        self.__directoryOwners = {}

        self.__printSummary()

        self.__dirExplorer.addDir(self.__baseDir)
        self.log.notice("Initiating with %s as base directory." % self.__baseDir)
        # Loop over all the directories and sub-directories
        totalIterTime = 0.0
        numIterations = 0.0
        iterMaxDirs = 100
        while self.__dirExplorer.isActive():
            startT = time.time()
            d2E = [self.__dirExplorer.getNextDir()
                   for _i in xrange(iterMaxDirs) if self.__dirExplorer.isActive()]
            self.__exploreDirList(d2E)
            iterTime = time.time() - startT
            totalIterTime += iterTime
            numIterations += len(d2E)
            self.log.verbose("Query took %.2f seconds for %s dirs" % (iterTime, len(d2E)))
            self.log.verbose("Average query time: %.2f secs/dir" % (totalIterTime / numIterations))

        # Publish remaining directories
        self.__publishData(background=False)

        # Move replica list files
        self.__replicaListFilesDone()

        # Clean records older than 1 day
        self.log.info("Finished recursive directory search.")

        if self.am_getOption("PurgeOutdatedRecords", True):
            elapsedTime = time.time() - self.__startExecutionTime
            outdatedSeconds = max(max(self.am_getOption("PollingTime"), elapsedTime) * 2, 86400)
            result = self.storageUsage.purgeOutdatedEntries(self.__baseDir,
                                                            long(outdatedSeconds),
                                                            self.__ignoreDirsList)
            if not result['OK']:
                return result
            self.log.notice("Purged %s outdated records" % result['Value'])
        return S_OK()

    def __exploreDirList(self, dirList):
        ''' collect directory size for each directory in :dirList: '''
        # Normalise dirList first
        dirList = [os.path.realpath(d) for d in dirList]
        self.log.notice("Retrieving info for %s dirs" % len(dirList))
        # For top directories, no files anyway, hence no need to get full size
        dirContents = {}
        failed = {}
        successfull = {}
        startTime = time.time()
        nbDirs = len(dirList)
        chunkSize = 10
        if self.__catalogType == 'DFC' or dirList == [self.__baseDir]:
            # Get the content of the directory as anyway this is needed
            for dirChunk in breakListIntoChunks(dirList, chunkSize):
                res = self.catalog.listDirectory(dirChunk, True, timeout=600)
                if not res['OK']:
                    failed.update(dict.fromkeys(dirChunk, res['Message']))
                else:
                    failed.update(res['Value']['Failed'])
                    dirContents.update(res['Value']['Successful'])
            self.log.info('Time to retrieve content of %d directories: %.1f seconds' %
                          (nbDirs, time.time() - startTime))
            for dirPath in failed:
                dirList.remove(dirPath)
        # We don't need to get the storage usage if there are no files...
        dirListSize = [d for d in dirList if dirContents.get(d, {}).get('Files')]

        startTime1 = time.time()
        # __recalculateUsage enables to recompute the directory usage in case the internal table is wrong
        for args in [(d, True, self.__recalculateUsage)
                     for d in breakListIntoChunks(dirListSize, chunkSize)]:
            res = self.catalog.getDirectorySize(*args, timeout=600)
            if not res['OK']:
                failed.update(dict.fromkeys(args[0], res['Message']))
            else:
                failed.update(res['Value']['Failed'])
                successfull.update(res['Value']['Successful'])
        errorReason = {}
        for dirPath in failed:
            error = str(failed[dirPath])
            errorReason.setdefault(error, []).append(dirPath)
        for error in errorReason:
            self.log.error('Failed to get directory info',
                           '- %s for:\n\t%s' % (error, '\n\t'.join(errorReason[error])))
        self.log.info('Time to retrieve size of %d directories: %.1f seconds' %
                      (len(dirListSize), time.time() - startTime1))
        for dirPath in [d for d in dirList if d not in failed]:
            metadata = successfull.get(dirPath, {})
            if 'SubDirs' in metadata:
                self.__processDir(dirPath, metadata)
            else:
                if not self.__catalogType:
                    self.log.info('Catalog type determined to be DFC')
                    self.__catalogType = 'DFC'
                self.__processDirDFC(dirPath, metadata, dirContents[dirPath])
        self.log.info('Time to process %d directories: %.1f seconds' % (nbDirs, time.time() - startTime))
        notCommited = len(self.__publishDirQueue) + len(self.__dirsToPublish)
        self.log.notice("%d dirs to be explored, %d done. %d not yet committed." %
                        (self.__dirExplorer.getNumRemainingDirs(), self.__processedDirs, notCommited))

    def __processDirDFC(self, dirPath, metadata, subDirectories):
        ''' get the list of subdirs that the DFC doesn't return, set the metadata
            like the FC does, and then call the same method as for the FC '''
        if 'SubDirs' not in subDirectories:
            self.log.error('No subdirectory item for directory', dirPath)
            return
        dirMetadata = {'Files': 0, 'TotalSize': 0, 'ClosedDirs': [], 'SiteUsage': {}}
        if 'PhysicalSize' in metadata:
            dirMetadata['Files'] = metadata['LogicalFiles']
            dirMetadata['TotalSize'] = metadata['LogicalSize']
            dirMetadata['SiteUsage'] = metadata['PhysicalSize'].copy()
            dirMetadata['SiteUsage'].pop('TotalFiles', None)
            dirMetadata['SiteUsage'].pop('TotalSize', None)
        subDirs = subDirectories['SubDirs'].copy()
        dirMetadata['SubDirs'] = subDirs
        dirUsage = dirMetadata['SiteUsage']
        errorReason = {}
        for subDir in subDirs:
            self.__directoryOwners.setdefault(subDir,
                                              (subDirs[subDir]['Owner'], subDirs[subDir]['OwnerGroup']))
            subDirs[subDir] = subDirs[subDir].get('CreationTime', dateTime())
            if dirUsage:
                # This part here is for removing the recursivity introduced by the DFC
                args = [subDir]
                if len(subDir.split('/')) > self.__keepDirLevels:
                    args += [True, self.__recalculateUsage]
                result = self.catalog.getDirectorySize(*args)
                if not result['OK']:
                    errorReason.setdefault(str(result['Message']), []).append(subDir)
                else:
                    metadata = result['Value']['Successful'].get(subDir)
                    if metadata:
                        dirMetadata['Files'] -= metadata['LogicalFiles']
                        dirMetadata['TotalSize'] -= metadata['LogicalSize']
                        if 'PhysicalSize' in metadata and dirUsage:
                            seUsage = metadata['PhysicalSize']
                            seUsage.pop('TotalFiles', None)
                            seUsage.pop('TotalSize', None)
                            for se in seUsage:
                                if se not in dirUsage:
                                    self.log.error('SE used in subdir but not in dir', se)
                                else:
                                    dirUsage[se]['Files'] -= seUsage[se]['Files']
                                    dirUsage[se]['Size'] -= seUsage[se]['Size']
                    else:
                        errorReason.setdefault(str(result['Value']['Failed'][subDir]),
                                               []).append(subDir)
        for error in errorReason:
            self.log.error('Failed to get directory info',
                           '- %s for:\n\t%s' % (error, '\n\t'.join(errorReason[error])))
        for se, usage in dirUsage.items():
            # Both values should be 0 or non-0 together
            if not usage['Files'] and not usage['Size']:
                dirUsage.pop(se)
            elif not usage['Files'] * usage['Size']:
                self.log.error('Directory inconsistent',
                               '%s @ %s: %s' % (dirPath, se, str(usage)))
        return self.__processDir(dirPath, dirMetadata)

    def __processDir(self, dirPath, dirMetadata):
        ''' calculate nb of files and size of :dirPath:, remove it if it's empty '''
        subDirs = dirMetadata['SubDirs']
        closedDirs = dirMetadata['ClosedDirs']
        ##############################
        # FIXME: Until we understand why closed dirs are not working...
        ##############################
        closedDirs = []
        prStr = "%s: found %s sub-directories" % (dirPath, len(subDirs) if subDirs else 'no')
        if closedDirs:
            prStr += ", %s are closed (ignored)" % len(closedDirs)
        for rmDir in closedDirs + self.__ignoreDirsList:
            subDirs.pop(rmDir, None)
        numberOfFiles = long(dirMetadata['Files'])
        totalSize = long(dirMetadata['TotalSize'])
        if numberOfFiles:
            prStr += " and %s files (%s bytes)" % (numberOfFiles, totalSize)
        else:
            prStr += " and no files"
        self.log.notice(prStr)
        if closedDirs:
            self.log.verbose("Closed dirs:\n %s" % '\n'.join(closedDirs))
        siteUsage = dirMetadata['SiteUsage']
        if numberOfFiles > 0:
            dirData = {'Files': numberOfFiles, 'TotalSize': totalSize, 'SEUsage': siteUsage}
            self.__addDirToPublishQueue(dirPath, dirData)
            # Print statistics
            self.log.verbose("%-40s %20s %20s" %
                             ('Storage Element', 'Number of files', 'Total size'))
            for storageElement in sorted(siteUsage):
                usageDict = siteUsage[storageElement]
                self.log.verbose("%-40s %20s %20s" %
                                 (storageElement, str(usageDict['Files']), str(usageDict['Size'])))
        # If it's empty delete it
        elif len(subDirs) == 0 and len(closedDirs) == 0:
            if dirPath != self.__baseDir:
                self.removeEmptyDir(dirPath)
                return
        # We don't need the cached information about owner
        self.__directoryOwners.pop(dirPath, None)
        rightNow = dateTime()
        chosenDirs = [subDir for subDir in subDirs
                      if not self.activePeriod or
                      timeInterval(subDirs[subDir], self.activePeriod * week).includes(rightNow)]
        self.__dirExplorer.addDirList(chosenDirs)
        self.__processedDirs += 1

    def __getOwnerProxy(self, dirPath):
        ''' get owner creds for :dirPath: '''
        self.log.verbose("Retrieving dir metadata...")
        # get owner from the cached information, if not, try getDirectoryMetadata
        ownerName, ownerGroup = self.__directoryOwners.pop(dirPath, (None, None))
        if not ownerName or not ownerGroup:
            result = returnSingleResult(self.catalog.getDirectoryMetadata(dirPath))
            if not result['OK'] or 'OwnerRole' not in result['Value']:
                self.log.error("Could not get metadata info", result['Message'])
                return result
            ownerRole = result['Value']['OwnerRole']
            ownerDN = result['Value']['OwnerDN']
            if ownerRole[0] != "/":
                ownerRole = "/%s" % ownerRole
            cacheKey = (ownerDN, ownerRole)
            ownerName = 'unknown'
            byGroup = False
        else:
            ownerDN = Registry.getDNForUsername(ownerName)
            if not ownerDN['OK']:
                self.log.error("Could not get DN from user name", ownerDN['Message'])
                return ownerDN
            ownerDN = ownerDN['Value'][0]
            # This method returns directly a string, not an S_OK/S_ERROR structure!
            ownerRole = Registry.getVOMSAttributeForGroup(ownerGroup)
            byGroup = True
        # Get all groups for that VOMS Role, and add lhcb_user as in DFC this is a safe value
        ownerGroups = Registry.getGroupsWithVOMSAttribute(ownerRole) + ['lhcb_user']

        downErrors = []
        for ownerGroup in ownerGroups:
            if byGroup:
                ownerRole = None
                cacheKey = (ownerDN, ownerGroup)
            if cacheKey in self.__noProxy:
                return S_ERROR("Proxy not available")
            # Getting the proxy...
            upFile = self.proxyCache.get(cacheKey, 3600)
            if upFile and os.path.exists(upFile):
                self.log.verbose('Returning cached proxy for %s %s@%s [%s] in %s' %
                                 (ownerName, ownerDN, ownerGroup, ownerRole, upFile))
                return S_OK(upFile)
            if ownerRole:
                result = gProxyManager.downloadVOMSProxy(ownerDN, ownerGroup, limited=False,
                                                         requiredVOMSAttribute=ownerRole)
            else:
                result = gProxyManager.downloadProxy(ownerDN, ownerGroup, limited=False)
            if not result['OK']:
                downErrors.append("%s : %s" % (cacheKey, result['Message']))
                continue
            userProxy = result['Value']
            secsLeft = max(0, userProxy.getRemainingSecs()['Value'])
            upFile = userProxy.dumpAllToFile()
            if upFile['OK']:
                upFile = upFile['Value']
            else:
                return upFile
            self.proxyCache.add(cacheKey, secsLeft, upFile)
            self.log.info("Got proxy for %s %s@%s [%s]" %
                          (ownerName, ownerDN, ownerGroup, ownerRole))
            return S_OK(upFile)
        self.__noProxy.add(cacheKey)
        return S_ERROR("Could not download proxy for user (%s, %s):\n%s " %
                       (ownerDN, ownerRole, "\n ".join(downErrors)))

    def removeEmptyDir(self, dirPath):
        self.log.notice("Deleting empty directory %s" % dirPath)
        for useOwnerProxy in (False, True):
            result = self.__removeEmptyDir(dirPath, useOwnerProxy=useOwnerProxy)
            if result['OK']:
                self.log.info("Successfully removed empty directory from File Catalog and StorageUsageDB")
                break
        return result

    def __removeEmptyDir(self, dirPath, useOwnerProxy=True):
        ''' unlink empty folder :dirPath: '''
        from DIRAC.ConfigurationSystem.Client.ConfigurationData import gConfigurationData

        if len(List.fromChar(dirPath, "/")) < self.__keepDirLevels:
            return S_OK()

        if useOwnerProxy:
            result = self.__getOwnerProxy(dirPath)
            if not result['OK']:
                if 'Proxy not available' not in result['Message']:
                    self.log.error(result['Message'])
                return result
            upFile = result['Value']
            prevProxyEnv = os.environ['X509_USER_PROXY']
            os.environ['X509_USER_PROXY'] = upFile
        try:
            gConfigurationData.setOptionInCFG('/DIRAC/Security/UseServerCertificate', 'false')
            # res = self.catalog.removeDirectory( dirPath )
            res = self.catalog.writeCatalogs[0][1].removeDirectory(dirPath)
            if not res['OK']:
                self.log.error("Error removing empty directory from File Catalog.", res['Message'])
                return res
            elif dirPath in res['Value']['Failed']:
                self.log.error("Failed to remove empty directory from File Catalog.",
                               res['Value']['Failed'][dirPath])
                self.log.debug(str(res))
                return S_ERROR(res['Value']['Failed'][dirPath])
            res = self.storageUsage.removeDirectory(dirPath)
            if not res['OK']:
                self.log.error("Failed to remove empty directory from Storage Usage database.",
                               res['Message'])
                return res
            return S_OK()
        finally:
            gConfigurationData.setOptionInCFG('/DIRAC/Security/UseServerCertificate', 'true')
            if useOwnerProxy:
                os.environ['X509_USER_PROXY'] = prevProxyEnv

    def __addDirToPublishQueue(self, dirName, dirData):
        ''' enqueue :dirName: and :dirData: for publishing '''
        self.__publishDirQueue[dirName] = dirData
        numDirsToPublish = len(self.__publishDirQueue)
        if numDirsToPublish and numDirsToPublish % self.am_getOption("PublishClusterSize", 100) == 0:
            self.__publishData(background=True)

    def __publishData(self, background=True):
        ''' publish data in a separate daemon thread '''
        self.dataLock.acquire()
        try:
            # Dump to file
            if self.am_getOption("DumpReplicasToFile", False):
                pass
                # repThread = threading.Thread( target = self.__writeReplicasListFiles,
                #                               args = ( list( self.__publishDirQueue ), ) )
            self.__dirsToPublish.update(self.__publishDirQueue)
            self.__publishDirQueue = {}
        finally:
            self.dataLock.release()
        if background:
            pubThread = threading.Thread(target=self.__executePublishData)
            pubThread.setDaemon(1)
            pubThread.start()
        else:
            self.__executePublishData()

    def __executePublishData(self):
        ''' publication thread target '''
        self.dataLock.acquire()
        try:
            if not self.__dirsToPublish:
                self.log.info("No data to be published")
                return
            if len(self.__dirsToPublish) > self.__maxToPublish:
                toPublish = {}
                for dirName in sorted(self.__dirsToPublish)[:self.__maxToPublish]:
                    toPublish[dirName] = self.__dirsToPublish.pop(dirName)
            else:
                toPublish = self.__dirsToPublish
            self.log.info("Publishing usage for %d directories" % len(toPublish))
            res = self.storageUsage.publishDirectories(toPublish)
            if res['OK']:
                # All is OK, reset the dictionary, even if data member!
                toPublish.clear()
            else:
                # Put back dirs to be published, due to the error
                self.__dirsToPublish.update(toPublish)
                self.log.error("Failed to publish directories", res['Message'])
            return res
        finally:
            self.dataLock.release()
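# Standalone sketch of the chunked-catalogue-query pattern that __exploreDirList
# relies on. This is an illustrative re-implementation of the semantics of DIRAC's
# breakListIntoChunks (DIRAC.Core.Utilities.List), not the DIRAC code itself: queries
# are issued per chunk so one bad directory only fails its own chunk, and partial
# Successful/Failed results are accumulated across chunks.
def breakListIntoChunksSketch(items, chunkSize):
    """Yield successive chunkSize-sized slices of items."""
    for i in range(0, len(items), chunkSize):
        yield items[i:i + chunkSize]


def queryInChunks(queryFunc, paths, chunkSize=10):
    """Accumulate Successful/Failed across per-chunk queries, DIRAC-style.

    queryFunc takes a list of paths and returns the usual
    {'OK': ..., 'Value': {'Successful': {...}, 'Failed': {...}}} structure.
    """
    successful, failed = {}, {}
    for chunk in breakListIntoChunksSketch(paths, chunkSize):
        res = queryFunc(chunk)
        if not res['OK']:
            # Whole chunk failed: blame every path in it, keep going
            failed.update(dict.fromkeys(chunk, res['Message']))
        else:
            failed.update(res['Value']['Failed'])
            successful.update(res['Value']['Successful'])
    return successful, failed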
class DIRACBackend(GridBackend):
    """Grid backend using the GFAL command line tools `gfal-*` and the DIRAC API."""

    def __init__(self, **kwargs):
        GridBackend.__init__(self, catalogue_prefix='', **kwargs)

        from DIRAC.Core.Base import Script
        Script.initialize()
        from DIRAC.FrameworkSystem.Client.ProxyManagerClient import ProxyManagerClient
        self.pm = ProxyManagerClient()

        proxy = self.pm.getUserProxiesInfo()
        if not proxy['OK']:
            raise BackendException("Proxy error.")

        from DIRAC.Interfaces.API.Dirac import Dirac
        self.dirac = Dirac()

        from DIRAC.Resources.Catalog.FileCatalog import FileCatalog
        self.fc = FileCatalog()
        from DIRAC.DataManagementSystem.Client.DataManager import DataManager
        self.dm = DataManager()

        self._xattr_cmd = sh.Command('gfal-xattr').bake(_tty_out=False)
        self._replica_checksum_cmd = sh.Command('gfal-sum').bake(_tty_out=False)
        self._bringonline_cmd = sh.Command('gfal-legacy-bringonline').bake(_tty_out=False)
        self._cp_cmd = sh.Command('gfal-copy').bake(_tty_out=False)
        self._ls_se_cmd = sh.Command('gfal-ls').bake(color='never', _tty_out=False)
        self._move_cmd = sh.Command('gfal-rename').bake(_tty_out=False)
        self._mkdir_cmd = sh.Command('gfal-mkdir').bake(_tty_out=False)

        self._replicate_cmd = sh.Command('dirac-dms-replicate-lfn').bake(_tty_out=False)
        self._add_cmd = sh.Command('dirac-dms-add-file').bake(_tty_out=False)

    @staticmethod
    def _check_return_value(ret):
        if not ret['OK']:
            raise BackendException("Failed: %s" % ret['Message'])
        for path, error in ret['Value']['Failed'].items():
            if ('No such' in error) or ('Directory does not' in error):
                raise DoesNotExistException("No such file or directory.")
            else:
                raise BackendException(error)

    def _is_dir(self, lurl):
        isdir = self.fc.isDirectory(lurl)
        self._check_return_value(isdir)
        return isdir['Value']['Successful'][lurl]

    def _is_file(self, lurl):
        isfile = self.fc.isFile(lurl)
        self._check_return_value(isfile)
        return isfile['Value']['Successful'][lurl]

    def _get_dir_entry(self, lurl, infodict=None):
        """Take a lurl and return a DirEntry."""
        # If no dictionary with the information is specified, get it from the catalogue
        try:
            md = infodict['MetaData']
        except TypeError:
            md = self.fc.getFileMetadata(lurl)
            if not md['OK']:
                raise BackendException("Failed to list path '%s': %s" % (lurl, md['Message']))
            for path, error in md['Value']['Failed'].items():
                if 'No such file' in error:
                    # File does not exist, maybe a directory?
                    md = self.fc.getDirectoryMetadata(lurl)
                    for path, error in md['Value']['Failed'].items():
                        raise DoesNotExistException("No such file or directory.")
                else:
                    raise BackendException(md['Value']['Failed'][lurl])
            md = md['Value']['Successful'][lurl]
        return DirEntry(posixpath.basename(lurl),
                        mode=oct(md.get('Mode', -1)),
                        links=md.get('links', -1),
                        gid=md['OwnerGroup'],
                        uid=md['Owner'],
                        size=md.get('Size', -1),
                        modified=str(md.get('ModificationDate', '?')))

    def _iter_directory(self, lurl):
        """Iterate over entries in a directory."""
        ret = self.fc.listDirectory(lurl)
        if not ret['OK']:
            raise BackendException("Failed to list path '%s': %s" % (lurl, ret['Message']))
        for path, error in ret['Value']['Failed'].items():
            if 'Directory does not' in error:
                # Dir does not exist, maybe a file?
                if self._is_file(lurl):
                    lst = [(lurl, None)]
                    break
                else:
                    raise DoesNotExistException("No such file or directory.")
            else:
                raise BackendException(ret['Value']['Failed'][lurl])
        else:
            # Sort items by keys, i.e. paths
            lst = sorted(ret['Value']['Successful'][lurl]['Files'].items() +
                         ret['Value']['Successful'][lurl]['SubDirs'].items())
        for item in lst:
            yield item  # = path, dict

    def _ls(self, lurl, **kwargs):
        # Translate keyword arguments
        d = kwargs.pop('directory', False)
        if d:
            # Just the requested entry itself
            yield self._get_dir_entry(lurl)
            return
        for path, info in self._iter_directory(lurl):
            yield self._get_dir_entry(path, info)

    def _ls_se(self, surl, **kwargs):
        # Translate keyword arguments
        d = kwargs.pop('directory', False)
        args = []
        if d:
            args.append('-d')
        args.append('-l')
        args.append(surl)
        try:
            output = self._ls_se_cmd(*args, **kwargs)
        except sh.ErrorReturnCode as e:
            if 'No such file' in e.stderr:
                raise DoesNotExistException("No such file or directory.")
            else:
                raise BackendException(e.stderr)
        for line in output:
            fields = line.split()
            mode, links, gid, uid, size = fields[:5]
            name = fields[-1]
            modified = ' '.join(fields[5:-1])
            yield DirEntry(name, mode=mode, links=int(links), gid=gid, uid=uid,
                           size=int(size), modified=modified)

    def _replicas(self, lurl, **kwargs):
        # Check the lurl actually exists (_ls is a generator, so consume one entry)
        next(self._ls(lurl, directory=True))

        rep = self.dirac.getReplicas(lurl)
        self._check_return_value(rep)
        rep = rep['Value']['Successful'][lurl]

        return rep.values()

    def _exists(self, surl, **kwargs):
        try:
            ret = self._ls_se_cmd(surl, '-d', '-l', **kwargs).strip()
        except sh.ErrorReturnCode as e:
            if 'No such file' in e.stderr:
                return False
            else:
                if len(e.stderr) == 0:
                    raise BackendException(e.stdout)
                else:
                    raise BackendException(e.stderr)
        else:
            return ret[0] != 'd'  # Return `False` for directories

    def _register(self, surl, lurl, verbose=False, **kwargs):
        # Register an existing physical copy in the file catalogue
        se = storage.get_SE(surl).name
        # See if the file already exists in the DFC
        ret = self.fc.getFileMetadata(lurl)
        try:
            self._check_return_value(ret)
        except DoesNotExistException:
            # Add new file
            size = next(self._ls_se(surl, directory=True)).size
            checksum = self.checksum(surl)
            # The guid does not seem to be important. Make it unique if possible.
            guid = str(uuid.uuid4())
            ret = self.dm.registerFile((lurl, surl, size, se, guid, checksum))
        else:
            # Add new replica
            ret = self.dm.registerReplica((lurl, surl, se))
        self._check_return_value(ret)
        if verbose:
            print_("Successfully registered replica %s of %s from %s." % (surl, lurl, se))
        return True

    def _deregister(self, surl, lurl, verbose=False, **kwargs):
        # DIRAC only needs to know the SE name to deregister a replica
        se = storage.get_SE(surl).name
        ret = self.dm.removeReplicaFromCatalog(se, [lurl])
        self._check_return_value(ret)
        if verbose:
            print_("Successfully deregistered replica of %s from %s." % (lurl, se))
        return True

    def _state(self, surl, **kwargs):
        try:
            state = self._xattr_cmd(surl, 'user.status', **kwargs).strip()
        except sh.ErrorReturnCode as e:
            if "No such file" in e.stderr:
                raise DoesNotExistException("No such file or directory.")
            state = '?'
        except sh.SignalException_SIGSEGV:
            state = '?'
        return state

    def _checksum(self, surl, **kwargs):
        try:
            checksum = self._replica_checksum_cmd(surl, 'ADLER32', **kwargs).split()[1]
        except sh.ErrorReturnCode:
            checksum = '?'
        except sh.SignalException_SIGSEGV:
            checksum = '?'
        except IndexError:
            checksum = '?'
        return checksum

    def _bringonline(self, surl, timeout, verbose=False, **kwargs):
        if verbose:
            out = sys.stdout
        else:
            out = None
        # gfal does not notice when files come online, it seems
        # Just send a single short request, then check regularly
        end = time.time() + timeout
        try:
            self._bringonline_cmd('-t', 10, surl, _out=out, **kwargs)
        except sh.ErrorReturnCode as e:
            # The command fails if the file is not online
            # To be expected after 10 seconds
            if "No such file" in e.stderr:
                # Except when the file does not actually exist on the tape storage
                raise DoesNotExistException("No such file or directory.")

        wait = 5
        while True:
            if verbose:
                print_("Checking replica state...")
            if self.is_online(surl):
                if verbose:
                    print_("Replica brought online.")
                return True

            time_left = end - time.time()
            if time_left <= 0:
                if verbose:
                    print_("Could not bring replica online.")
                return False

            wait *= 2
            if time_left < wait:
                wait = time_left

            if verbose:
                print_("Timeout remaining: %d s" % (time_left))
                print_("Checking again in: %d s" % (wait))
            time.sleep(wait)

    def _replicate(self, source_surl, destination_surl, lurl, verbose=False, **kwargs):
        if verbose:
            out = sys.stdout
        else:
            out = None
        source = storage.get_SE(source_surl).name
        destination = storage.get_SE(destination_surl).name
        try:
            self._replicate_cmd(lurl, destination, source, _out=out, **kwargs)
        except sh.ErrorReturnCode as e:
            if 'No such file' in e.stderr:
                raise DoesNotExistException("No such file or directory.")
            else:
                if len(e.stderr) == 0:
                    raise BackendException(e.stdout)
                else:
                    raise BackendException(e.stderr)
        return True

    def _get(self, surl, localpath, verbose=False, **kwargs):
        if verbose:
            out = sys.stdout
        else:
            out = None
        try:
            self._cp_cmd('-f', '--checksum', 'ADLER32', surl, localpath, _out=out, **kwargs)
        except sh.ErrorReturnCode as e:
            if 'No such file' in e.stderr:
                raise DoesNotExistException("No such file or directory.")
            else:
                if len(e.stderr) == 0:
                    raise BackendException(e.stdout)
                else:
                    raise BackendException(e.stderr)
        return os.path.isfile(localpath)

    def _put(self, localpath, surl, lurl, verbose=False, **kwargs):
        if verbose:
            out = sys.stdout
        else:
            out = None
        se = storage.get_SE(surl).name
        try:
            self._add_cmd(lurl, localpath, se, _out=out, **kwargs)
        except sh.ErrorReturnCode as e:
            if 'No such file' in e.stderr:
                raise DoesNotExistException("No such file or directory.")
            else:
                if len(e.stderr) == 0:
                    raise BackendException(e.stdout)
                else:
                    raise BackendException(e.stderr)
        return True

    def _remove(self, surl, lurl, last=False, verbose=False, **kwargs):
        se = storage.get_SE(surl).name
        if last:
            # Delete lfn
            if verbose:
                print_("Removing all replicas of %s." % (lurl,))
            ret = self.dm.removeFile([lurl])
        else:
            if verbose:
                print_("Removing replica of %s from %s." % (lurl, se))
            ret = self.dm.removeReplica(se, [lurl])
        if not ret['OK']:
            raise BackendException('Failed: %s' % (ret['Message']))
        for lfn, error in ret['Value']['Failed'].items():
            if 'No such file' in error:
                raise DoesNotExistException("No such file or directory.")
            else:
                raise BackendException(error)
        return True

    def _rmdir(self, lurl, verbose=False):
        """Remove an empty directory from the catalogue."""
        rep = self.fc.removeDirectory(lurl)
        self._check_return_value(rep)
        return True

    def _move_replica(self, surl, new_surl, verbose=False, **kwargs):
        if verbose:
            out = sys.stdout
        else:
            out = None
        try:
            folder = posixpath.dirname(new_surl)
            self._mkdir_cmd(folder, '-p', _out=out, **kwargs)
            self._move_cmd(surl, new_surl, _out=out, **kwargs)
        except sh.ErrorReturnCode as e:
            if 'No such file' in e.stderr:
                raise DoesNotExistException("No such file or directory.")
            else:
                if len(e.stderr) == 0:
                    raise BackendException(e.stdout)
                else:
                    raise BackendException(e.stderr)
        return True
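# Standalone sketch of the poll-with-doubling-backoff loop used by _bringonline above:
# check a predicate, double the wait between checks, and clamp the final wait to the
# time remaining. `is_online` here is a stand-in predicate supplied by the caller,
# not a real gfal or DIRAC call.
import time


def wait_until_online(is_online, timeout, first_wait=5):
    """Poll is_online() until it returns True or `timeout` seconds have passed."""
    end = time.time() + timeout
    wait = first_wait
    while True:
        if is_online():
            return True
        time_left = end - time.time()
        if time_left <= 0:
            return False
        # Back off exponentially, but never sleep past the deadline
        wait = min(wait * 2, time_left)
        time.sleep(wait)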