def getGoodInputMics(self):
    inpMics = self._getInputMicrographs()
    goodMics = SetOfMicrographs()
    goodMicFns = self._getGoodMicFns('')
    for inpMic in inpMics:
        if inpMic.getFileName() in goodMicFns:
            goodMics.append(inpMic)
    return goodMics

def createOutputStep(self):
    input = self.input.get()
    output = SetOfMicrographs.create(self._getPath())
    # output.copyInfo(input)
    output.setSamplingRate(input.getSamplingRate())

    # For each tomogram
    for tomo in input:
        self.appendMicsFromTomogram(output, tomo)

    self._defineOutputs(
        **{ProtTomoToMicsOutput.outputMicrographs.name: output})
    self._defineSourceRelation(self.input, output)

def test_micrographsToMd(self):
    """ Test the conversion of a SetOfMicrographs to Xmipp metadata. """
    micSet = SetOfMicrographs(
        filename=self.getOutputPath("micrographs.sqlite"))
    n = 3
    ctfs = [
        CTFModel(defocusU=10000, defocusV=15000, defocusAngle=15),
        CTFModel(defocusU=20000, defocusV=25000, defocusAngle=25)
    ]
    acquisition = Acquisition(magnification=60000, voltage=300,
                              sphericalAberration=2.,
                              amplitudeContrast=0.07)
    micSet.setAcquisition(acquisition)
    micSet.setSamplingRate(1.)
    mdXmipp = emlib.MetaData()

    for i in range(n):
        p = Micrograph()
        file = self.dataset.getFile("mic%s" % (i + 1))
        p.setLocation(file)
        ctf = ctfs[i % 2]
        p.setCTF(ctf)
        micSet.append(p)
        id = mdXmipp.addObject()
        mdXmipp.setValue(emlib.MDL_ENABLED, 1, id)
        mdXmipp.setValue(emlib.MDL_ITEM_ID, int(i + 1), id)
        mdXmipp.setValue(emlib.MDL_MICROGRAPH, file, id)
        # set CTFModel params
        mdXmipp.setValue(emlib.MDL_CTF_DEFOCUSU, ctf.getDefocusU(), id)
        mdXmipp.setValue(emlib.MDL_CTF_DEFOCUSV, ctf.getDefocusV(), id)
        mdXmipp.setValue(emlib.MDL_CTF_DEFOCUS_ANGLE,
                         ctf.getDefocusAngle(), id)
        # set Acquisition params
        mdXmipp.setValue(emlib.MDL_CTF_Q0,
                         acquisition.getAmplitudeContrast(), id)
        mdXmipp.setValue(emlib.MDL_CTF_CS,
                         acquisition.getSphericalAberration(), id)
        mdXmipp.setValue(emlib.MDL_CTF_VOLTAGE,
                         acquisition.getVoltage(), id)

    mdScipion = emlib.MetaData()
    setOfMicrographsToMd(micSet, mdScipion)
    writeSetOfMicrographs(micSet, self.getOutputPath("micrographs.xmd"))
    self.assertEqual(mdScipion, mdXmipp, "metadata are not the same")

def _checkNewInput(self):
    # Check if there are new micrographs to process from the input set
    micsFile = self.inputMicrographs.get().getFileName()
    micsSet = SetOfMicrographs(filename=micsFile)
    micsSet.loadAllProperties()
    # Keep clones so the items remain usable after the set is closed
    self.inputMics = [m.clone() for m in micsSet]
    self.streamClosed = micsSet.isStreamClosed()
    micsSet.close()

    newMics = any(m.getObjId() not in self.insertedDict
                  for m in self.inputMics)
    outputStep = self._getFirstJoinStep()

    if newMics:
        fDeps = self._insertNewMicsSteps(self.insertedDict,
                                         self.inputMics)
        if outputStep is not None:
            outputStep.addPrerequisites(*fDeps)
        self.updateSteps()

def test_micrographImport(self):
    """ Import an EMX file with micrographs and defocus """
    emxFn = self.dataset.getFile('emxMicrographCtf1')
    protEmxImport = self.newProtocol(
        ProtImportMicrographs,
        objLabel='emx - import mics',
        importFrom=ProtImportMicrographs.IMPORT_FROM_EMX,
        emxFile=emxFn,
        magnification=10000,
        samplingRate=2.46,
    )
    self.launchProtocol(protEmxImport)

    micFn = self.dataset.getFile('emxMicrographCtf1Gold')
    mics = SetOfMicrographs(filename=micFn)

    # The built-in zip replaces the Python 2 only itertools.izip
    for mic1, mic2 in zip(mics, protEmxImport.outputMicrographs):
        # Strip the directory from the micrograph file names so that
        # the comparison only depends on the remaining attributes
        mic1.setFileName(os.path.basename(mic1.getFileName()))
        mic2.setFileName(os.path.basename(mic2.getFileName()))
        self.assertTrue(mic1.equalAttributes(mic2, verbose=True))

def test_particleImportDefocus(self):
    """ Import an EMX file with a stack of particles that has defocus """
    emxFn = self.dataset.getFile('defocusParticleT2')
    protEmxImport = self.newProtocol(
        ProtImportParticles,
        objLabel='emx - import ctf',
        importFrom=ProtImportParticles.IMPORT_FROM_EMX,
        emxFile=emxFn,
        alignType=3,
        magnification=10000,
        samplingRate=2.8)
    self.launchProtocol(protEmxImport)

    micFn = self.dataset.getFile('micrographsGoldT2')
    mics = SetOfMicrographs(filename=micFn)

    # The built-in zip replaces the Python 2 only itertools.izip
    for mic1, mic2 in zip(mics, protEmxImport.outputMicrographs):
        # Strip the directory from the micrograph file names so that
        # the comparison only depends on the remaining attributes
        mic1.setFileName(os.path.basename(mic1.getFileName()))
        mic2.setFileName(os.path.basename(mic2.getFileName()))
        self.assertTrue(mic1.equalAttributes(mic2, verbose=True))

def _createSetOfMics(self, n=10, nOptics=2):
    micName = 'BPV_13%02d.mrc'
    psdName = 'BPV_13%02d_PSD.ctf:mrc'
    ogName = 'opticsGroup%d'
    mtfFile = 'mtfFile%d.star'

    cleanPath(self.getOutputPath('micrographs.sqlite'))
    micsDb = self.getOutputPath('micrographs.sqlite')
    outputMics = SetOfMicrographs(filename=micsDb)
    outputMics.setSamplingRate(1.234)

    mic = SetOfMicrographs.ITEM_TYPE()
    acq = Acquisition(voltage=300,
                      sphericalAberration=2,
                      amplitudeContrast=0.1,
                      magnification=60000)
    og = OpticsGroups.create(rlnMtfFileName='')
    fog = og.first()
    ctf = CTFModel(defocusU=10000, defocusV=15000, defocusAngle=15)

    outputMics.setAcquisition(acq)
    mic.setAcquisition(acq)
    mic.setCTF(ctf)

    itemsPerOptics = n // nOptics

    for i in range(1, n + 1):
        mic.setFileName(micName % i)
        ctf = mic.getCTF()
        ctf.setPsdFile(psdName % i)
        ctf.setFitQuality(np.random.uniform())
        ctf.setResolution(np.random.uniform(3, 15))

        ogNumber = (i - 1) // itemsPerOptics + 1
        ogDict = {
            'rlnOpticsGroup': ogNumber,
            'rlnOpticsGroupName': ogName % ogNumber,
            'rlnMtfFileName': mtfFile % ogNumber
        }
        if ogNumber in og:
            og.update(ogNumber, **ogDict)
        else:
            og.add(fog._replace(**ogDict))

        mic.rlnOpticsGroup = ogNumber
        mic.setObjId(None)
        outputMics.append(mic)

    print(">>> Writing micrograph set to: ", micsDb)
    outputMics.write()
    return outputMics

def writeCtfStarStep(self):
    pwutils.cleanPath(self._getExportPath())
    pwutils.makePath(self._getExportPath())
    inputCTF = self.inputCTF.get()

    if self.micrographSource == 0:  # same as CTF estimation
        ctfMicSet = inputCTF.getMicrographs()
    else:
        ctfMicSet = self.inputMicrographs.get()

    micSet = SetOfMicrographs(filename=':memory:')

    psd = inputCTF.getFirstItem().getPsdFile()
    hasPsd = psd and os.path.exists(psd)

    if hasPsd:
        psdPath = self._getExportPath('PSD')
        pwutils.makePath(psdPath)
        print("Writing PSD files to %s" % psdPath)

    for ctf in inputCTF:
        # Get the corresponding micrograph
        mic = ctfMicSet[ctf.getObjId()]
        if mic is None:
            print("Skipping CTF id: %s, it is missing from input "
                  "micrographs." % ctf.getObjId())
            continue

        micFn = mic.getFileName()
        if not os.path.exists(micFn):
            print("Skipping micrograph %s, it does not exist." % micFn)
            continue

        mic2 = mic.clone()
        mic2.setCTF(ctf)

        if hasPsd:
            psdFile = ctf.getPsdFile()
            newPsdFile = os.path.join(
                psdPath,
                '%s_psd.mrc' % pwutils.removeExt(mic.getMicName()))
            if not os.path.exists(psdFile):
                print("PSD file %s does not exist" % psdFile)
                print("Skipping micrograph %s" % micFn)
                continue
            pwutils.copyFile(psdFile, newPsdFile)
            # PSD path is relative to Export dir
            newPsdFile = os.path.relpath(newPsdFile,
                                         self._getExportPath())
            ctf.setPsdFile(newPsdFile)
        else:
            # remove pointer to non-existing psd file
            ctf.setPsdFile(None)

        micSet.append(mic2)

    print("Writing set: %s to: %s" % (inputCTF, self._getStarFile()))

    micDir = self._getExportPath('Micrographs')
    pwutils.makePath(micDir)
    starWriter = convert.createWriter(rootDir=self._getExportPath(),
                                      outputDir=micDir,
                                      useBaseName=True)
    starWriter.writeSetOfMicrographs(micSet, self._getStarFile())

def testCtfConsensus1(self):
    # create one micrograph set
    fnMicSet = self.proj.getTmpPath("mics.sqlite")
    fnMic = self.proj.getTmpPath("mic.mrc")
    mic = Micrograph()
    mic.setFileName(fnMic)
    micSet = SetOfMicrographs(filename=fnMicSet)

    # create one CTF set
    fnCTF1 = self.proj.getTmpPath("ctf1.sqlite")
    ctfSet1 = SetOfCTF(filename=fnCTF1)

    # create one fake micrograph image
    projSize = 32
    img = emlib.Image()
    img.setDataType(emlib.DT_FLOAT)
    img.resize(projSize, projSize)
    img.write(fnMic)

    # fill the sets
    for i in range(1, 4):
        mic = Micrograph()
        mic.setFileName(fnMic)
        micSet.append(mic)

        defocusU = 4000 + 10 * i
        defocusV = 4000 + i
        defocusAngle = i * 10
        resolution = 2
        psdFile = "psd_1%04d" % i
        ctf = self._getCTFModel(defocusU, defocusV, defocusAngle,
                                resolution, psdFile)
        ctf.setMicrograph(mic)
        ctfSet1.append(ctf)

    ctfSet1.write()
    micSet.write()

    # import micrograph set
    args = {
        'importFrom': ProtImportMicrographs.IMPORT_FROM_SCIPION,
        'sqliteFile': fnMicSet,
        'amplitudConstrast': 0.1,
        'sphericalAberration': 2.,
        'voltage': 100,
        'samplingRate': 2.1
    }
    protMicImport = self.newProtocol(ProtImportMicrographs, **args)
    protMicImport.setObjLabel('import micrographs from sqlite ')
    self.launchProtocol(protMicImport)

    # import the CTF set
    protCTF1 = self.newProtocol(
        ProtImportCTF,
        importFrom=ProtImportCTF.IMPORT_FROM_SCIPION,
        filesPath=fnCTF1)
    protCTF1.inputMicrographs.set(protMicImport.outputMicrographs)
    protCTF1.setObjLabel('import ctfs from scipion_1 ')
    self.launchProtocol(protCTF1)

    # launch CTF consensus protocol
    protCtfConsensus = self.newProtocol(XmippProtCTFConsensus)
    protCtfConsensus.inputCTF.set(protCTF1.outputCTF)
    protCtfConsensus.setObjLabel('ctf consensus')
    self.launchProtocol(protCtfConsensus)
    self.checkOutputSize(protCtfConsensus)

    # The first CTF (i = 1) was created with defocusU = 4000 + 10 * 1
    ctf0 = protCtfConsensus.outputCTF.getFirstItem()
    resolution = int(ctf0.getResolution())
    defocusU = int(ctf0.getDefocusU())
    self.assertEqual(resolution, 2)
    self.assertEqual(defocusU, 4010)

def convertInputStep(self, inputCoordinates, scale, kfold):
    """ Convert a set of coordinates to box files, and the micrograph
    binaries to mrc if needed. Two folders are generated: one for the
    box files and another for the mrc files.
    """
    import time

    coordSet = self.inputCoordinates.get()
    setFn = coordSet.getFileName()
    self.debug("Loading input db: %s" % setFn)

    # Load the set of coordinates until it holds the user-defined number
    # of micrographs for the training step
    enoughMicrographs = False
    while True:
        # Start from an empty list on every attempt; otherwise the ids
        # read in a previous attempt would be appended again
        micIds = []
        coordSet = SetOfCoordinates(filename=setFn)
        coordSet._xmippMd = params.String()
        coordSet.loadAllProperties()

        for micAgg in coordSet.aggregate(["MAX"], "_micId", ["_micId"]):
            micIds.append(micAgg["_micId"])
            if len(micIds) == self.micsForTraining.get():
                enoughMicrographs = True
                break

        if enoughMicrographs:
            break
        else:
            if coordSet.isStreamClosed():
                raise Exception("The input stream was closed before "
                                "enough micrographs were available "
                                "for training.")
            self.info("Not yet there: %s" % len(micIds))
            time.sleep(10)

    # Create input folder and pre-processed micrographs folder
    micDir = self._getFileName(TRAINING)
    pw.utils.makePath(micDir)
    prepDir = self._getFileName(TRAININGPREPROCESS)
    pw.utils.makePath(prepDir)

    ih = ImageHandler()

    # Get a refreshed set of micrographs
    micsFn = self.inputCoordinates.get().getMicrographs().getFileName()
    # not updating, refresh problem
    coordMics = SetOfMicrographs(filename=micsFn)
    coordMics.loadAllProperties()

    # Create a 0/1 list to mark micrographs for training (0) / testing (1)
    n = len(micIds)
    indexes = np.zeros(n, dtype='int8')
    testSetImages = int((kfold / float(100)) * n)

    # Both the training and the test data set should contain at least
    # one micrograph
    if testSetImages < 1:
        requiredMinimumPercentage = (1 * 100 / n) + 1
        testSetImages = int((requiredMinimumPercentage / float(100)) * n)
    elif testSetImages == n:
        testSetImages = int(0.99 * n)

    indexes[:testSetImages] = 1
    np.random.shuffle(indexes)
    self.info('indexes: %s' % indexes)

    # Write micrographs files
    csvMics = [
        CsvMicrographList(self._getFileName(TRAININGLIST), 'w'),
        CsvMicrographList(self._getFileName(TRAININGTEST), 'w')
    ]

    # Store the micId and indexes in micDict
    micDict = {}
    for i, micId in zip(indexes, micIds):
        mic = coordMics[micId]
        micFn = mic.getFileName()
        baseFn = pw.utils.removeBaseExt(micFn)
        inputFn = self._getFileName(TRAINING_MIC, **{"mic": baseFn})

        if micFn.endswith('.mrc'):
            pwutils.createAbsLink(os.path.abspath(micFn), inputFn)
        else:
            ih.convert(micFn, inputFn)

        prepMicFn = self._getFileName(TRAININGPRE_MIC, **{"mic": baseFn})
        csvMics[i].addMic(micId, prepMicFn)
        micDict[micId] = i  # store if train or test

    for csv in csvMics:
        csv.close()

    # Write particles files
    csvParts = [
        CsvCoordinateList(self._getFileName(PARTICLES_TRAIN_TXT), 'w'),
        CsvCoordinateList(self._getFileName(PARTICLES_TEST_TXT), 'w')
    ]

    for coord in coordSet.iterItems(orderBy='_micId'):
        micId = coord.getMicId()
        if micId in micDict:
            x = int(round(float(coord.getX()) / scale))
            y = int(round(float(coord.getY()) / scale))
            csvParts[micDict[micId]].addCoord(micId, x, y)

    for csv in csvParts:
        csv.close()

class ProtTomoToMicsOutput(enum.Enum):
    outputMicrographs = SetOfMicrographs()