def _run(self, scanObject, result, depth, args):
    moduleResult = []
    try:
        # Load the OLE file into olevba
        vbaparser = olevba.VBA_Parser(scanObject.objectHash,
                                      data=scanObject.buffer)
        if vbaparser.detect_vba_macros():  # VBA macro found
            # Loop over each extracted VBA macro
            for (filename, stream_path, vba_filename,
                 vba_code) in vbaparser.extract_macros():
                macrofilesdict = {
                    'Type': vbaparser.type,
                    'VBA_project': vbaparser.vba_projects,
                    'OLE_stream': stream_path,
                    'VBA_filename': vba_filename
                }
                scanObject.addMetadata(self.module_name,
                                       "Parsed_Macros_Metadata",
                                       macrofilesdict)
                # Exploded file name contains the source hash
                explodevbafilename = 'e_vba_%s_%s' % (
                    scanObject.objectHash, vba_filename)
                moduleResult.append(
                    ModuleObject(buffer=vba_code,
                                 externalVars=ExternalVars(
                                     filename=explodevbafilename)))

            # Loop to parse VBA forms
            combinedstring = ""
            # Use a set because stream_path may repeat
            formfilesdlist = set()
            for (filename, stream_path,
                 form_string) in vbaparser.extract_form_strings():
                formfilesdlist.add(stream_path)
                # Combine all form text found into a single variable
                combinedstring += " %s" % form_string
            if combinedstring:  # form text found
                scanObject.addMetadata(self.module_name,
                                       "VBA_Forms_Found_Streams",
                                       formfilesdlist)
                explodeformsfilename = 'e_vba_%s_combined_forms.txt' % (
                    scanObject.objectHash)
                moduleResult.append(
                    ModuleObject(buffer=combinedstring,
                                 externalVars=ExternalVars(
                                     filename=explodeformsfilename)))
        vbaparser.close()
    except olevba.OlevbaBaseException as e:
        # Exceptions from the olevba import will raise
        olevbaerror = 'e_vba:err:%s' % e
        #scanObject.addFlag(olevbaerror)
        log_module("MSG", self.module_name, 0, scanObject, result,
                   olevbaerror)
    except (QuitScanException, GlobalScanTimeoutError,
            GlobalModuleTimeoutError):
        raise
    return moduleResult
def _run(self, scanObject, result, depth, args):
    moduleResult = []
    rtfp = rtfobj.RtfObjParser(scanObject.buffer)
    rtfp.parse()
    for i, rtfobject in enumerate(rtfp.objects):
        if rtfobject.is_package:
            typeolepackagedict = {
                'Type': 'OLEPackage',
                'Index': i,
                'Filename': rtfobject.filename,
                'Source Path': rtfobject.src_path,
                'Temp Path': rtfobject.temp_path
            }
            scanObject.addMetadata(self.module_name,
                                   "Parsed_Objects_Metadata",
                                   typeolepackagedict)
            moduleResult.append(
                ModuleObject(buffer=rtfobject.olepkgdata,
                             externalVars=ExternalVars(
                                 filename='e_rtf_object_%08X.olepackage' %
                                 rtfobject.start)))
        elif rtfobject.is_ole:
            typeoledict = {
                'Type': 'OLE',
                'Index': i,
                'Format_id': rtfobject.format_id,
                'Class_name': rtfobject.class_name,
                'Size': rtfobject.oledata_size
            }
            scanObject.addMetadata(self.module_name,
                                   "Parsed_Objects_Metadata", typeoledict)
            moduleResult.append(
                ModuleObject(buffer=rtfobject.oledata,
                             externalVars=ExternalVars(
                                 filename='e_rtf_object_%08X.ole' %
                                 rtfobject.start)))
        else:
            # Not a well-formed OLE object
            typerawdict = {'Type': 'RAW', 'Index': i}
            scanObject.addMetadata(self.module_name,
                                   "Parsed_Objects_Metadata", typerawdict)
            moduleResult.append(
                ModuleObject(buffer=rtfobject.rawdata,
                             externalVars=ExternalVars(
                                 filename='e_rtf_object_%08X.raw' %
                                 rtfobject.start)))
    return moduleResult
def _run(self, scanObject, result, depth, args):
    moduleResult = []
    e = email.message_from_string(scanObject.buffer)
    attachments = []
    i = 1
    for p in e.walk():
        childBuffer = p.get_payload(decode=True)
        if childBuffer is not None:
            filename = p.get_filename()
            if filename is None:
                filename = 'e_email_%s_%s' % (p.get_content_type(), i)
            else:
                attachments.append(filename)
            logging.debug("explode email: found filename: %s" % filename)
            moduleResult.append(
                ModuleObject(buffer=childBuffer,
                             externalVars=ExternalVars(
                                 filename=filename,
                                 contentType=p.get_content_maintype())))
            i += 1
    # If enabled, combine the email headers and all decoded text portions
    # contained in the email into a single object
    if strtobool(get_option(args, 'createhybrid',
                            'explodeemlcreatehybrid', 'False')):
        # First, grab the headers
        header_end = scanObject.buffer.find('\x0a\x0a')
        hybrid = scanObject.buffer[:header_end] + '\n\n'
        for mo in moduleResult:
            if 'text' in mo.externalVars.contentType:
                hybrid += mo.buffer + '\n\n'
        # Add the hybrid as another object with a special content type
        # for easy identification
        moduleResult.append(
            ModuleObject(buffer=hybrid,
                         externalVars=ExternalVars(
                             filename='e_email_hybrid',
                             contentType='application/x-laika-eml-hybrid')))
    # Since we already gathered the attachment names, add them
    # on behalf of META_EMAIL
    scanObject.addMetadata('META_EMAIL', 'Attachments', attachments)
    return moduleResult
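The traversal above can be exercised standalone with only the stdlib `email` package. This is a hypothetical sketch (the `explode` helper and sample message are ours, not part of the module): walk a MIME message, collect each decodable part, and name unnamed parts by content type and index, mirroring the `e_email_<type>_<i>` convention.

```python
import email

raw = (
    "From: a@example.com\r\n"
    "Subject: demo\r\n"
    'Content-Type: multipart/mixed; boundary="B"\r\n'
    "\r\n"
    "--B\r\n"
    "Content-Type: text/plain\r\n\r\n"
    "hello body\r\n"
    "--B\r\n"
    "Content-Type: application/octet-stream\r\n"
    'Content-Disposition: attachment; filename="a.bin"\r\n'
    "Content-Transfer-Encoding: base64\r\n\r\n"
    "AAEC\r\n"
    "--B--\r\n"
)

def explode(msg_text):
    parts, attachments = [], []
    msg = email.message_from_string(msg_text)
    i = 1
    for p in msg.walk():
        payload = p.get_payload(decode=True)
        if payload is None:   # multipart containers decode to None
            continue
        name = p.get_filename()
        if name is None:
            # Unnamed part: synthesize a name from type and index
            name = 'e_email_%s_%s' % (p.get_content_type(), i)
        else:
            attachments.append(name)
        parts.append((name, payload))
        i += 1
    return parts, attachments

parts, attachments = explode(raw)
```

Note that `walk()` yields the multipart container itself first; `get_payload(decode=True)` returns `None` for containers, which is why the module checks `childBuffer is not None`.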
def _run(self, scanObject, result, depth, args):
    moduleResult = []
    minFileSize = 0  # Explode everything!
    useUnvalidatedFilenames = 0
    if 'minFileSize' in args:
        try:
            minFileSize = int(args['minFileSize'])
        except (QuitScanException, GlobalScanTimeoutError,
                GlobalModuleTimeoutError):
            raise
        except:
            pass
    if 'useUnvalidatedFilenames' in args:
        try:
            useUnvalidatedFilenames = int(args['useUnvalidatedFilenames'])
        except (QuitScanException, GlobalScanTimeoutError,
                GlobalModuleTimeoutError):
            raise
        except:
            pass
    file = StringIO.StringIO()
    file.write(scanObject.buffer)
    file.flush()
    file.seek(0)
    ole = olefile.OleFileIO(file)
    lstStreams = ole.listdir()
    numStreams = 0
    for stream in lstStreams:
        try:
            if ole.get_size(stream) >= minFileSize:
                numStreams += 1
                streamF = ole.openstream(stream)
                childBuffer = streamF.read()
                if childBuffer:
                    filename = "e_ole_stream_" + str(numStreams)
                    try:
                        u = unicode(str(stream), "utf-8")
                        filename = u.encode("utf-8")
                    except (QuitScanException, GlobalScanTimeoutError,
                            GlobalModuleTimeoutError):
                        raise
                    except:
                        pass  # keep e_ole_stream_<number> as the filename
                    moduleResult.append(
                        ModuleObject(
                            buffer=childBuffer,
                            externalVars=ExternalVars(filename=filename)))
        except (QuitScanException, GlobalScanTimeoutError,
                GlobalModuleTimeoutError):
            raise
        except:
            log_module("MSG", self.module_name, 0, scanObject, result,
                       "ERROR EXTRACTING STREAM: " + str(stream))
    ole.close()
    file.close()
    return moduleResult
def _unzip_file(self, moduleResult, file, scanObject, result, password,
                file_limit, byte_limit):
    """
    Attempts to unzip the file, looping through the namelist and adding
    each object to the ModuleResult. We add the filename from the archive
    to the external variables so it is available during recursive
    scanning.

    If the file is encrypted (determined by an exception), add the flag
    and return.

    Arguments:
    moduleResult -- an instance of the ModuleResult class created above
    file         -- a file object created using the buffer passed into
                    this module
    scanObject   -- an instance of the ScanObject class, created by the
                    dispatcher
    result       -- an instance of the ScanResult class, created by the
                    caller
    password     -- the password for the zipfile, if any
    file_limit   -- the maximum number of files to explode, adds flag if
                    exceeded
    byte_limit   -- the maximum size in bytes for an exploded buffer,
                    adds flag if exceeded

    Returns:
    Nothing, modifications are made directly to moduleResult.
    """
    try:
        zf = zipfile.ZipFile(file)
        if password:
            zf.setpassword(password)
        file_count = 0
        #dir_depth_max = 0
        #dir_count = 0
        namelist = zf.namelist()
        scanObject.addMetadata(self.module_name, "Total_Files",
                               len(namelist))
        exceeded_byte_limit = False
        for name in namelist:
            if byte_limit:
                info = zf.getinfo(name)
                if info.file_size > byte_limit:
                    logging.debug(
                        "EXPLODE_ZIP: skipping file due to byte limit")
                    exceeded_byte_limit = True
                    continue
            childBuffer = zf.read(name)
            if byte_limit and len(childBuffer) > byte_limit:
                logging.debug(
                    "EXPLODE_ZIP: skipping file due to byte limit")
                exceeded_byte_limit = True
                continue
            moduleResult.append(
                ModuleObject(buffer=childBuffer,
                             externalVars=ExternalVars(
                                 filename='e_zip_%s' % name)))
            file_count += 1
            if file_limit and file_count >= file_limit:
                scanObject.addFlag("zip:err:LIMIT_EXCEEDED")
                logging.debug("EXPLODE_ZIP: breaking due to file limit")
                break
        if exceeded_byte_limit:
            scanObject.addFlag("zip:err:BYTE_LIMIT_EXCEEDED")
    except RuntimeError as rte:
        if "encrypted" in rte.args[0]:
            scanObject.addFlag("ENCRYPTED_ZIP")
        else:
            raise
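The limit logic above can be sketched standalone with only `io` and `zipfile`. This is a hypothetical miniature (the `explode_zip` helper is ours): explode an in-memory archive, skipping members whose declared size exceeds `byte_limit` and stopping after `file_limit` members.

```python
import io
import zipfile

def explode_zip(data, file_limit=None, byte_limit=None):
    """Return ([(name, bytes), ...], [flags]) for a zip buffer."""
    exploded, flags = [], []
    zf = zipfile.ZipFile(io.BytesIO(data))
    count = 0
    for name in zf.namelist():
        info = zf.getinfo(name)
        # Check the declared size before reading, like the module does
        if byte_limit and info.file_size > byte_limit:
            flags.append("zip:err:BYTE_LIMIT_EXCEEDED")
            continue
        exploded.append(('e_zip_%s' % name, zf.read(name)))
        count += 1
        if file_limit and count >= file_limit:
            flags.append("zip:err:LIMIT_EXCEEDED")
            break
    return exploded, flags

# Build a small test archive in memory
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('small.txt', b'abc')
    zf.writestr('big.bin', b'x' * 100)
    zf.writestr('also.txt', b'def')

exploded, flags = explode_zip(buf.getvalue(), byte_limit=50)
```

Checking `info.file_size` before `zf.read(name)` matters: it avoids decompressing an oversized (or zip-bombed) member just to discard it.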
@staticmethod
def _helloworld(buffer, moduleResult, helloworld_param):
    '''
    An example of a worker function you may include in your module.

    Note the @staticmethod decorator on top of the function. These
    private methods are set to static to ensure immutability, since
    they may be called more than once in the lifetime of the class.
    '''
    flags = []
    # The logging module is a great way to create debugging output
    # during testing without generating anything in production. The
    # Laika framework does not use the logging module for its logging
    # (it uses syslog underneath several helpers found in
    # laikaboss.util), so none of these messages will clutter up Laika
    # logs.
    logging.debug('Hello world!')
    logging.debug('HELLOWORLD invoked with helloworld_param value %i',
                  helloworld_param)
    if helloworld_param < 10:
        flags.append('e_helloworld:nfo:helloworldsmall')
    else:
        logging.debug('HELLOWORLD(%i >= 10) setting flag',
                      helloworld_param)
        flags.append('e_helloworld:nfo:helloworld')
    if helloworld_param > 20:
        logging.debug('HELLOWORLD(%i > 20) adding new object',
                      helloworld_param)
        flags.append('e_helloworld:nfo:helloworldx2')
        if len(buffer) > helloworld_param:
            # Take the module buffer and trim off the first
            # helloworld_param bytes.
            buff = buffer[helloworld_param:]
            object_name = 'e_helloworld_%s_%s' % (
                len(buff), hashlib.md5(buff).hexdigest())
            logging.debug('HELLOWORLD - New object: %s', object_name)
            # We can create new objects that go back to the dispatcher
            # and subsequent Laika modules. Any modifications made to
            # moduleResult here go back to the main function.
            # laikaboss/objectmodel.py defines the variables you can set
            # for externalVars; the two most common are contentType and
            # filename.
            moduleResult.append(
                ModuleObject(
                    buffer=buff,
                    externalVars=ExternalVars(filename=object_name)))
        else:
            logging.debug(
                'HELLOWORLD - object is too small to carve (%i < %i)',
                len(buffer), helloworld_param)
    return set(flags)
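The carving step in the example module can be isolated into a tiny stdlib sketch. The `carve` helper below is hypothetical (our name, not part of the framework): trim the first `n` bytes of a buffer and derive an object name from the remainder's length and MD5, matching the `e_helloworld_<len>_<md5>` convention above.

```python
import hashlib

def carve(buffer, n):
    """Return (object_name, carved_bytes), or None if too small."""
    if len(buffer) <= n:
        return None
    buff = buffer[n:]  # drop the first n bytes
    name = 'e_helloworld_%s_%s' % (len(buff),
                                   hashlib.md5(buff).hexdigest())
    return name, buff

result = carve(b'A' * 10 + b'payload', 10)
```

Embedding length and hash in the object name gives downstream modules a stable, collision-resistant identifier for the carved buffer.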
def _run(self, scanObject, result, depth, args):
    moduleResult = []
    decoded = base64.b64decode(scanObject.buffer)
    moduleResult.append(
        ModuleObject(buffer=decoded,
                     externalVars=ExternalVars(
                         filename="d_base64_%s" % len(decoded))))
    return moduleResult
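The decode step above maps directly onto the stdlib. A minimal sketch (sample data is ours): decode a base64 buffer and name the resulting object after its decoded length, matching the `d_base64_<len>` convention.

```python
import base64

encoded = base64.b64encode(b'hello laika')

# Decode and derive the exploded object's name from the decoded length
decoded = base64.b64decode(encoded)
filename = "d_base64_%s" % len(decoded)
```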
def _run(self, scanObject, result, depth, args):
    moduleResult = []
    for index, obj_len, obj_data in rtfobj.rtf_iter_objects(
            scanObject.filename):
        # The index (location) of the RTF object becomes the file name
        name = 'index_' + str(index)
        moduleResult.append(
            ModuleObject(buffer=obj_data,
                         externalVars=ExternalVars(
                             filename='e_rtf_%s' % name)))
    return moduleResult
def _run(self, scanObject, result, depth, args):
    '''
    The core of your Laika module; this is how your code will be invoked.

    Requires:
        Package dependencies only
    Assumes:
        scanObject.buffer is a UPX-compressed executable
    Ensures:
        1. No propagating errors
        2. The decompressed buffer is returned as a new buffer to be
           scanned

    Error Handling:
        1. If upx decompression fails, the output file will not be
           created; the attempt to open the decompressed file will throw
           a file-not-exists exception, which is silently passed.

    Module Execution:
        1. Dump scanObject.buffer into a named temp file
        2. Call the upx decompressor, outputting to
           <input_filename>_output
        3. Open the decompressed file and read it into a buffer
        4. Close and delete the decompressed file
        5. If the decompressed buffer is longer than the compressed
           buffer (decompression worked):
           True:  add the buffer to the result object
           False: do nothing (future: perhaps add failed-to-decompress
                  metadata?)
        6. Return
    '''
    moduleResult = []
    try:
        with tempfile.NamedTemporaryFile(
                dir=self.TEMP_DIR) as temp_file_input:
            temp_file_input_name = temp_file_input.name
            temp_file_input.write(scanObject.buffer)
            temp_file_input.flush()
            temp_file_output_name = temp_file_input_name + "_output"
            strCMD = "upx -d " + temp_file_input_name + " -o " + \
                temp_file_output_name
            outputString = pexpect.run(strCMD)
            # If strCMD failed, this will throw a file-not-exists
            # exception
            f = open(temp_file_output_name)
            newbuf = f.read()
            f.close()
            os.remove(temp_file_output_name)
            if len(newbuf) > len(scanObject.buffer):
                moduleResult.append(
                    ModuleObject(
                        buffer=newbuf,
                        externalVars=ExternalVars(filename="e_upx")))
    except (QuitScanException, GlobalScanTimeoutError,
            GlobalModuleTimeoutError):
        raise
    except:
        pass
    return moduleResult
def _run(self, scanObject, result, depth, args):
    moduleResult = []
    file = StringIO.StringIO(scanObject.buffer)
    gzip_file = gzip.GzipFile(fileobj=file)
    new_buffer = gzip_file.read()
    moduleResult.append(
        ModuleObject(buffer=new_buffer,
                     externalVars=ExternalVars(
                         filename="gzip_%s" % len(new_buffer))))
    return moduleResult
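The same decompression works standalone with the stdlib; `io.BytesIO` stands in here for the Python 2 `StringIO` used above. A minimal sketch (sample data is ours): wrap a gzip buffer in a file-like object, decompress it, and name the result after its decompressed length.

```python
import gzip
import io

original = b'laika' * 10
compressed = gzip.compress(original)

# GzipFile reads transparently from any file-like object
new_buffer = gzip.GzipFile(fileobj=io.BytesIO(compressed)).read()
filename = "gzip_%s" % len(new_buffer)
```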
def _run(self, scanObject, result, depth, args):
    moduleResult = []
    flags = []
    buffer = scanObject.buffer
    cert = None
    try:
        input_bio = BIO.MemoryBuffer(buffer)
        # Check for PEM or DER
        if buffer[:1] == "0":  # DER
            p7 = SMIME.PKCS7(m2.pkcs7_read_bio_der(input_bio._ptr()), 1)
        else:  # PEM
            p7 = SMIME.load_pkcs7_bio(input_bio)
        # Some pkcs7s have more than one certificate in them; openssl
        # can extract them handily:
        #   openssl pkcs7 -inform der -print_certs -in MYKEY.DSA
        certs = p7.get0_signers(X509.X509_Stack())
        i = 0
        for cert in certs:
            cert_filename = "%x.der" % cert.get_serial_number()
            moduleResult.append(
                ModuleObject(
                    buffer=cert.as_der(),
                    externalVars=ExternalVars(filename=cert_filename)))
            i = i + 1
    except (QuitScanException, GlobalScanTimeoutError,
            GlobalModuleTimeoutError):
        raise
    except:
        exc_type, exc_value, exc_traceback = sys.exc_info()
        logging.exception("Error parsing cert in " +
                          str(get_scanObjectUID(getRootObject(result))))
        ugly_error_string = str(exc_value)
        nicer_error_string = ugly_error_string.split(":")[4].split()[0]
        scanObject.addFlag("pkcs7:err:" + nicer_error_string)
    return moduleResult
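The `buffer[:1] == "0"` test above works because DER encoding of a PKCS#7 structure begins with an ASN.1 SEQUENCE tag, byte 0x30, which happens to be the ASCII digit `'0'`; PEM input instead begins with a text banner. A hedged standalone sketch (sample blobs are ours, truncated for illustration):

```python
# DER starts with the ASN.1 SEQUENCE tag 0x30 (ASCII '0');
# PEM starts with a "-----BEGIN ..." banner.
der_like = b'\x30\x82\x01\x00'          # SEQUENCE, long-form length
pem_like = b'-----BEGIN PKCS7-----\n'

def looks_der(buf):
    return buf[:1] == b'0'  # b'0' == b'\x30'

der_detected = looks_der(der_like)
pem_detected = not looks_der(pem_like)
```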
def _run(self, scanObject, result, depth, args):
    moduleResult = []
    pdfBuffer = cStringIO.StringIO(scanObject.buffer)
    try:
        pdfFile = PdfFileReader(pdfBuffer)
        docInfo = pdfFile.getDocumentInfo()
        for metaItem in docInfo:
            # metaItem[1:] strips the leading '/' from the key name
            scanObject.addMetadata(self.module_name, metaItem[1:],
                                   str(docInfo[metaItem]))
        pdf = PDFParser(pdfBuffer)
        pdfDoc = PDFDocument(pdf)
        for xref in pdfDoc.xrefs:
            for objid in xref.get_objids():
                try:
                    obj = pdfDoc.getobj(objid)
                    if isinstance(obj, dict):
                        for (key, val) in obj.iteritems():
                            if key in ['AA', 'OpenAction']:
                                scanObject.addFlag('pdf:nfo:auto_action')
                            elif key in ['JS', 'Javascript']:
                                scanObject.addFlag('pdf:nfo:js_embedded')
                    if isinstance(obj, PDFStream):
                        if 'Type' in obj.attrs and \
                                obj.attrs['Type'] == LIT('EmbeddedFile'):
                            moduleResult.append(
                                ModuleObject(
                                    buffer=obj.get_data(),
                                    externalVars=ExternalVars(
                                        filename='e_pdf_stream_%s' %
                                        objid)))
                except PDFObjectNotFound:
                    scanObject.addFlag('pdf:err:missing_object_%s' %
                                       objid)
                except ScanError:
                    raise
    except PSEOF:
        scanObject.addFlag('pdf:err:unexpected_eof')
    except ScanError:
        raise
    return moduleResult
def _run(self, scanObject, result, depth, args):
    moduleResult = []
    try:
        fstr = StringIO(scanObject.buffer)
        # Bytes 4-7 of the SWF header hold the uncompressed file length
        fstr.seek(4)
        swf_size = struct.unpack("<i", fstr.read(4))[0]
        logging.debug("swf size is %s" % swf_size)
        fstr.seek(0)
        fws = self._decompressSWF(fstr, swf_size)
        if fws is not None and fws != "ERROR":
            moduleResult.append(
                ModuleObject(buffer=fws,
                             externalVars=ExternalVars(
                                 filename='e_swf_%s' % swf_size)))
        return moduleResult
    except:
        raise
    finally:
        logging.debug("extract_swf - closing stringio handle in run")
        fstr.close()
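The header read above can be sketched with `struct` alone. An SWF file begins with a 3-byte signature (`FWS` uncompressed, `CWS` zlib, `ZWS` LZMA), a version byte, and then a little-endian 32-bit integer giving the total uncompressed length, which is the field unpacked at offset 4. The sample header below is fabricated for illustration:

```python
import struct

# Build a fake SWF header: signature, version 9, declared size 4096
header = b'CWS' + bytes([9]) + struct.pack('<I', 4096) + b'...'

signature = header[:3]
version = header[3]
# Bytes 4-7: little-endian uncompressed file length
swf_size = struct.unpack('<I', header[4:8])[0]
```

The module uses `"<i"` (signed); `"<I"` is shown here since the length field is non-negative, and the two agree for any realistic file size.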
def _zmqSendBuffer(self, milterContext, numRetries, REQUEST_TIMEOUT,
                   SERVER_ENDPOINT):
    gotResponseFromScanner = -1
    self.client = Client(SERVER_ENDPOINT)
    log = milterContext.uuid + " Sending " + str(milterContext.qid) + \
        " to " + SERVER_ENDPOINT
    self.logger.writeLog(syslog.LOG_DEBUG, "%s" % str(log))
    myhostname = socket.gethostname()
    externalObject = ExternalObject(
        buffer=milterContext.fileBuffer,
        externalVars=ExternalVars(
            filename=milterContext.archiveFileName,
            source=milterContext.milterConfig.milterName + "-" +
            myhostname.split(".")[0],
            ephID=milterContext.qid,
            uniqID=milterContext.messageID),
        level=level_metadata)
    result = self.client.send(externalObject,
                              retry=numRetries,
                              timeout=REQUEST_TIMEOUT)
    if result:
        self.match = flagRollup(result)
        if not self.match:
            self.match = []
        self.attachements = ','.join(getAttachmentList(result))
        strScanResult = finalDispositionFromResult(result)
        strScanResults = " ".join(dispositionFromResult(result))
        if strScanResult:
            self.strScanResult = strScanResult
        try:
            self.dispositions = strScanResults
        except:
            self.logger.writeLog(
                syslog.LOG_ERR,
                milterContext.uuid +
                " ERROR getting dispositions via client lib")
        gotResponseFromScanner = 1
    else:
        self.logger.writeLog(
            syslog.LOG_ERR,
            milterContext.uuid + " " + str(milterContext.qid) +
            "| no result object from scanner, returning SCAN ERROR")
    return gotResponseFromScanner
def _run(self, scanObject, result, depth, args):
    moduleResult = []
    success = "Decompiled dotnet results available in dir: \n"
    try:
        outfile = "<insert_output_dir>/%s/decompiled_%s/" % (
            scanObject.rootUID, scanObject.objectHash)
        filename = "e_decompiled_dotnet_%s" % md5.new(
            scanObject.filename).hexdigest()
        outname = "decompiled_%s/\n" % scanObject.objectHash
        subprocess.check_output([
            'mono', '<path_to_exe>/dnSpy.Console.exe', '--no-resx',
            '--no-sln', scanObject.filename, '-o', outfile
        ])
        moduleResult.append(
            ModuleObject(buffer=success + outname,
                         externalVars=ExternalVars(filename=filename)))
    except ScanError:
        raise
    return moduleResult
def main():
    parser = OptionParser(
        usage="usage: %prog [options] (/path/to/file | stdin)")
    parser.add_option("-d", "--debug",
                      action="store_true", dest="debug",
                      help="enable debug messages to the console.")
    parser.add_option("-r", "--remove-limit",
                      action="store_true", dest="nolimit",
                      help="disable 20mb size limit (be careful!)")
    parser.add_option("-t", "--timeout",
                      action="store", type="int", dest="timeout",
                      help="adjust request timeout period (in seconds)")
    parser.add_option("-c", "--config-path",
                      action="store", type="string", dest="config_path",
                      help="specify a path to si-cloudscan.conf.")
    parser.add_option("-a", "--address",
                      action="store", type="string", dest="broker_host",
                      help="specify an IP and port to connect to the "
                           "broker")
    parser.add_option("-f", "--file-list",
                      action="store", type="string", dest="file_list",
                      help="Specify a list of files to scan")
    parser.add_option("-s", "--ssh-host",
                      action="store", type="string", dest="ssh_host",
                      help="specify a host for the SSH tunnel")
    parser.add_option("-p", "--num-procs",
                      action="store", type="int", default=6,
                      dest="num_procs",
                      help="Specify the number of processors to use for "
                           "recursion")
    parser.add_option("-u", "--source",
                      action="store", type="string", dest="source",
                      help="specify a custom source")
    parser.add_option("--ssh",
                      action="store_true", default=False, dest="use_ssh",
                      help="Use SSH tunneling")
    parser.add_option("-l", "--level",
                      action="store", type="string", dest="return_level",
                      help="Return Level: minimal, metadata, full "
                           "[default: metadata]")
    parser.add_option("-o", "--out-path",
                      action="store", type="string", dest="save_path",
                      help="If Return Level Full has been specified, "
                           "provide a path to save the results to "
                           "[default: current directory]")
    parser.add_option("-b", "--buffer",
                      action="store_true", dest="stdin_buffer",
                      help="Specify to allow a buffer to be collected "
                           "by stdin.")
    parser.add_option("-e", "--ephID",
                      action="store", type="string", dest="ephID",
                      default="",
                      help="Specify an ephID to send to Laika.")
    parser.add_option("-m", "--ext-metadata",
                      action="store", dest="ext_metadata",
                      help="Specify external metadata to be passed into "
                           "the scanner.")
    parser.add_option("-z", "--log",
                      action="store_true", dest="log_db",
                      help="Specify to turn on logging results.")
    parser.add_option("-R", "--recursive",
                      action="store_true", default=False,
                      dest="recursive",
                      help="Enable recursive directory scanning. If "
                           "enabled, all files in the specified "
                           "directory will be scanned. Results will be "
                           "output to si-cloudscan.log in the current "
                           "directory.")
    (options, args) = parser.parse_args()

    # Define default configuration location
    CONFIG_PATH = "/etc/si-cloudscan/si-cloudscan.conf"

    if options.config_path:
        CONFIG_PATH = options.config_path

    Config = ConfigParser.ConfigParser()
    Config.read(CONFIG_PATH)

    # Parse the config file and merge each section into a single
    # dictionary
    global configs
    for section in Config.sections():
        configs.update(dict(Config.items(section)))

    # Set the working path; this will be used for file output if another
    # path is not specified
    WORKING_PATH = os.getcwd()

    if options.use_ssh:
        USE_SSH = True
    else:
        USE_SSH = strtobool(getConfig('use_ssh'))

    if options.ssh_host:
        SSH_HOST = options.ssh_host
    else:
        SSH_HOST = getConfig('ssh_host')

    if options.broker_host:
        BROKER_HOST = options.broker_host
    else:
        BROKER_HOST = getConfig('broker_host')

    if options.debug:
        logging.basicConfig(level=logging.DEBUG)
    logging.debug("Host: %s" % BROKER_HOST)

    if options.return_level:
        RETURN_LEVEL = options.return_level
    else:
        RETURN_LEVEL = getConfig('return_level')

    if options.source:
        SOURCE = options.source
    else:
        SOURCE = "si-cloudscan"

    if not options.log_db:
        SOURCE += "-nolog"

    if options.save_path:
        SAVE_PATH = options.save_path
    else:
        SAVE_PATH = WORKING_PATH

    if options.num_procs:
        num_procs = int(options.num_procs)
    else:
        num_procs = int(getConfig('num_procs'))

    if options.timeout:
        logging.debug("default timeout changed to %i" % options.timeout)
        REQUEST_TIMEOUT = options.timeout * 1000
    else:
        REQUEST_TIMEOUT = int(getConfig('request_timeout'))

    if options.ext_metadata:
        try:
            ext_metadata = json.loads(options.ext_metadata)
            assert isinstance(ext_metadata, dict)
        except:
            print "External metadata must be a dictionary!"
            sys.exit(0)
    else:
        ext_metadata = dict()

    REQUEST_RETRIES = int(getConfig('request_retries'))

    # Attempt to get the hostname
    try:
        hostname = gethostname().split('.')[0]
    except:
        hostname = "none"

    # Attempt to set the return level; error out if it doesn't exist
    try:
        return_level = globals()["level_%s" % RETURN_LEVEL]
    except KeyError:
        print "Please specify a valid return level: minimal, metadata " \
              "or full"
        sys.exit(1)

    if not options.recursive:
        try:
            file_buffer = ''
            # Try to read the file
            if len(args) > 0:
                file_buffer = open(args[0], 'rb').read()
                file_len = len(file_buffer)
                logging.debug("opened file %s with len %i" %
                              (args[0], file_len))
            else:
                while sys.stdin in select.select([sys.stdin], [], [],
                                                 0)[0]:
                    line = sys.stdin.readline()
                    if not line:
                        break
                    else:
                        file_buffer += line
                if not file_buffer:
                    parser.print_usage()
                    sys.exit(1)
                file_len = len(file_buffer)
            if file_len > 20971520 and not options.nolimit:
                print "You're trying to scan a file larger than 20mb.. " \
                      "Are you sure?"
                print "Use the --remove-limit flag if you really want " \
                      "to do this."
                sys.exit(1)
        except IOError:
            print "\nERROR: The file does not exist: %s\n" % (args[0],)
            sys.exit(1)
    else:
        try:
            fileList = []
            if options.file_list:
                fileList = open(options.file_list).read().splitlines()
            else:
                if len(args) > 0:
                    rootdir = args[0]
                    for root, subFolders, files in os.walk(rootdir):
                        for fname in files:
                            fileList.append(os.path.join(root, fname))
                else:
                    while sys.stdin in select.select([sys.stdin], [], [],
                                                     0)[0]:
                        line = sys.stdin.readline()
                        if not line:
                            break
                        else:
                            fileList.append(line)
            if not fileList:
                parser.print_usage()
                sys.exit(1)
            if len(fileList) > 1000 and not options.nolimit:
                print "You're trying to scan over 1000 files... " \
                      "Are you sure?"
                print "Use the --remove-limit flag if you really want " \
                      "to do this."
                sys.exit(1)
        except IOError:
            print "\nERROR: Directory does not exist: %s\n" % (args[0],)
            sys.exit(1)

    if not options.recursive:
        # Construct the object to be sent for scanning
        if args:
            filename = args[0]
        else:
            filename = "stdin"
        ext_metadata['server'] = hostname
        ext_metadata['user'] = getpass.getuser()
        externalObject = ExternalObject(
            buffer=file_buffer,
            externalVars=ExternalVars(
                filename=filename,
                ephID=options.ephID,
                extMetaData=ext_metadata,
                source="%s-%s-%s" % (SOURCE, hostname,
                                     getpass.getuser())),
            level=return_level)

    try:
        if not options.recursive:
            # Set up the ZMQ context
            if USE_SSH:
                try:
                    logging.debug(
                        "attempting to connect to broker at %s and SSH "
                        "host %s" % (BROKER_HOST, SSH_HOST))
                    client = Client(BROKER_HOST, useSSH=True,
                                    sshHost=SSH_HOST, useGevent=True)
                except RuntimeError:
                    logging.exception(
                        "could not set up SSH tunnel to %s" % SSH_HOST)
                    sys.exit(1)
            else:
                logging.debug("SSH has been disabled.")
                client = Client(BROKER_HOST, useGevent=True)

            starttime = time.time()
            result = client.send(externalObject, retry=REQUEST_RETRIES,
                                 timeout=REQUEST_TIMEOUT)
            logging.debug("got reply in %s seconds" %
                          str(time.time() - starttime))
            rootObject = getRootObject(result)
            try:
                jsonResult = getJSON(result)
                print jsonResult
            except:
                logging.exception("error occurred collecting results")
                return
            if return_level == level_full:
                SAVE_PATH = "%s/%s" % (SAVE_PATH,
                                       get_scanObjectUID(rootObject))
                if not os.path.exists(SAVE_PATH):
                    try:
                        os.makedirs(SAVE_PATH)
                        print "\nWriting results to %s...\n" % SAVE_PATH
                    except (OSError, IOError):
                        print "\nERROR: unable to write to %s...\n" % \
                            SAVE_PATH
                        return
                else:
                    print "\nOutput folder already exists! Skipping " \
                          "results output...\n"
                    return
                for uid, scanObject in result.files.iteritems():
                    f = open("%s/%s" % (SAVE_PATH, uid), "wb")
                    f.write(scanObject.buffer)
                    f.close()
                    try:
                        if scanObject.filename and scanObject.parent:
                            linkPath = "%s/%s" % (
                                SAVE_PATH,
                                scanObject.filename.replace("/", "_"))
                            if not os.path.lexists(linkPath):
                                os.symlink("%s" % uid, linkPath)
                        elif scanObject.filename:
                            filenameParts = scanObject.filename.split(
                                "/")
                            os.symlink(
                                "%s" % uid,
                                "%s/%s" % (SAVE_PATH,
                                           filenameParts[-1]))
                    except:
                        print "Unable to create symlink for %s" % uid
                f = open("%s/%s" % (SAVE_PATH, "results.log"), "wb")
                f.write(jsonResult)
                f.close()
                sys.exit(1)
        else:
            try:
                # Truncate the log file
                fh = open('si-cloudscan.log', 'w')
                fh.close()
            except:
                pass
            for fname in fileList:
                job_queue.put(fname)
            for i in range(num_procs):
                job_queue.put("STOP")
            print "File list length: %s" % len(fileList)
            for i in range(num_procs):
                Process(target=worker,
                        args=(options.nolimit, REQUEST_RETRIES,
                              REQUEST_TIMEOUT, SAVE_PATH, SOURCE,
                              return_level, hostname, USE_SSH,
                              BROKER_HOST, SSH_HOST, ext_metadata,
                              options.ephID)).start()
            results_processed = 0
            while results_processed < len(fileList):
                logging.debug("Files left: %s" %
                              (len(fileList) - results_processed))
                resultText = result_queue.get()
                try:
                    # Process results
                    fh = open('si-cloudscan.log', 'ab')
                    fh.write('%s\n' % resultText)
                    fh.close()
                    results_processed += 1
                except Exception:
                    raise
            print 'Wrote results to si-cloudscan.log'
    except KeyboardInterrupt:
        print "Interrupted by user, exiting..."
        sys.exit(1)
def _run(self, scanObject, result, depth, args):
    moduleResult = []
    imports = {}
    sections = {}
    dllChars = []
    imgChars = []
    exports = []
    cpu = []
    res_type = ""
    try:
        pe = pefile.PE(data=scanObject.buffer)

        # Section characteristics reference:
        # http://msdn.microsoft.com/en-us/library/windows/desktop/ms680341%28v=vs.85%29.aspx
        SECTION_CHARACTERISTICS = (
            (0x20, 'CNT_CODE'),
            (0x40, 'CNT_INITIALIZED_DATA'),
            (0x80, 'CNT_UNINITIALIZED_DATA'),
            (0x200, 'LNK_INFO'),
            (0x2000000, 'MEM_DISCARDABLE'),
            (0x4000000, 'MEM_NOT_CACHED'),
            (0x8000000, 'MEM_NOT_PAGED'),
            (0x10000000, 'MEM_SHARED'),
            (0x20000000, 'MEM_EXECUTE'),
            (0x40000000, 'MEM_READ'),
            (0x80000000, 'MEM_WRITE'),
        )
        for section in pe.sections:
            secName = section.Name.strip('\0')
            secData = {
                'Virtual Address': '0x%08X' % section.VirtualAddress,
                'Virtual Size': section.Misc_VirtualSize,
                'Raw Size': section.SizeOfRawData,
                'MD5': section.get_hash_md5()
            }
            if secData['MD5'] != scanObject.objectHash:
                moduleResult.append(
                    ModuleObject(
                        buffer=section.get_data(),
                        externalVars=ExternalVars(filename=secName)))
            secChar = section.Characteristics
            secData['Section Characteristics'] = [
                name for mask, name in SECTION_CHARACTERISTICS
                if secChar & mask
            ]
            sections[secName] = secData
        sections['Total'] = pe.FILE_HEADER.NumberOfSections
        scanObject.addMetadata(self.module_name, 'Sections', sections)

        try:
            for exp in pe.DIRECTORY_ENTRY_EXPORT.symbols:
                exports.append(exp.name)
            scanObject.addMetadata(self.module_name, 'Exports', exports)
        except (QuitScanException, GlobalScanTimeoutError,
                GlobalModuleTimeoutError):
            raise
        except:
            logging.debug('No export entries')

        try:
            for entry in pe.DIRECTORY_ENTRY_IMPORT:
                api = []
                for imp in entry.imports:
                    api.append(imp.name)
                imports[entry.dll] = filter(None, api)
        except (QuitScanException, GlobalScanTimeoutError,
                GlobalModuleTimeoutError):
            raise
        except:
            logging.debug('No import entries')
        scanObject.addMetadata(self.module_name, 'Imports', imports)

        # Image characteristics reference:
        # http://msdn.microsoft.com/en-us/library/windows/desktop/ms680313%28v=vs.85%29.aspx
        IMAGE_CHARACTERISTICS = (
            (0x1, 'RELOCS_STRIPPED'),
            (0x2, 'EXECUTABLE_IMAGE'),
            (0x4, 'LINE_NUMS_STRIPPED'),
            (0x8, 'LOCAL_SYMS_STRIPPED'),
            (0x10, 'AGGRESIVE_WS_TRIM'),
            (0x20, 'LARGE_ADDRESS_AWARE'),
            (0x80, 'BYTES_REVERSED_LO'),
            (0x100, '32BIT_MACHINE'),
            (0x200, 'DEBUG_STRIPPED'),
            (0x400, 'REMOVABLE_RUN_FROM_SWAP'),
            (0x800, 'NET_RUN_FROM_SWAP'),
            (0x1000, 'SYSTEM_FILE'),
            (0x2000, 'DLL_FILE'),
            (0x4000, 'UP_SYSTEM_ONLY'),
            (0x8000, 'BYTES_REVERSED_HI'),
        )
        imgChar = pe.FILE_HEADER.Characteristics
        imgChars = [name for mask, name in IMAGE_CHARACTERISTICS
                    if imgChar & mask]
        scanObject.addMetadata(self.module_name, 'Image Characteristics',
                               imgChars)

        scanObject.addMetadata(
            self.module_name, 'Date',
            datetime.fromtimestamp(
                pe.FILE_HEADER.TimeDateStamp).isoformat())
        scanObject.addMetadata(self.module_name, 'Timestamp',
                               pe.FILE_HEADER.TimeDateStamp)

        # Machine types reference:
        # http://en.wikibooks.org/wiki/X86_Disassembly/Windows_Executable_Files#COFF_Header
        MACHINE_TYPES = {
            0x14c: 'Intel 386',
            0x14d: 'Intel i860',
            0x162: 'MIPS R3000',
            0x166: 'MIPS little endian (R4000)',
            0x168: 'MIPS R10000',
            0x169: 'MIPS little endian WCI v2',
            0x183: 'old Alpha AXP',
            0x184: 'Alpha AXP',
            0x1a2: 'Hitachi SH3',
            0x1a3: 'Hitachi SH3 DSP',
            0x1a6: 'Hitachi SH4',
            0x1a8: 'Hitachi SH5',
            0x1c0: 'ARM little endian',
            0x1c2: 'Thumb',
            0x1d3: 'Matsushita AM33',
            0x1f0: 'PowerPC little endian',
            0x1f1: 'PowerPC with floating point support',
            0x200: 'Intel IA64',
            0x266: 'MIPS16',
            0x268: 'Motorola 68000 series',
            0x284: 'Alpha AXP 64-bit',
            0x366: 'MIPS with FPU',
            0x466: 'MIPS16 with FPU',
            0xebc: 'EFI Byte Code',
            0x8664: 'AMD AMD64',
            0x9041: 'Mitsubishi M32R little endian',
            0xc0ee: 'clr pure MSIL',
        }
        machine = pe.FILE_HEADER.Machine
        cpu.append(machine)
        if machine in MACHINE_TYPES:
            cpu.append(MACHINE_TYPES[machine])

        # Optional header magic reference:
        # http://msdn.microsoft.com/en-us/library/windows/desktop/ms680339%28v=vs.85%29.aspx
        magic = pe.OPTIONAL_HEADER.Magic
        if magic == 0x10b:
            cpu.append('32_BIT')
        if magic == 0x20b:
            cpu.append('64_BIT')
        if magic == 0x107:
            cpu.append('ROM_IMAGE')
        cpu.append("0x%04X" % magic)
        scanObject.addMetadata(self.module_name, 'CPU', cpu)

        DLL_CHARACTERISTICS = (
            (0x40, 'DYNAMIC_BASE'),
            (0x80, 'FORCE_INTEGRITY'),
            (0x100, 'NX_COMPAT'),
            (0x200, 'NO_ISOLATION'),
            (0x400, 'NO_SEH'),
            (0x800, 'NO_BIND'),
            (0x2000, 'WDM_DRIVER'),
            (0x8000, 'TERMINAL_SERVER_AWARE'),
        )
        dllChar = pe.OPTIONAL_HEADER.DllCharacteristics
        dllChars = [name for mask, name in DLL_CHARACTERISTICS
                    if dllChar & mask]
        scanObject.addMetadata(self.module_name, 'DLL Characteristics',
                               dllChars)

        SUBSYSTEMS = {
            0: 'UNKNOWN',
            1: 'NATIVE',
            2: 'WINDOWS_GUI',
            3: 'WINDOWS_CUI',
            5: 'OS2_CUI',
            7: 'POSIX_CUI',
            9: 'WINDOWS_CE_GUI',
            10: 'EFI_APPLICATION',
            11: 'EFI_BOOT_SERVICE_DRIVER',
            12: 'EFI_RUNTIME_DRIVER',
            13: 'EFI_ROM',
            14: 'XBOX',
            16: 'BOOT_APPLICATION',
        }
        subsystem = pe.OPTIONAL_HEADER.Subsystem
        if subsystem in SUBSYSTEMS:
            scanObject.addMetadata(self.module_name, 'Subsystem',
                                   SUBSYSTEMS[subsystem])

        # Resource types reference:
        # http://msdn.microsoft.com/en-us/library/windows/desktop/ms648009%28v=vs.85%29.aspx
        RESOURCE_TYPES = {
            1: 'RT_CURSOR',
            2: 'RT_BITMAP',
            3: 'RT_ICON',
            4: 'RT_MENU',
            5: 'RT_DIALOG',
            6: 'RT_STRING',
            7: 'RT_FONTDIR',
            8: 'RT_FONT',
            9: 'RT_ACCELERATOR',
            10: 'RT_RCDATA',
            11: 'RT_MESSAGETABLE',
            12: 'RT_GROUP_CURSOR',
            14: 'RT_GROUP_ICON',
            16: 'RT_VERSION',
            17: 'RT_DLGINCLUDE',
            19: 'RT_PLUGPLAY',
            20: 'RT_VXD',
            21: 'RT_ANICURSOR',
            22: 'RT_ANIICON',
            23: 'RT_HTML',
            24: 'RT_MANIFEST',
        }
        try:
            for resource in pe.DIRECTORY_ENTRY_RESOURCE.entries:
                res_type = RESOURCE_TYPES.get(resource.id, res_type)
                for entry in resource.directory.entries:
                    scanObject.addMetadata(
                        self.module_name, 'Resources',
                        res_type + "_%s" % entry.id)
        except (QuitScanException, GlobalScanTimeoutError,
                GlobalModuleTimeoutError):
            raise
        except:
            logging.debug('No resources')

        scanObject.addMetadata(self.module_name, 'Stack Reserve Size',
                               pe.OPTIONAL_HEADER.SizeOfStackReserve)
        scanObject.addMetadata(self.module_name, 'Stack Commit Size',
                               pe.OPTIONAL_HEADER.SizeOfStackCommit)
        scanObject.addMetadata(self.module_name, 'Heap Reserve Size',
                               pe.OPTIONAL_HEADER.SizeOfHeapReserve)
        scanObject.addMetadata(self.module_name, 'Heap Commit Size',
                               pe.OPTIONAL_HEADER.SizeOfHeapCommit)
        scanObject.addMetadata(self.module_name, 'EntryPoint',
                               hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint))
        scanObject.addMetadata(self.module_name, 'ImageBase',
                               hex(pe.OPTIONAL_HEADER.ImageBase))

        # Parse RSDS & Rich headers
        scanObject.addMetadata(self.module_name, 'RSDS',
                               self.parseRSDS(scanObject))
        scanObject.addMetadata(self.module_name, 'Rich',
                               self.parseRich(pe))
    except pefile.PEFormatError:
        logging.debug("Invalid PE format")
    return moduleResult
def perform_scan(self, poll_timeout):
    '''
    Wait for work from broker then perform the scan. If timeout occurs, no
    scan is performed and no result is returned.

    Arguments:
    poll_timeout -- The amount of time to wait for work.

    Returns:
    The result of the scan or None if no scan was performed.
    '''
    from laikaboss.dispatch import Dispatch
    from laikaboss.objectmodel import ScanResult, ExternalObject, ExternalVars
    from laikaboss.util import log_result, log_debug

    # If task is found, perform scan
    try:
        logging.debug("Worker (%s): checking for work", self.identity)
        tasks = dict(self.broker_poller.poll(poll_timeout))
        if tasks.get(self.broker) == zmq.POLLIN:
            logging.debug("Worker (%s): performing scan", self.identity)
            # task should be in the following format
            # ['', client_id, '', request_type, '', request]
            # where:
            #   client_id    -- ZMQ identifier of the client socket
            #   request_type -- The type of request (json/pickle/zlib)
            #   request      -- Object to be scanned
            task = self.broker.recv_multipart()
            client_id = task[1]
            if len(task) == 6:
                request_type = task[3]
                request = task[5]
                if request_type in [REQ_TYPE_PICKLE, REQ_TYPE_PICKLE_ZLIB]:
                    #logging.debug("Worker: received work %s", str(task))
                    if request_type == REQ_TYPE_PICKLE_ZLIB:
                        externalObject = pickle.loads(zlib.decompress(request))
                    else:
                        externalObject = pickle.loads(request)
                elif request_type in [REQ_TYPE_JSON, REQ_TYPE_JSON_ZLIB]:
                    if request_type == REQ_TYPE_JSON_ZLIB:
                        jsonRequest = json.loads(zlib.decompress(request))
                    else:
                        jsonRequest = json.loads(request)
                    # Set default values for our request just in case some
                    # were omitted
                    if not 'buffer' in jsonRequest:
                        jsonRequest['buffer'] = ''
                    else:
                        try:
                            jsonRequest['buffer'] = base64.b64decode(
                                jsonRequest['buffer'])
                        except:
                            # This should never happen unless invalid input is given
                            jsonRequest['buffer'] = ''
                    if not 'filename' in jsonRequest:
                        jsonRequest['filename'] = ''
                    if not 'ephID' in jsonRequest:
                        jsonRequest['ephID'] = ''
                    if not 'uniqID' in jsonRequest:
                        jsonRequest['uniqID'] = ''
                    if not 'contentType' in jsonRequest:
                        jsonRequest['contentType'] = []
                    if not 'timestamp' in jsonRequest:
                        jsonRequest['timestamp'] = ''
                    if not 'source' in jsonRequest:
                        jsonRequest['source'] = ''
                    if not 'origRootUID' in jsonRequest:
                        jsonRequest['origRootUID'] = ''
                    if not 'extMetaData' in jsonRequest:
                        jsonRequest['extMetaData'] = {}
                    if not 'level' in jsonRequest:
                        jsonRequest['level'] = 2
                    externalVars = ExternalVars(
                        filename=jsonRequest['filename'],
                        ephID=jsonRequest['ephID'],
                        uniqID=jsonRequest['uniqID'],
                        contentType=jsonRequest['contentType'],
                        timestamp=jsonRequest['timestamp'],
                        source=jsonRequest['source'],
                        origRootUID=jsonRequest['origRootUID'],
                        extMetaData=jsonRequest['extMetaData'])
                    externalObject = ExternalObject(
                        buffer=jsonRequest['buffer'],
                        level=jsonRequest['level'],
                        externalVars=externalVars)
                else:
                    return [client_id, '', 'INVALID REQUEST']
                result = ScanResult(
                    source=externalObject.externalVars.source,
                    level=externalObject.level)
                result.startTime = time.time()
                try:
                    Dispatch(externalObject.buffer, result, 0,
                             externalVars=externalObject.externalVars)
                except QuitScanException:
                    raise
                except:
                    exc_type, exc_value, exc_traceback = sys.exc_info()
                    log_debug(
                        "exception on file: %s, detailed exception: %s" % (
                            externalObject.externalVars.filename,
                            repr(traceback.format_exception(
                                exc_type, exc_value, exc_traceback))))
                if self.logresult:
                    log_result(result)
                if request_type == REQ_TYPE_PICKLE_ZLIB:
                    result = zlib.compress(
                        pickle.dumps(result, pickle.HIGHEST_PROTOCOL))
                elif request_type == REQ_TYPE_PICKLE:
                    result = pickle.dumps(result, pickle.HIGHEST_PROTOCOL)
                elif request_type == REQ_TYPE_JSON_ZLIB:
                    result = zlib.compress(
                        json.dumps(result, cls=ResultEncoder))
                elif request_type == REQ_TYPE_JSON:
                    result = json.dumps(result, cls=ResultEncoder)
                return [client_id, '', result]
            else:
                return [client_id, '', 'INVALID REQUEST']
    except zmq.ZMQError as zmqerror:
        if "Interrupted system call" not in str(zmqerror):
            logging.exception("Worker (%s): Received ZMQError", self.identity)
        else:
            logging.debug("Worker (%s): ZMQ interrupted by shutdown signal",
                          self.identity)
    return None
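For reference, here is a minimal sketch of the JSON request a client could submit to `perform_scan()`. The key names mirror the defaults filled in above; `build_json_request` itself is a hypothetical helper for illustration, not part of the framework.

```python
import base64
import json

def build_json_request(buffer_bytes, filename=""):
    # Field names mirror those perform_scan() expects; any omitted key is
    # defaulted server-side, so only 'buffer' strictly matters.
    return json.dumps({
        "buffer": base64.b64encode(buffer_bytes).decode("ascii"),
        "filename": filename,
        "contentType": [],
        "level": 2,
    })
```

The broker would wrap this payload as frame 5 of the six-frame multipart message described in the comments above.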
def _run(self, scanObject, result, depth, args):
    '''Laika framework module logic execution'''
    moduleResult = []
    file_limit = int(get_option(args, 'filelimit', 'rarfilelimit', 0))
    byte_limit = int(get_option(args, 'bytelimit', 'rarbytelimit', 0))
    password = get_option(args, 'password', 'rarpassword')
    attempt_decrypt = strtobool(
        get_option(args, 'attemptdecrypt', 'rarattemptdecrypt', 'false'))
    temp_dir = get_option(args, 'tempdir', 'tempdir', '/tmp/laikaboss_tmp')
    if not os.path.isdir(temp_dir):
        os.mkdir(temp_dir)
        os.chmod(temp_dir, 0777)

    # A temp file must be created as UnRAR2 does not accept buffers
    with tempfile.NamedTemporaryFile(dir=temp_dir) as temp_file:
        temp_file.write(scanObject.buffer)
        temp_file.flush()

        # RAR can be password protected, which encrypts the headers
        headers_are_encrypted = False
        # RAR can encrypt the files while leaving the headers decrypted
        files_are_encrypted = False
        rar = None
        # list of the file info objects
        infos = []
        try:
            logging.debug('%s: Attempting to open rar file', self.module_name)
            # If headers are encrypted, the following will raise
            # IncorrectRARPassword
            rar = UnRAR2.RarFile(temp_file.name)
            infos = rar.infolist()
            logging.debug('%s: Succeeded opening rar file', self.module_name)
            # If files are encrypted, the filename will be prefixed with a '*'
            for info in infos:
                if info.filename.startswith('*'):
                    logging.debug('%s: Rar files are encrypted', self.module_name)
                    scanObject.addFlag('ENCRYPTED_RAR')
                    scanObject.addMetadata(self.module_name, "Encrypted",
                                           "Protected Files")
                    files_are_encrypted = True
                    break
        except IncorrectRARPassword:
            logging.debug('%s: Rar headers are encrypted', self.module_name)
            scanObject.addFlag('ENCRYPTED_RAR')
            scanObject.addMetadata(self.module_name, "Encrypted",
                                   "Protected Header")
            headers_are_encrypted = True
        except InvalidRARArchive:
            # The original omitted the argument for the '%s' placeholder
            logging.debug('%s: Invalid Rar file', self.module_name)

        if (headers_are_encrypted or files_are_encrypted) and attempt_decrypt:
            logging.debug('%s: Attempting to decrypt', self.module_name)
            possible_passwords = []
            # Passwords are sometimes sent in the email content. Use the
            # content of the parent object as the list of possible passwords
            parent_object = getParentObject(result, scanObject)
            if parent_object:
                possible_passwords = _create_word_list(parent_object.buffer)
            if password:
                possible_passwords.insert(0, password)
            explode_temp_dir = os.path.join(temp_dir, 'exploderar')
            for possible_password in possible_passwords:
                try:
                    logging.debug("EXPLODE_RAR: Attempting password '%s'",
                                  possible_password)
                    rar = UnRAR2.RarFile(temp_file.name,
                                         password=possible_password)
                    # Extraction is needed to force the exception on
                    # encrypted files
                    if files_are_encrypted:
                        rar.extract(path=explode_temp_dir)
                    infos = rar.infolist()
                    logging.debug("EXPLODE_RAR: Found password '%s'",
                                  possible_password)
                    scanObject.addFlag('rar:decrypted')
                    scanObject.addMetadata(self.module_name, 'Password',
                                           possible_password)
                    break
                except IncorrectRARPassword:
                    continue
            if os.path.exists(explode_temp_dir):
                remove_dir(explode_temp_dir)

        scanObject.addMetadata(self.module_name, "Total_Files", len(infos))
        file_count = 0
        exceeded_byte_limit = False
        for info in infos:
            if byte_limit and info.size > byte_limit:
                logging.debug("EXPLODE_RAR: skipping file due to byte limit")
                exceeded_byte_limit = True
                continue
            try:
                content = rar.read_files(info.filename)[0][1]
                if byte_limit and len(content) > byte_limit:
                    logging.debug("EXPLODE_RAR: skipping file due to byte limit")
                    exceeded_byte_limit = True
                    continue
                moduleResult.append(
                    ModuleObject(
                        buffer=content,
                        externalVars=ExternalVars(filename=info.filename)))
            except IndexError:
                pass
            file_count += 1
            if file_limit and file_count >= file_limit:
                scanObject.addFlag("rar:err:LIMIT_EXCEEDED")
                logging.debug("EXPLODE_RAR: breaking due to file limit")
                break
        if exceeded_byte_limit:
            scanObject.addFlag("rar:err:BYTE_LIMIT_EXCEEDED")
    scanObject.addMetadata(self.module_name, "Unzipped", len(moduleResult))
    return moduleResult
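The `'*'`-prefix convention the loop above uses to spot encrypted members can be isolated into a tiny helper; `encrypted_entries` is a hypothetical name used only for illustration.

```python
def encrypted_entries(filenames):
    # UnRAR2 prefixes '*' to the names of password-protected members,
    # which is the same signal the module's detection loop keys on.
    return [name for name in filenames if name.startswith('*')]
```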
def _run(self, scanObject, result, depth, args):
    logging.debug("tactical: args: %s" % repr(args))
    moduleResult = []
    output = ''
    script_path = None
    timeout = "30"
    if 'timeout' in args:
        timeout = args['timeout']
    # Option to remove directory containing temp files
    unlinkDir = False
    if 'unlinkDir' in args:
        if args['unlinkDir'].upper() == 'TRUE':
            unlinkDir = True
    # only do something if script is defined in dispatcher--without an
    # external script this module does nothing
    if 'script' in args:
        script_path = args['script']
        #temp_file_h, temp_file_name = tempfile.mkstemp()
        with tempfile.NamedTemporaryFile(dir=self.TEMP_DIR) as temp_file:
            temp_file_name = temp_file.name
            temp_file.write(scanObject.buffer)
            temp_file.flush()
            # use timeout command in the command, if available on the system?
            output = self._collect(
                "timeout %s %s %s %s" % (timeout, script_path,
                                         temp_file_name, self.TEMP_DIR),
                shell=True)
        logging.debug(output)
        tmp_dirs = []
        # process the output lines
        for line in output.splitlines():
            line_type = line[:5]
            line_value = line[5:].strip()
            if line_type == "FLAG:":
                # not doing any validation on the flags, but truncating
                # on length
                scanObject.addFlag(line_value[:20])
            elif line_type == "META:":
                (meta_key, meta_sep, meta_value) = line_value.partition('=')
                scanObject.addMetadata(self.module_name, meta_key, meta_value)
            elif line_type == "FILE:":
                # Check to see if the file is actually a directory (silly 7zip)
                if os.path.isdir(line_value):
                    # If the file is a directory and we don't already know
                    # about it, add it to the list
                    if line_value not in tmp_dirs:
                        tmp_dirs.append(line_value)
                    # Skip this since it's a directory
                    continue
                # If we don't already know about this directory, add it
                # to the list
                if os.path.dirname(line_value) not in tmp_dirs:
                    file_path = os.path.dirname(line_value)
                    tmp_dirs.append(file_path)
                try:
                    with open(line_value, 'r') as result_file:
                        moduleResult.append(
                            ModuleObject(
                                buffer=result_file.read(),
                                externalVars=ExternalVars(
                                    filename=os.path.basename(line_value))))
                except:
                    raise
                finally:
                    # make sure the incoming file is deleted, or at least
                    # we try....
                    logging.debug("Trying to unlink file: %s" % (line_value))
                    os.unlink(line_value)
            else:
                pass
        if unlinkDir:
            logging.debug("Attempting to remove temp directories: %s"
                          % (tmp_dirs))
            # Loop through the directories and remove them, starting with
            # the deepest level (by length)
            for tmp_dir in sorted(tmp_dirs, key=len, reverse=True):
                try:
                    rmtree(tmp_dir)
                except (QuitScanException, GlobalScanTimeoutError,
                        GlobalModuleTimeoutError):
                    raise
                except:
                    log_module("MSG", self.module_name, 0, scanObject, result,
                               "Could not remove tmp dir %s" % (tmp_dir))
                    logging.exception("Unable to remove temp directory: %s"
                                      % (tmp_dir))
    return moduleResult
def _run(self, scanObject, result, depth, args):
    moduleResult = []
    try:
        # Create a temp file so isoparser has a file to analyze
        with tempfile.NamedTemporaryFile(dir=self.TEMP_DIR) as temp_file_input:
            temp_file_input_name = temp_file_input.name
            temp_file_input.write(scanObject.buffer)
            temp_file_input.flush()
            # Create an iso object
            iso = isoparser.parse(temp_file_input_name)
            # Loop through the iso and explode each child object, naming it
            # by its MD5 hash
            for child in iso.root.children:
                child_md5 = hashlib.md5(child.content).hexdigest()
                moduleResult.append(
                    ModuleObject(
                        buffer=child.content,
                        externalVars=ExternalVars(
                            filename='e_iso_%s' % child_md5)))
    except ScanError:
        raise
    return moduleResult
def main(laika_broker, redis_host, redis_port):
    # Register signal handlers
    signal.signal(signal.SIGINT, handler)
    signal.signal(signal.SIGTERM, handler)

    # Connect to Redis
    r = redis.StrictRedis(host=redis_host, port=redis_port)

    # Create Laika BOSS client object
    client = Client(laika_broker, async=True)

    while True:
        # pop next item off the queue
        q_item = r.blpop('suricata_queue', timeout=0)
        key = q_item[1]
        print("Popped object: %s" % key)

        # look up file buffer
        file_buffer = r.get("%s_buf" % key)
        # look up file meta
        file_meta = r.get("%s_meta" % key)

        if not file_buffer or not file_meta:
            print("File buffer or meta for key: %s not found. "
                  "Skipping this object." % key)
            delete_keys(r, key)
            continue

        try:
            file_meta_dict = json.loads(file_meta)
        except:
            print("JSON decode error for key: %s. Skipping this object." % key)
            delete_keys(r, key)
            continue

        # Extract the file name
        # Note: this is best effort - it will not always work
        filename = os.path.basename(
            file_meta_dict['http_request'].get('request', ""))
        filename = filename.split('?')[0]

        # Get the respective content type
        http_direction = file_meta_dict['http_direction']
        if http_direction == 'request':
            content_type = file_meta_dict['http_request'].get('Content-Type', [])
        elif http_direction == 'response':
            content_type = file_meta_dict['http_response'].get('Content-Type', [])
        else:
            content_type = []

        externalObject = ExternalObject(
            buffer=file_buffer,
            externalVars=ExternalVars(
                filename=filename,
                source="%s-%s" % ("suricata", "redis"),
                extMetaData=file_meta_dict,
                contentType=content_type),
            level=level_minimal)

        # send to Laika BOSS for async scanning - no response expected
        client.send(externalObject)
        print("Sent %s for scanning...\n" % key)

        # cleanup
        delete_keys(r, key)
def worker(nolimit, REQUEST_RETRIES, REQUEST_TIMEOUT, SAVE_PATH, SOURCE,
           return_level, hostname, USE_SSH, BROKER_HOST, SSH_HOST,
           ext_metadata, ephID):
    # Set up ZMQ context
    if USE_SSH:
        try:
            logging.debug("attempting to connect to broker at %s and "
                          "SSH host %s" % (BROKER_HOST, SSH_HOST))
            client = Client(BROKER_HOST, useSSH=True, sshHost=SSH_HOST)
        except RuntimeError as e:
            logging.exception("could not set up SSH tunnel to %s" % SSH_HOST)
            sys.exit(1)
    else:
        logging.debug("SSH has been disabled.")
        client = Client(BROKER_HOST)

    randNum = randint(1, 10000)
    for fname in iter(job_queue.get, 'STOP'):
        print "Worker %s: Starting new request" % randNum
        try:
            # Try to read the file
            file_buffer = open(fname, 'rb').read()
            file_len = len(file_buffer)
            logging.debug("opened file %s with len %i" % (fname, file_len))
            if file_len > 20971520 and not nolimit:
                print "You're trying to scan a file larger than 20mb.. Are you sure?"
                print "Use the --remove-limit flag if you really want to do this."
                print "File has not been scanned: %s" % fname
                result_queue.put(
                    "~~~~~~~~~~~~~~~~~~~~\nFile has not been scanned due to size: %s\n~~~~~~~~~~~~~~~~~~~~" % fname)
                continue
        except IOError as e:
            print "\nERROR: The file does not exist: %s\n" % (fname,)
            print "Moving to next file..."
            result_queue.put(
                "~~~~~~~~~~~~~~~~~~~~\nFile has not been scanned due to an IO Error: %s\n~~~~~~~~~~~~~~~~~~~~" % fname)
            continue
        try:
            # Construct the object to be sent for scanning
            externalObject = ExternalObject(
                buffer=file_buffer,
                externalVars=ExternalVars(
                    filename=fname,
                    ephID=ephID,
                    extMetaData=ext_metadata,
                    source="%s-%s-%s" % (SOURCE, hostname, getpass.getuser())),
                level=return_level)
            starttime = time.time()
            result = client.send(externalObject, retry=REQUEST_RETRIES,
                                 timeout=REQUEST_TIMEOUT)
            if not result:
                result_queue.put(
                    "~~~~~~~~~~~~~~~~~~~~\nFile timed out in the scanner: %s\n~~~~~~~~~~~~~~~~~~~~" % fname)
                continue
            logging.debug("got reply in %s seconds"
                          % str(time.time() - starttime))
            rootObject = getRootObject(result)
            jsonResult = getJSON(result)
            resultText = '%s\n' % jsonResult
            if return_level == level_full:
                FILE_SAVE_PATH = "%s/%s" % (SAVE_PATH,
                                            get_scanObjectUID(rootObject))
                if not os.path.exists(FILE_SAVE_PATH):
                    try:
                        os.makedirs(FILE_SAVE_PATH)
                        print "Writing results to %s..." % FILE_SAVE_PATH
                    except (OSError, IOError) as e:
                        print "\nERROR: unable to write to %s...\n" % FILE_SAVE_PATH
                        return
                else:
                    print "\nOutput folder already exists! Skipping results output...\n"
                    return
                for uid, scanObject in result.files.iteritems():
                    f = open("%s/%s" % (FILE_SAVE_PATH, uid), "wb")
                    f.write(scanObject.buffer)
                    f.close()
                    if scanObject.filename and scanObject.depth != 0:
                        linkPath = "%s/%s" % (FILE_SAVE_PATH,
                                              scanObject.filename.replace("/", "_"))
                        if not os.path.lexists(linkPath):
                            os.symlink("%s" % (uid), linkPath)
                    elif scanObject.filename:
                        filenameParts = scanObject.filename.split("/")
                        linkPath = "%s/%s" % (FILE_SAVE_PATH, filenameParts[-1])
                        if not os.path.lexists(linkPath):
                            os.symlink("%s" % (uid), linkPath)
                f = open("%s/%s" % (FILE_SAVE_PATH, "results.json"), "wb")
                f.write(jsonResult)
                f.close()
            result_queue.put(resultText)
        except:
            #logging.exception("error occured collecting results")
            result_queue.put(
                "~~~~~~~~~~~~~~~~~~~~\nUNKNOWN ERROR OCCURRED: %s\n~~~~~~~~~~~~~~~~~~~~" % fname)
            continue
def run(self):
    global CONFIG_PATH
    config.init(path=CONFIG_PATH)
    init_logging()
    ret_value = 0
    # Loop and accept messages from both channels, acting accordingly
    while True:
        next_task = self.task_queue.get()
        if next_task is None:
            # Poison pill means shutdown
            self.task_queue.task_done()
            logging.debug("%s Got poison pill" % (os.getpid()))
            break
        try:
            with open(next_task) as nextfile:
                file_buffer = nextfile.read()
        except IOError:
            logging.debug("Error opening: %s" % (next_task))
            self.task_queue.task_done()
            # Return an empty compressed result so the consumer does not
            # block (the original referenced an undefined name here)
            self.result_queue.put(zlib.compress(""))
            continue
        resultJSON = ""
        try:
            # perform the work
            result = ScanResult()
            result.source = SOURCE
            result.startTime = time.time()
            result.level = level_metadata
            myexternalVars = ExternalVars(filename=next_task,
                                          source=SOURCE,
                                          ephID=EPHID,
                                          extMetaData=EXT_METADATA)
            Dispatch(file_buffer, result, 0,
                     externalVars=myexternalVars, extScanModules=SCAN_MODULES)
            resultJSON = getJSON(result)
            if SAVE_PATH:
                rootObject = getRootObject(result)
                UID_SAVE_PATH = "%s/%s" % (SAVE_PATH,
                                           get_scanObjectUID(rootObject))
                if not os.path.exists(UID_SAVE_PATH):
                    try:
                        os.makedirs(UID_SAVE_PATH)
                    except (OSError, IOError) as e:
                        error("\nERROR: unable to write to %s...\n"
                              % (UID_SAVE_PATH))
                        raise
                for uid, scanObject in result.files.iteritems():
                    with open("%s/%s" % (UID_SAVE_PATH, uid), "wb") as f:
                        f.write(scanObject.buffer)
                    if scanObject.filename and scanObject.depth != 0:
                        linkPath = "%s/%s" % (UID_SAVE_PATH,
                                              scanObject.filename.replace("/", "_"))
                        if not os.path.lexists(linkPath):
                            os.symlink("%s" % (uid), linkPath)
                    elif scanObject.filename:
                        filenameParts = scanObject.filename.split("/")
                        os.symlink("%s" % (uid),
                                   "%s/%s" % (UID_SAVE_PATH, filenameParts[-1]))
                with open("%s/%s" % (UID_SAVE_PATH, "result.json"), "wb") as f:
                    f.write(resultJSON)
            if LOG_RESULT:
                log_result(result)
        except:
            logging.exception("Scan worker died, shutting down")
            ret_value = 1
            break
        finally:
            self.task_queue.task_done()
            self.result_queue.put(zlib.compress(resultJSON))
    close_modules()
    return ret_value
def _run(self, scanObject, result, depth, args):
    moduleResult = []
    imports = {}
    sections = {}
    exports = []
    try:
        pe = pefile.PE(data=scanObject.buffer)
        dump_dict = pe.dump_dict()

        # Parse sections
        for section in dump_dict.get('PE Sections', []):
            secName = section.get('Name', {}).get('Value', '').strip('\0')
            ptr = section.get('PointerToRawData', {}).get('Value')
            virtAddress = section.get('VirtualAddress', {}).get('Value')
            virtSize = section.get('Misc_VirtualSize', {}).get('Value')
            size = section.get('SizeOfRawData', {}).get('Value')
            secData = pe.get_data(ptr, size)
            secInfo = {
                'Virtual Address': '0x%08X' % virtAddress,
                'Virtual Size': virtSize,
                'Raw Size': size,
                'MD5': section.get('MD5', ''),
                'SHA1': section.get('SHA1', ''),
                'SHA256': section.get('SHA256', ''),
                'Entropy': section.get('Entropy', ''),
                'Section Characteristics': section.get('Flags', []),
                'Structure': section.get('Structure', ''),
            }
            if secInfo['MD5'] != scanObject.objectHash:
                moduleResult.append(ModuleObject(
                    buffer=secData,
                    externalVars=ExternalVars(filename=secName)))
            sections[secName] = secInfo
        sections['Total'] = pe.FILE_HEADER.NumberOfSections
        scanObject.addMetadata(self.module_name, 'Sections', sections)

        # Parse exports
        try:
            for exp in pe.DIRECTORY_ENTRY_EXPORT.symbols:
                exports.append(exp.name)
            scanObject.addMetadata(self.module_name, 'Exports', exports)
        except ScanError:
            raise
        except:
            logging.debug('No export entries')

        # Parse imports; they can be identified by ordinal or name
        for imp_symbol in dump_dict.get('Imported symbols', []):
            for imp in imp_symbol:
                if imp.get('DLL'):
                    dll = imp.get('DLL')
                    imports.setdefault(dll, [])
                    if imp.get('Ordinal'):
                        imports[dll].append(imp.get('Ordinal'))
                    if imp.get('Name'):
                        imports[dll].append(imp.get('Name'))
        scanObject.addMetadata(self.module_name, 'Imports', imports)

        # Parse resources
        try:
            for resource in pe.DIRECTORY_ENTRY_RESOURCE.entries:
                res_type = pefile.RESOURCE_TYPE.get(resource.id, 'Unknown')
                for entry in resource.directory.entries:
                    for e_entry in entry.directory.entries:
                        sublang = pefile.get_sublang_name_for_lang(
                            e_entry.data.lang, e_entry.data.sublang)
                        offset = e_entry.data.struct.OffsetToData
                        size = e_entry.data.struct.Size
                        r_data = pe.get_data(offset, size)
                        language = pefile.LANG.get(e_entry.data.lang, 'Unknown')
                        data = {
                            'Type': res_type,
                            'Id': e_entry.id,
                            'Name': e_entry.data.struct.name,
                            'Offset': offset,
                            'Size': size,
                            'SHA256': hashlib.sha256(r_data).hexdigest(),
                            'SHA1': hashlib.sha1(r_data).hexdigest(),
                            'MD5': hashlib.md5(r_data).hexdigest(),
                            'Language': language,
                            'Sub Language': sublang,
                        }
                        scanObject.addMetadata(self.module_name,
                                               'Resources', data)
        except ScanError:
            raise
        except:
            logging.debug('No resources')

        # Gather miscellaneous stuff
        try:
            scanObject.addMetadata(self.module_name, 'Imphash',
                                   pe.get_imphash())
        except ScanError:
            raise
        except:
            logging.debug('Unable to identify imphash')

        imgChars = dump_dict.get('Flags', [])
        scanObject.addMetadata(self.module_name, 'Image Characteristics',
                               imgChars)

        # Make a pretty date format
        date = datetime.fromtimestamp(pe.FILE_HEADER.TimeDateStamp)
        isoDate = date.isoformat()
        scanObject.addMetadata(self.module_name, 'Date', isoDate)
        scanObject.addMetadata(self.module_name, 'Timestamp',
                               pe.FILE_HEADER.TimeDateStamp)

        machine = pe.FILE_HEADER.Machine
        machineData = {
            'Id': machine,
            'Type': pefile.MACHINE_TYPE.get(machine)
        }
        scanObject.addMetadata(self.module_name, 'Machine Type', machineData)

        # Reference: http://msdn.microsoft.com/en-us/library/windows/desktop/ms680339%28v=vs.85%29.aspx
        scanObject.addMetadata(
            self.module_name, 'Image Magic',
            IMAGE_MAGIC_LOOKUP.get(pe.OPTIONAL_HEADER.Magic, 'Unknown'))

        dllChars = dump_dict.get('DllCharacteristics', [])
        scanObject.addMetadata(self.module_name, 'DLL Characteristics',
                               dllChars)

        subsystem = pe.OPTIONAL_HEADER.Subsystem
        subName = pefile.SUBSYSTEM_TYPE.get(subsystem)
        scanObject.addMetadata(self.module_name, 'Subsystem', subName)

        # Reference: http://msdn.microsoft.com/en-us/library/windows/desktop/ms648009%28v=vs.85%29.aspx
        scanObject.addMetadata(self.module_name, 'Stack Reserve Size',
                               pe.OPTIONAL_HEADER.SizeOfStackReserve)
        scanObject.addMetadata(self.module_name, 'Stack Commit Size',
                               pe.OPTIONAL_HEADER.SizeOfStackCommit)
        scanObject.addMetadata(self.module_name, 'Heap Reserve Size',
                               pe.OPTIONAL_HEADER.SizeOfHeapReserve)
        scanObject.addMetadata(self.module_name, 'Heap Commit Size',
                               pe.OPTIONAL_HEADER.SizeOfHeapCommit)
        scanObject.addMetadata(self.module_name, 'EntryPoint',
                               hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint))
        scanObject.addMetadata(self.module_name, 'ImageBase',
                               hex(pe.OPTIONAL_HEADER.ImageBase))

        # Parse RSDS/NB10 debug info & Rich header
        scanObject.addMetadata(self.module_name, 'Rich Header',
                               self.parseRich(pe))
        if hasattr(pe, 'DIRECTORY_ENTRY_DEBUG'):
            debug = dict()
            for e in pe.DIRECTORY_ENTRY_DEBUG:
                rawData = pe.get_data(e.struct.AddressOfRawData,
                                      e.struct.SizeOfData)
                if rawData.find('RSDS') != -1 and len(rawData) > 24:
                    pdb = rawData[rawData.find('RSDS'):]
                    debug["guid"] = "%s-%s-%s-%s" % (
                        binascii.hexlify(pdb[4:8]),
                        binascii.hexlify(pdb[8:10]),
                        binascii.hexlify(pdb[10:12]),
                        binascii.hexlify(pdb[12:20]))
                    debug["age"] = struct.unpack('<L', pdb[20:24])[0]
                    debug["pdb"] = pdb[24:].rstrip('\x00')
                    scanObject.addMetadata(self.module_name, 'RSDS', debug)
                elif rawData.find('NB10') != -1 and len(rawData) > 16:
                    pdb = rawData[rawData.find('NB10') + 8:]
                    debug["created"] = datetime.fromtimestamp(
                        struct.unpack('<L', pdb[0:4])[0]).isoformat()
                    debug["age"] = struct.unpack('<L', pdb[4:8])[0]
                    debug["pdb"] = pdb[8:].rstrip('\x00')
                    scanObject.addMetadata(self.module_name, 'NB10', debug)
    except pefile.PEFormatError:
        logging.debug("Invalid PE format")
    return moduleResult
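The RSDS slicing above can be exercised standalone on a synthetic CodeView record. `parse_rsds` is a hypothetical helper written for illustration; it applies the same offsets as the module (4-byte signature, 16-byte GUID, 4-byte age, NUL-terminated PDB path) but works on bytes so it runs under Python 3 as well.

```python
import binascii
import struct

def parse_rsds(raw):
    # Same layout the module walks: 'RSDS' signature, 16-byte GUID,
    # 4-byte little-endian age, then a NUL-terminated PDB path.
    i = raw.find(b'RSDS')
    if i < 0 or len(raw) - i <= 24:
        return None
    pdb = raw[i:]
    hx = lambda b: binascii.hexlify(b).decode('ascii')
    return {
        'guid': '-'.join([hx(pdb[4:8]), hx(pdb[8:10]),
                          hx(pdb[10:12]), hx(pdb[12:20])]),
        'age': struct.unpack('<L', pdb[20:24])[0],
        'pdb': pdb[24:].rstrip(b'\x00').decode('latin-1'),
    }
```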