def scanviz(self, id, gexf="0"):
    """Export entities from scan results for visualising.

    Args:
        id (str): scan ID
        gexf (str): "0" returns JSON graph data; any other value returns GEXF

    Returns:
        str: graph data as JSON or GEXF, or None if no scan was found
    """
    if not id:
        return None

    dbh = SpiderFootDb(self.config)
    data = dbh.scanResultEvent(id, filterFp=True)
    scan = dbh.scanInstanceGet(id)

    if not scan:
        return None

    root = scan[1]

    if gexf == "0":
        return SpiderFootHelpers.buildGraphJson([root], data)

    cherrypy.response.headers['Content-Disposition'] = "attachment; filename=SpiderFoot.gexf"
    cherrypy.response.headers['Content-Type'] = "application/gexf"
    cherrypy.response.headers['Pragma'] = "no-cache"
    return SpiderFootHelpers.buildGraphGexf([root], "SpiderFoot Export", data)
def scanelementtypediscovery(self, id, eventType):
    """Scan element type discovery.

    Args:
        id (str): scan ID
        eventType (str): filter by event type

    Returns:
        str: JSON
    """
    dbh = SpiderFootDb(self.config)
    pc = dict()
    datamap = dict()
    retdata = dict()

    # Get the events we will be tracing back from
    try:
        leafSet = dbh.scanResultEvent(id, eventType)
        [datamap, pc] = dbh.scanElementSourcesAll(id, leafSet)
    except Exception:
        return retdata

    # Delete the ROOT key as it adds no value from a viz perspective
    del pc['ROOT']
    retdata['tree'] = SpiderFootHelpers.dataParentChildToTree(pc)
    retdata['data'] = datamap

    return retdata
def scansummary(self, id, by):
    """Summary of scan results.

    Args:
        id (str): scan ID
        by: TBD

    Returns:
        str: scan summary as JSON
    """
    retdata = []

    dbh = SpiderFootDb(self.config)

    try:
        scandata = dbh.scanResultSummary(id, by)
    except Exception:
        return retdata

    try:
        statusdata = dbh.scanInstanceGet(id)
    except Exception:
        return retdata

    for row in scandata:
        if row[0] == "ROOT":
            continue
        lastseen = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(row[2]))
        retdata.append([row[0], row[1], lastseen, row[3], row[4], statusdata[5]])

    return retdata
def scanlist(self):
    """Produce a list of scans.

    Returns:
        str: scan list as JSON
    """
    dbh = SpiderFootDb(self.config)
    data = dbh.scanInstanceList()
    retdata = []

    for row in data:
        created = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(row[3]))

        if row[4] == 0:
            started = "Not yet"
        else:
            started = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(row[4]))

        if row[5] == 0:
            finished = "Not yet"
        else:
            finished = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(row[5]))

        retdata.append([row[0], row[1], row[2], created, started, finished, row[6], row[7]])

    return retdata
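# Illustrative sketch, not part of the original code: the timestamp
# formatting repeated in scanlist() could be factored into one helper.
# The names TIME_FORMAT and format_epoch are hypothetical, and the
# "Not yet" placeholder for a zero timestamp mirrors scanlist() above.

```python
import time

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def format_epoch(epoch_seconds, placeholder="Not yet"):
    """Format epoch seconds as local time, or return a placeholder for 0."""
    if epoch_seconds == 0:
        # A zero timestamp means the scan has not reached this stage.
        return placeholder
    return time.strftime(TIME_FORMAT, time.localtime(epoch_seconds))
```

# With such a helper, the started/finished branches in scanlist() would
# each reduce to a single call, e.g. started = format_epoch(row[4]).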
def scanvizmulti(self, ids, gexf="1"):
    """Export entities results from multiple scans in GEXF format.

    Args:
        ids (str): comma separated list of scan IDs
        gexf (str): "0" requests JSON output (not implemented, returns None); any other value returns GEXF

    Returns:
        str: GEXF data
    """
    dbh = SpiderFootDb(self.config)
    data = list()
    roots = list()

    if not ids:
        return None

    for id in ids.split(','):
        data = data + dbh.scanResultEvent(id, filterFp=True)
        scan = dbh.scanInstanceGet(id)
        if scan:
            roots.append(scan[1])

    if gexf == "0":
        # Not implemented yet
        return None

    cherrypy.response.headers['Content-Disposition'] = "attachment; filename=SpiderFoot.gexf"
    cherrypy.response.headers['Content-Type'] = "application/gexf"
    cherrypy.response.headers['Pragma'] = "no-cache"
    return SpiderFootHelpers.buildGraphGexf(roots, "SpiderFoot Export", data)
def scaneventresultexport(self, id, type, dialect="excel"):
    """Get scan event result data in CSV format.

    Args:
        id (str): scan ID
        type (str): filter by event type
        dialect (str): CSV dialect name passed to csv.writer

    Returns:
        bytes: results in CSV format, UTF-8 encoded
    """
    dbh = SpiderFootDb(self.config)
    data = dbh.scanResultEvent(id, type)
    fileobj = StringIO()
    parser = csv.writer(fileobj, dialect=dialect)
    parser.writerow(["Updated", "Type", "Module", "Source", "F/P", "Data"])

    for row in data:
        if row[4] == "ROOT":
            continue
        lastseen = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(row[0]))
        datafield = str(row[1]).replace("<SFURL>", "").replace("</SFURL>", "")
        parser.writerow([lastseen, str(row[4]), str(row[3]), str(row[2]), row[13], datafield])

    cherrypy.response.headers['Content-Disposition'] = "attachment; filename=SpiderFoot.csv"
    cherrypy.response.headers['Content-Type'] = "application/csv"
    cherrypy.response.headers['Pragma'] = "no-cache"
    return fileobj.getvalue().encode('utf-8')
def stopscan(self, id):
    """Stop a scan.

    Args:
        id (str): comma separated list of scan IDs

    Returns:
        str: JSON response
    """
    if not id:
        return self.jsonify_error('404', "No scan specified")

    dbh = SpiderFootDb(self.config)
    ids = id.split(',')

    # Validate every scan first; report the specific scan ID that failed
    for scan_id in ids:
        res = dbh.scanInstanceGet(scan_id)
        if not res:
            return self.jsonify_error('404', f"Scan {scan_id} does not exist")

        scan_status = res[5]

        if scan_status == "FINISHED":
            return self.jsonify_error('400', f"Scan {scan_id} has already finished.")

        if scan_status == "ABORTED":
            return self.jsonify_error('400', f"Scan {scan_id} has already aborted.")

        if scan_status != "RUNNING":
            return self.jsonify_error('400', f"The running scan is currently in the state '{scan_status}', please try again later or restart SpiderFoot.")

    for scan_id in ids:
        dbh.scanInstanceSet(scan_id, status="ABORT-REQUESTED")

    return b""
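# Illustrative sketch, not part of the original code: stopscan() checks
# every scan ID before it changes any of them, so a bad ID in the list
# leaves all scans untouched. The function below shows that two-pass
# pattern in isolation. The name request_abort and the plain dict that
# stands in for the database are assumptions made for this example.

```python
def request_abort(scans, id_param):
    """Request an abort for each scan ID in a comma separated string.

    scans is a dict mapping scan ID to status (a stand-in for the DB).
    Returns None on success, or a (code, message) tuple on failure.
    """
    if not id_param:
        return ("404", "No scan specified")

    ids = id_param.split(",")

    # First pass: validate every ID before changing anything.
    for scan_id in ids:
        status = scans.get(scan_id)
        if status is None:
            return ("404", f"Scan {scan_id} does not exist")
        if status != "RUNNING":
            return ("400", f"Scan {scan_id} is in state '{status}'")

    # Second pass: all IDs are valid, so apply the change to each one.
    for scan_id in ids:
        scans[scan_id] = "ABORT-REQUESTED"

    return None
```

# Because validation finishes before the first write, a request such as
# "a,b" where only "a" is running changes neither scan.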
def scaneventresultexportmulti(self, ids, dialect="excel"):
    """Get scan event result data in CSV format for multiple scans.

    Args:
        ids (str): comma separated list of scan IDs
        dialect (str): CSV dialect name passed to csv.writer

    Returns:
        bytes: results in CSV format, UTF-8 encoded
    """
    dbh = SpiderFootDb(self.config)
    scaninfo = dict()
    data = list()

    for id in ids.split(','):
        scaninfo[id] = dbh.scanInstanceGet(id)
        data = data + dbh.scanResultEvent(id)

    fileobj = StringIO()
    parser = csv.writer(fileobj, dialect=dialect)
    parser.writerow(["Scan Name", "Updated", "Type", "Module", "Source", "F/P", "Data"])

    for row in data:
        if row[4] == "ROOT":
            continue
        lastseen = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(row[0]))
        datafield = str(row[1]).replace("<SFURL>", "").replace("</SFURL>", "")
        parser.writerow([scaninfo[row[12]][0], lastseen, str(row[4]), str(row[3]), str(row[2]), row[13], datafield])

    cherrypy.response.headers['Content-Disposition'] = "attachment; filename=SpiderFoot.csv"
    cherrypy.response.headers['Content-Type'] = "application/csv"
    cherrypy.response.headers['Pragma'] = "no-cache"
    return fileobj.getvalue().encode('utf-8')
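# Illustrative sketch, not part of the original code: the CSV export
# functions write rows into an in-memory buffer with csv.writer, skip
# ROOT events, and strip <SFURL> markers from the data field. The
# function below reproduces that flow without a database or CherryPy.
# The name rows_to_csv and the six-field row layout are assumptions for
# this example; the real rows are indexed differently.

```python
import csv
import time
from io import StringIO


def rows_to_csv(rows, dialect="excel"):
    """Serialise event rows to UTF-8 encoded CSV.

    Each row is (updated_epoch, data, source, module, event_type, fp).
    """
    fileobj = StringIO()
    writer = csv.writer(fileobj, dialect=dialect)
    writer.writerow(["Updated", "Type", "Module", "Source", "F/P", "Data"])

    for updated, data, source, module, event_type, fp in rows:
        # ROOT events carry no useful data for an export.
        if event_type == "ROOT":
            continue
        lastseen = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(updated))
        # Remove the URL markers so only the bare value is exported.
        datafield = str(data).replace("<SFURL>", "").replace("</SFURL>", "")
        writer.writerow([lastseen, event_type, module, source, fp, datafield])

    return fileobj.getvalue().encode("utf-8")
```

# The dialect argument accepts any name registered with the csv module,
# such as "excel" or "excel-tab".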
def scanelementtypediscovery(self, id, eventType):
    """scan element type discovery

    Args:
        id: TBD
        eventType (str): filter by event type
    """
    cherrypy.response.headers['Content-Type'] = "application/json; charset=utf-8"

    sf = SpiderFoot(self.config)
    dbh = SpiderFootDb(self.config)
    pc = dict()
    datamap = dict()

    # Get the events we will be tracing back from
    leafSet = dbh.scanResultEvent(id, eventType)
    [datamap, pc] = dbh.scanElementSourcesAll(id, leafSet)

    # Delete the ROOT key as it adds no value from a viz perspective
    del pc['ROOT']
    retdata = dict()
    retdata['tree'] = sf.dataParentChildToTree(pc)
    retdata['data'] = datamap

    return json.dumps(retdata).encode('utf-8')
def scanlog(self, id, limit=None, rowId=None, reverse=None):
    """Scan log data

    Args:
        id: TBD
        limit: TBD
        rowId: TBD
        reverse: TBD

    Returns:
        str: JSON
    """
    dbh = SpiderFootDb(self.config)
    retdata = []

    try:
        data = dbh.scanLogs(id, limit, rowId, reverse)
    except Exception:
        return retdata

    for row in data:
        generated = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(row[0] / 1000))
        retdata.append([generated, row[1], row[2], html.escape(row[3]), row[4]])

    return retdata
def scansummary(self, id, by):
    """Summary of scan results

    Args:
        id (str): scan ID
        by: TBD
    """
    cherrypy.response.headers['Content-Type'] = "application/json; charset=utf-8"

    retdata = []

    dbh = SpiderFootDb(self.config)

    try:
        scandata = dbh.scanResultSummary(id, by)
    except Exception:
        return json.dumps(retdata).encode('utf-8')

    try:
        statusdata = dbh.scanInstanceGet(id)
    except Exception:
        return json.dumps(retdata).encode('utf-8')

    for row in scandata:
        if row[0] == "ROOT":
            continue
        lastseen = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(row[2]))
        retdata.append([row[0], row[1], lastseen, row[3], row[4], statusdata[5]])

    return json.dumps(retdata).encode('utf-8')
def scaneventresults(self, id, eventType, filterfp=False):
    """Event results for a scan

    Args:
        id (str): scan ID
        eventType (str): filter by event type
        filterfp: TBD
    """
    cherrypy.response.headers['Content-Type'] = "application/json; charset=utf-8"

    retdata = []

    dbh = SpiderFootDb(self.config)

    try:
        data = dbh.scanResultEvent(id, eventType, filterfp)
    except Exception:
        return json.dumps(retdata).encode('utf-8')

    for row in data:
        lastseen = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(row[0]))
        escapeddata = html.escape(row[1])
        escapedsrc = html.escape(row[2])
        retdata.append([lastseen, escapeddata, escapedsrc, row[3], row[5], row[6], row[7], row[8], row[13], row[14], row[4]])

    return json.dumps(retdata).encode('utf-8')
def scanlog(self, id, limit=None, rowId=None, reverse=None):
    """Scan log data

    Args:
        id: TBD
        limit: TBD
        rowId: TBD
        reverse: TBD
    """
    cherrypy.response.headers['Content-Type'] = "application/json; charset=utf-8"

    dbh = SpiderFootDb(self.config)
    retdata = []

    try:
        data = dbh.scanLogs(id, limit, rowId, reverse)
    except Exception:
        return json.dumps(retdata).encode('utf-8')

    for row in data:
        generated = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(row[0] / 1000))
        retdata.append([generated, row[1], row[2], html.escape(row[3]), row[4]])

    return json.dumps(retdata).encode('utf-8')
def stopscanmulti(self, ids):
    """Stop a scan

    Args:
        ids (str): comma separated list of scan IDs

    Note:
        Unnecessary for now given that only one simultaneous scan is permitted
    """
    dbh = SpiderFootDb(self.config)
    error = list()

    for id in ids.split(","):
        scaninfo = dbh.scanInstanceGet(id)

        if not scaninfo:
            return self.error("Invalid scan ID: %s" % id)

        scanname = str(scaninfo[0])
        scanstatus = scaninfo[5]

        if scanstatus == "FINISHED":
            error.append("Scan '%s' is in a finished state. <a href='/scandelete?id=%s&confirm=1'>Maybe you want to delete it instead?</a>" % (scanname, id))
            continue

        if scanstatus == "ABORTED":
            error.append("Scan '" + scanname + "' is already aborted.")
            continue

        dbh.scanInstanceSet(id, status="ABORT-REQUESTED")

    raise cherrypy.HTTPRedirect("/")
def scandelete(self, id):
    """Delete scan(s)

    Args:
        id (str): comma separated list of scan IDs

    Returns:
        str: JSON response
    """
    if not id:
        return self.jsonify_error('404', "No scan specified")

    dbh = SpiderFootDb(self.config)
    ids = id.split(',')

    # Validate every scan first; report the specific scan ID that failed
    for scan_id in ids:
        res = dbh.scanInstanceGet(scan_id)
        if not res:
            return self.jsonify_error('404', f"Scan {scan_id} does not exist")

        if res[5] in ["RUNNING", "STARTING", "STARTED"]:
            return self.jsonify_error('400', f"Scan {scan_id} is {res[5]}. You cannot delete running scans.")

    for scan_id in ids:
        dbh.scanInstanceDelete(scan_id)

    return b""
def test_scanEventStore_argument_sfEvent_with_empty_risk_property_value_should_raise_ValueError(self):
    """
    Test scanEventStore(self, instanceId, sfEvent, truncateSize=0)
    """
    sfdb = SpiderFootDb(self.default_options, False)

    event_type = 'ROOT'
    event_data = 'example data'
    module = ''
    source_event = ''
    source_event = SpiderFootEvent(event_type, event_data, module, source_event)

    event_type = 'example event type'
    event_data = 'example event data'
    module = 'example module'
    event = SpiderFootEvent(event_type, event_data, module, source_event)

    instance_id = "example instance id"
    invalid_values = [-1, 101]
    for invalid_value in invalid_values:
        with self.subTest(invalid_value=invalid_value):
            with self.assertRaises(ValueError):
                event = SpiderFootEvent(event_type, event_data, module, source_event)
                event.risk = invalid_value
                sfdb.scanEventStore(instance_id, event)
def scandelete(self, id, confirm=None):
    """Delete a scan

    Args:
        id (str): scan ID
        confirm (str): specify any value (except None) to confirm deletion of the scan
    """
    dbh = SpiderFootDb(self.config)
    res = dbh.scanInstanceGet(id)

    # Default to an empty string so a missing Accept header cannot raise TypeError
    wants_json = bool(cherrypy.request.headers) and 'application/json' in cherrypy.request.headers.get('Accept', '')

    if res is None:
        if wants_json:
            cherrypy.response.headers['Content-Type'] = "application/json; charset=utf-8"
            return json.dumps(["ERROR", "Scan ID not found."]).encode('utf-8')

        return self.error("Scan ID not found.")

    if confirm:
        dbh.scanInstanceDelete(id)

        if wants_json:
            cherrypy.response.headers['Content-Type'] = "application/json; charset=utf-8"
            return json.dumps(["SUCCESS", ""]).encode('utf-8')

        raise cherrypy.HTTPRedirect("/")

    templ = Template(filename='dyn/scandelete.tmpl', lookup=self.lookup)
    return templ.render(id=id, name=str(res[0]), names=list(), ids=list(), pageid="SCANLIST", docroot=self.docroot)
def test_scanEventStore_argument_sfEvent_with_invalid_visibility_property_type_should_raise_TypeError(self):
    """
    Test scanEventStore(self, instanceId, sfEvent, truncateSize=0)
    """
    sfdb = SpiderFootDb(self.default_options, False)

    event_type = 'ROOT'
    event_data = 'example data'
    module = ''
    source_event = ''
    source_event = SpiderFootEvent(event_type, event_data, module, source_event)

    event_type = 'example event type'
    event_data = 'example event data'
    module = 'example module'
    event = SpiderFootEvent(event_type, event_data, module, source_event)

    instance_id = "example instance id"
    invalid_types = [None, list(), dict()]
    for invalid_type in invalid_types:
        with self.subTest(invalid_type=invalid_type):
            with self.assertRaises(TypeError):
                event = SpiderFootEvent(event_type, event_data, module, source_event)
                event.visibility = invalid_type
                sfdb.scanEventStore(instance_id, event)
def scandeletemulti(self, ids, confirm=None):
    """Delete multiple scans

    Args:
        ids (str): comma separated list of scan IDs
        confirm: TBD
    """
    dbh = SpiderFootDb(self.config)
    names = list()

    for id in ids.split(','):
        res = dbh.scanInstanceGet(id)

        # Check for a missing scan before indexing into the result
        if res is None:
            return self.error("Scan ID not found (" + id + ").")

        names.append(str(res[0]))

        if res[5] in ["RUNNING", "STARTING", "STARTED"]:
            return self.error("You cannot delete running scans.")

    if confirm:
        for id in ids.split(','):
            dbh.scanInstanceDelete(id)
        raise cherrypy.HTTPRedirect("/")

    templ = Template(filename='dyn/scandelete.tmpl', lookup=self.lookup)
    return templ.render(id=None, name=None, ids=ids.split(','), names=names, pageid="SCANLIST", docroot=self.docroot)
def test_create_should_create_database_schema(self):
    """
    Test create(self)
    """
    sfdb = SpiderFootDb(self.default_options, False)
    sfdb.create()
    self.assertEqual('TBD', 'TBD')
def test_configGet_should_return_a_dict(self):
    """
    Test configGet(self)
    """
    sfdb = SpiderFootDb(self.default_options, False)
    config = sfdb.configGet()
    self.assertIsInstance(config, dict)
def test_eventTypes_should_return_a_list(self):
    """
    Test eventTypes(self)
    """
    sfdb = SpiderFootDb(self.default_options, False)
    event_types = sfdb.eventTypes()
    self.assertIsInstance(event_types, list)
def test_scanInstanceGet_should_return_scan_info(self):
    """
    Test scanInstanceGet(self, instanceId)
    """
    sfdb = SpiderFootDb(self.default_options, False)

    instance_id = "example instance id"
    scan_name = "example scan name"
    scan_target = "example scan target"
    sfdb.scanInstanceCreate(instance_id, scan_name, scan_target)

    scan_instance_get = sfdb.scanInstanceGet(instance_id)

    self.assertEqual(len(scan_instance_get), 6)

    self.assertIsInstance(scan_instance_get[0], str)
    self.assertEqual(scan_instance_get[0], scan_name)

    self.assertIsInstance(scan_instance_get[1], str)
    self.assertEqual(scan_instance_get[1], scan_target)

    self.assertIsInstance(scan_instance_get[2], float)

    self.assertIsInstance(scan_instance_get[3], float)

    self.assertIsInstance(scan_instance_get[4], float)

    self.assertIsInstance(scan_instance_get[5], str)
    self.assertEqual(scan_instance_get[5], 'CREATED')
def threadWorker(self):
    try:
        # create new database handle since we're in our own thread
        from spiderfoot import SpiderFootDb
        self.setDbh(SpiderFootDb(self.opts))
        self.sf = copy(self.sf)
        self.sf._dbh = self.__sfdb__

        if not (self.incomingEventQueue and self.outgoingEventQueue):
            self.log.error("Please set up queues before starting module as thread")
            return

        while not self.checkForStop():
            try:
                sfEvent = self.incomingEventQueue.get_nowait()
                self.log.debug(f"{self.__name__}.threadWorker() got event, {sfEvent.eventType}, from incomingEventQueue.")
                self.running = True
                self.handleEvent(sfEvent)
                self.running = False
            except queue.Empty:
                sleep(.3)
                continue
    except KeyboardInterrupt:
        self.log.warning(f"Interrupted module {self.__name__}.")
        self._stopScanning = True
    except Exception as e:
        import traceback
        self.log.error(f"Exception ({e.__class__.__name__}) in module {self.__name__}." + traceback.format_exc())
        self.errorState = True
    finally:
        self.running = False
def test_notifyListeners_event_type_and_data_same_as_source_event_source_event_should_story_only(self):
    """
    Test notifyListeners(self, sfEvent)
    """
    sfp = SpiderFootPlugin()
    sfdb = SpiderFootDb(self.default_options, False)
    sfp.setDbh(sfdb)

    event_type = 'ROOT'
    event_data = 'test data'
    module = 'test module'
    source_event = None
    evt = SpiderFootEvent(event_type, event_data, module, source_event)

    event_type = 'test event type'
    event_data = 'test data'
    module = 'test module'
    source_event = evt
    evt = SpiderFootEvent(event_type, event_data, module, source_event)

    source_event = evt
    evt = SpiderFootEvent(event_type, event_data, module, source_event)

    source_event = evt
    evt = SpiderFootEvent(event_type, event_data, module, source_event)

    sfp.notifyListeners(evt)

    self.assertEqual('TBD', 'TBD')
def test_notifyListeners_output_filter_unmatched_should_not_notify_listener_modules(self):
    """
    Test notifyListeners(self, sfEvent)
    """
    sfp = SpiderFootPlugin()
    sfdb = SpiderFootDb(self.default_options, False)
    sfp.setDbh(sfdb)

    target = SpiderFootTarget("spiderfoot.net", "INTERNET_NAME")
    sfp.setTarget(target)

    event_type = 'ROOT'
    event_data = 'test data'
    module = 'test module'
    source_event = None
    evt = SpiderFootEvent(event_type, event_data, module, source_event)

    event_type = 'test event type'
    event_data = 'test data'
    module = 'test module'
    source_event = evt
    evt = SpiderFootEvent(event_type, event_data, module, source_event)

    sfp.__outputFilter__ = "example unmatched event type"

    sfp.notifyListeners(evt)

    self.assertEqual('TBD', 'TBD')
def test_run_correlations_invalid_scan_instance_should_raise_ValueError(self):
    sfdb = SpiderFootDb(self.default_options, False)
    correlator = SpiderFootCorrelator(sfdb, {}, 'example scan id')
    with self.assertRaises(ValueError):
        correlator.run_correlations()
def rerunscan(self, id):
    """Rerun a scan

    Args:
        id (str): scan ID

    Returns:
        None

    Raises:
        HTTPRedirect: redirect to info page for new scan
    """
    # Snapshot the current configuration to be used by the scan
    cfg = deepcopy(self.config)
    modlist = list()
    dbh = SpiderFootDb(cfg)
    info = dbh.scanInstanceGet(id)

    if not info:
        return self.error("Invalid scan ID.")

    scanname = info[0]
    scantarget = info[1]

    scanconfig = dbh.scanConfigGet(id)
    if not scanconfig:
        return self.error(f"Error loading config from scan: {id}")

    modlist = scanconfig['_modulesenabled'].split(',')
    if "sfp__stor_stdout" in modlist:
        modlist.remove("sfp__stor_stdout")

    targetType = SpiderFootHelpers.targetTypeFromString(scantarget)
    if not targetType:
        # It must then be a name, as a re-run scan should always have a clean
        # target. Put quotes around the target value and try to determine the
        # target type again.
        targetType = SpiderFootHelpers.targetTypeFromString(f'"{scantarget}"')

    if targetType not in ["HUMAN_NAME", "BITCOIN_ADDRESS"]:
        scantarget = scantarget.lower()

    # Start running a new scan
    scanId = SpiderFootHelpers.genScanInstanceId()
    try:
        p = mp.Process(target=SpiderFootScanner, args=(scanname, scanId, scantarget, targetType, modlist, cfg))
        p.daemon = True
        p.start()
    except Exception as e:
        self.log.error(f"[-] Scan [{scanId}] failed: {e}")
        return self.error(f"[-] Scan [{scanId}] failed: {e}")

    # Wait until the scan has initialized
    while dbh.scanInstanceGet(scanId) is None:
        self.log.info("Waiting for the scan to initialize...")
        time.sleep(1)

    raise cherrypy.HTTPRedirect(f"{self.docroot}/scaninfo?id={scanId}", status=302)
def __init__(self, web_config, config):
    """Initialize web server

    Args:
        web_config: config settings for web interface (interface, port, root path)
        config: SpiderFoot config

    Raises:
        TypeError: arg type is invalid
        ValueError: arg value is invalid
    """
    if not isinstance(config, dict):
        raise TypeError(f"config is {type(config)}; expected dict()")
    if not config:
        raise ValueError("config is empty")

    if not isinstance(web_config, dict):
        raise TypeError(f"web_config is {type(web_config)}; expected dict()")
    # Check web_config here (the original re-checked config by mistake)
    if not web_config:
        raise ValueError("web_config is empty")

    self.docroot = web_config.get('root', '/').rstrip('/')

    # 'config' supplied will be the defaults, let's supplement them
    # now with any configuration which may have previously been saved.
    self.defaultConfig = deepcopy(config)
    dbh = SpiderFootDb(self.defaultConfig)
    sf = SpiderFoot(self.defaultConfig)
    self.config = sf.configUnserialize(dbh.configGet(), self.defaultConfig)

    cherrypy.config.update({
        'error_page.401': self.error_page_401,
        'error_page.404': self.error_page_404,
        'request.error_response': self.error_page
    })

    csp = (
        secure.ContentSecurityPolicy()
        .default_src("'self'")
        .script_src("'self'", "'unsafe-inline'", "blob:")
        .style_src("'self'", "'unsafe-inline'")
        .base_uri("'self'")
        .connect_src("'self'", "data:")
        .frame_src("'self'", 'data:')
        .img_src("'self'", "data:")
    )

    secure_headers = secure.Secure(
        server=secure.Server().set("server"),
        cache=secure.CacheControl().must_revalidate(),
        csp=csp,
        referrer=secure.ReferrerPolicy().no_referrer(),
    )

    cherrypy.config.update({
        "tools.response_headers.on": True,
        "tools.response_headers.headers": secure_headers.framework.cherrypy()
    })
def test_scanErrors_should_return_a_list(self):
    """
    Test scanErrors(self, instanceId, limit=None)
    """
    sfdb = SpiderFootDb(self.default_options, False)

    instance_id = "example instance id"
    scan_instance = sfdb.scanErrors(instance_id, None)
    self.assertIsInstance(scan_instance, list)