def compute_sp(self):
    from queue import Queue

    queue = Queue()
    datalen = len(self.D['coords'])
    self(queue, 0, datalen, True, False)
    self(queue, 0, datalen, False, False)
    return queue.get() + queue.get()
def put(self, contents: QT, block=True, timeout=None):
    self.lock.acquire()
    while not self.empty():
        Queue.get(self, block=False)
    # NOTE/TODO: draining first is necessary because multiprocessing queues
    # are racy. In short: if you q.put_nowait() too quickly, it breaks.
    # For example, say you were in ipython and you typed the following:
    #   q = MonoQueue()
    #   q.put_nowait(2)
    #   q.put_nowait(3)
    #   q.put_nowait(4)
    #   q.put_nowait(5)
    # EVEN THOUGH there is a Lock() to atomize access to the queue, one of
    # the non-first put_nowait() calls acquires the lock, self.empty()
    # reports True even though something is actually in the queue, so the
    # call does not .get() it, tries to put something in, and raises Full.
    # So basically, if something tries to put into the queue too quickly,
    # everything breaks. And yes, I wrote a pytest for this: step through it
    # under a debugger and it works fine; run it normally and it breaks.
    # UGH, maybe I'm dumb and am doing something wrong.
    with suppress(Full):
        Queue.put(self, contents, block=block, timeout=timeout)
    self.lock.release()
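The drain-then-put idea above can be sketched in isolation. This is a minimal, illustrative stand-in built on the thread-safe `queue.Queue` (not the multiprocessing variant, whose `empty()` is unreliable as the comment describes); the class and method names are assumptions, not from the original codebase:

```python
import queue
import threading
from contextlib import suppress

class MonoQueue(queue.Queue):
    """Keeps at most the newest item: each put drains the queue first."""

    def __init__(self):
        super().__init__(maxsize=1)
        self._lock = threading.Lock()

    def put_mono(self, item):
        with self._lock:
            # Drain whatever is there, then insert the new item.
            with suppress(queue.Empty):
                while True:
                    self.get_nowait()
            with suppress(queue.Full):
                self.put_nowait(item)

q = MonoQueue()
for i in range(5):
    q.put_mono(i)
print(q.get_nowait())  # → 4, only the last value survives
```

With `queue.Queue` the `empty()`/`get_nowait()` view is consistent under the lock, which is exactly the guarantee the multiprocessing queue lacks.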
def process(self, input_queue: Queue, output_queue: Queue) -> None:
    database = FileDatabase(self.database_path)
    chunk = []
    file = input_queue.get()
    while not isinstance(file, TerminateOperand):
        correct_input, message = self.is_correct_input(file)
        if not correct_input:
            print(message)
            file = input_queue.get()  # fetch the next item; otherwise this loops forever
            continue
        if self.process_item(file):
            chunk.append(file)
        if len(chunk) >= self.chunk_size:
            self.process_chunk(database, chunk)
            for chunklet in chunk:
                output_queue.put(chunklet)
            chunk = []
        file = input_queue.get()
    input_queue.put(file)  # put the terminator back for any sibling workers
    self.process_chunk(database, chunk)
    for chunklet in chunk:
        output_queue.put(chunklet)
def empty_queue(queue_: Queue) -> None:
    while True:
        try:
            queue_.get(block=False)
        except queue.Empty:
            break
    queue_.close()
class RunningListenerTestCase(TestCase):
    def setUp(self):
        pg_connector = PostgresConnector(_POSTGRES_DSN)
        self.notif_queue = Queue(1)
        self.listener = PostgresNotificationListener(
            pg_connector, _NOTIF_CHANNEL, self.notif_queue,
            CommonErrorStrategy(), Queue(), fire_on_start=False
        )
        self.listener.log = MagicMock()

    def tearDown(self):
        self.listener.terminate()
        self.listener.join()

    def test_notification_listener(self):
        self.assertTrue(self.notif_queue.empty())

        # Start the listener process
        self.listener.start()
        sleep(1)
        send_notification()

        try:
            error = self.listener.error_queue.get(timeout=1)
            self.fail(error)
        except Empty:
            pass

        notif = self.notif_queue.get(timeout=2)
        self.assertIsNotNone(notif)

    def test_notification_stops_listening_on_terminate(self):
        self.assertTrue(self.notif_queue.empty())

        self.listener.start()
        sleep(3)
        self.listener.terminate()

        # Send a notification and ensure the listener is NOT listening
        send_notification()

        with self.assertRaises(Empty):
            self.notif_queue.get(timeout=5)

    def test_notification_listener_terminates(self):
        self.assertTrue(self.notif_queue.empty())

        self.listener.start()
        sleep(1)
        self.listener.terminate()
        sleep(1)

        self.assertFalse(self.listener.is_alive())
def launchSequentialProcess(worldRunner, observableQueue, updateQueue,
                            worldNumber, runNumber, useGUI):
    """ Run world in the same process since processes run sequentially. """
    forkQueue = Queue()
    worldRunner(manager, observableQueue, updateQueue, forkQueue,
                worldNumber, runNumber, useGUI)

    # Empty fork queue
    try:
        while True:
            forkQueue.get(False)
    except Empty:
        pass
def main():
    result_queue = Queue()
    crawler = CrawlerWorker(CanberraWealtherSpider(), result_queue)
    crawler.start()
    for item in result_queue.get():
        # print(datetime.datetime.now(), item)
        print(item)
def get_nowait(self) -> QT:
    self.lock.acquire()
    contents = None
    with suppress(Empty):
        contents = Queue.get(self, block=False)
    self.lock.release()
    return contents
def _calculate_rmse_mp(self, population, process_count):
    i = 0
    process_pop = dict()
    while i < len(population):
        for j in range(process_count):
            if str(j) not in process_pop.keys():
                process_pop[str(j)] = []
            if i < len(population):
                process_pop[str(j)].append(population[i])
                i += 1

    final_population = []
    queue = Queue()
    processes = []
    for i in range(process_count):
        pop = process_pop[str(i)]
        process = Process(target=self._calculate_rmse, name="%d" % i,
                          args=(pop, queue))
        process.start()
        processes.append(process)

    # Collect each worker's results before joining, so workers are never
    # blocked on a full pipe.
    for i in range(process_count):
        final_population += queue.get()

    for process in processes:
        process.join()

    return final_population
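The while/for distribution loop above deals items out round-robin across the worker buckets. The same split can be written more directly; this helper is a sketch (the function name is illustrative, not from the source):

```python
def split_round_robin(population, process_count):
    """Distribute items across process_count buckets, round-robin."""
    buckets = [[] for _ in range(process_count)]
    for i, item in enumerate(population):
        buckets[i % process_count].append(item)
    return buckets

print(split_round_robin(list(range(7)), 3))  # → [[0, 3, 6], [1, 4], [2, 5]]
```

A list of lists also avoids the string-keyed dict (`process_pop[str(j)]`) used above.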
def test_spy():
    """Test the measure spy working."""
    q = Queue()
    data = TaskDatabase()
    spy = MeasureSpy(queue=q, observed_database=data,
                     observed_entries=('test',))

    data.notifier(('test', 1))
    assert q.get()

    data.notifier(('test2', 1))
    assert q.empty()

    spy.close()
    assert q.get() == ('', '')
class UDPServer(Process):
    def __init__(self, queue):
        Process.__init__(self, name="UDPServer")
        # self.daemon = True
        self.queue = queue
        self.shutdownQueue = Queue()
        self.start()

    def __checkShutdown(self):
        try:
            self.shutdownQueue.get(block=False)
        except Empty:
            return self.reactor.callLater(1, self.__checkShutdown)
        self.shutdownQueue.close()
        if self.reactor.running:
            self.reactor.stop()
        self.queue.close()
def test_multiprocess_tasks():
    wait_until_convenient()
    TAG = "message_q"

    def fetch_task(queue):
        pid = os.getpid()
        count = 0
        for dq in q.listen(TAG, timeout=1):
            s = {'pid': pid, 'data': dq}
            if dq:
                count += 1
                queue.put(s)
                sleep(uniform(0.1, 0.5))  # sleep 0.1~0.5 seconds randomly
            elif q.count(TAG) == 0:
                return count  # the number of tasks done by this process

    test_items = range(0, 10000)  # enqueue 10000 tasks
    for i in test_items:
        q.enqueue(TAG, i + 1)

    while q.count(TAG) != len(test_items):  # wait until test data is ready
        wait_until_convenient()

    jobs = []
    wait_until_convenient()
    queue = Queue()
    start = timer()
    num_p = 30  # the number of processes to use
    for i in range(0, num_p):
        job = Process(target=fetch_task, args=(queue,))
        jobs.append(job)
        job.start()  # start task process

    remaining = q.count(TAG)
    while remaining > 0:  # wait until the queue is consumed completely
        remaining = q.count(TAG)
        sys.stdout.write('\rRunning test_multiprocess_tasks - remaining %5d/%5d'
                         % (remaining, len(test_items)))
        sys.stdout.flush()
        wait_until_convenient()

    processed_data = set()
    qsize = 0
    while not queue.empty():
        item = queue.get()
        data = item.get('data')
        qsize += 1
        assert data not in processed_data, \
            "failed test_multiprocess_tasks - data %s has been processed already" % (data,)
        processed_data.add(item.get('data'))

    queue.close()
    queue.join_thread()
    for j in jobs:
        j.join()

    assert qsize == len(test_items), \
        "failed test_multiprocess_tasks - tasks are not complete %d/%d" % (qsize, len(test_items))
    end = timer()
    print("\rOK test_multiprocess_tasks - %d done in %5d seconds" % (qsize, end - start))
def start(self, url):
    # raise BadFormatError
    items = []
    # The part below can be called as often as you want
    results = Queue()
    crawler = CrawlerWorker(LinkedinSpider(url), results)
    crawler.start()
    for item in results.get():
        items.append(dict(item))
    return items
def get(self, block=True, timeout=None):
    ret = Queue.get(self, block, timeout)
    if self.qsize() == 0:
        self.cond_empty.acquire()
        try:
            self.cond_empty.notify_all()
        finally:
            self.cond_empty.release()
    return ret
class ServerSink(iMockDebuggerSink):
    def __init__(self, peerName, theTime, details, quiet):
        self._peerName = peerName
        self._methods = iMockDebuggerSink()._getMethods()
        self._terminate = False
        self._details = details
        self._qw = None
        self._startMutex = Semaphore(0)
        self._q = Queue()
        self.quiet = quiet
        self._marshaller = MarshallerFactory.get(MarshallerFactory.DEFAULT,
                                                 quiet=quiet)
        self._qw = QueueWriter(target=details, autoConnect=True,
                               marshaller=self._marshaller, quiet=quiet)
        self._qw.start()
        self.thread = None

    def start(self):
        t = threading.Thread(target=self.run, args=[self._startMutex])
        t.setName("ServerSink.%(P)s" % {"P": self._peerName})
        t.setDaemon(True)
        self.thread = t
        self.thread.start()
        return "server.sink.started"

    def close(self):
        self._terminate = True
        try:
            self.thread.join()
        except Exception:
            pass
        try:
            self._qw.close()
        except Exception:
            pass
        try:
            self._q.close()
        except Exception:
            pass
        return "server.sink.closed"

    def waitUntilRunning(self, block=True, timeout=None):
        self._startMutex.acquire(block=block, timeout=timeout)
        return self

    def __getattribute__(self, name):
        if name in object.__getattribute__(self, "_methods"):
            q = self._q

            def wrapper(self, *args, **kwargs):
                ServerSink._testPickleability((name, args, kwargs))
                q.put((name, args, kwargs))
            return wrapper
        return object.__getattribute__(self, name)

    def run(self, startMutex):
        startMutex.release()
        while not self._terminate:
            try:
                data = self._q.get(block=True, timeout=1)
            except Empty:
                pass
            else:
                ServerSink._testPickleability(data)
                try:
                    self._qw.put(data, block=True, timeout=10)
                except Exception:
                    break
class BasicActor(ABC):
    name: Optional[str]
    in_queue: Queue
    out_queue: Queue
    alive: Value
    _loop_task: Optional[Task]

    def __init__(self, name=None):
        self.name = name
        self._state = None
        ctx = SpawnContext()
        self.alive = Value('b', True)
        self.in_queue = Queue(ctx=ctx, maxsize=120)
        self.out_queue = Queue(ctx=ctx, maxsize=110)

    async def runner(self):
        loop = get_running_loop()
        self._state = await self.handle_started()
        self._loop_task = loop.create_task(self._do_loop(loop))

    async def _do_loop(self, loop: AbstractEventLoop):
        while loop.is_running():
            try:
                sent_from, message = self.in_queue.get(timeout=0.1)
                loop.create_task(
                    self._handle_message(message, sent_from, self._state))
            except Empty:
                pass
            await asyncio.sleep(0)

    def send_message(self, to, message):
        if self.out_queue.qsize() > 100:
            logger.warning("Shedding excess outgoing message")
            return
        self.out_queue.put((to, message))

    async def _handle_message(self, message, sent_from, state):
        try:
            self._state = await self.handle_message(message, sent_from, state)
        except Exception:
            self.stop()
            raise

    def stop(self):
        self.alive.value = False
        if self._loop_task:
            self._loop_task.cancel()

    @abstractmethod
    async def handle_message(self, message, sent_from, state) -> Any:
        pass

    @abstractmethod
    async def handle_started(self) -> Any:
        pass
def profile(self, queue: Queue) -> None:
    """ Handle profiling clients as they come into the queue """
    frame = queue.get()
    oui_manuf, capabilities = self.analyze_assoc_req(frame)
    analysis_hash = hash(f"{frame.addr2}: {capabilities}")
    if analysis_hash in self.analyzed_hash.keys():
        self.log.debug(
            "already seen %s (capabilities hash=%s) this session, ignoring...",
            frame.addr2,
            analysis_hash,
        )
    else:
        if self.is_randomized(frame.addr2):
            if oui_manuf is None:
                oui_manuf = "Randomized MAC"
            else:
                oui_manuf = "{0} (Randomized MAC)".format(oui_manuf)

        self.last_manuf = oui_manuf
        self.log.debug("%s oui lookup matched to %s", frame.addr2, oui_manuf)

        self.analyzed_hash[analysis_hash] = frame
        text_report = self.generate_text_report(
            oui_manuf, capabilities, frame.addr2, self.channel
        )

        self.log.info("text report\n%s", text_report)

        if self.channel < 15:
            band = "2.4GHz"
        elif self.channel > 30 and self.channel < 170:
            band = "5.8GHz"
        else:
            band = "unknown"

        self.log.debug(
            "writing text and csv report for %s (capabilities hash=%s)",
            frame.addr2,
            analysis_hash,
        )
        self.write_analysis_to_file_system(
            text_report, capabilities, frame, oui_manuf, band
        )

        self.client_profiled_count += 1
        self.log.debug("%s clients profiled", self.client_profiled_count)

        # if we end up sending multiple frames from pcap for profiling -
        # this will need changed
        if self.pcap_analysis:
            self.log.info(
                "exiting because we were told to only analyze %s",
                self.pcap_analysis,
            )
            sys.exit()
def getResult(self, key):
    """Return the IMDb results for the given key."""
    spider = ImdbSpider(key)
    result_queue = Queue()
    crawler = CrawlerWorker(spider, result_queue)
    crawler.start()
    results = result_queue.get()
    if len(results) > self.maxResult:
        del results[self.maxResult:]
    logging.debug('%s results', len(results))
    return results
def findCalibrationChessboard(image):
    findTimeout = 10
    patternSize = (7, 7)  # Internal corners of 8x8 chessboard
    grayImg = cv.CreateMat(image.rows, image.cols, cv.CV_8UC1)
    cv.CvtColor(image, grayImg, cv.CV_RGB2GRAY)
    cv.AddWeighted(grayImg, -1, grayImg, 0, 255, grayImg)
    cornerListQueue = Queue()

    def getCorners(idx, inImg, cornersQueue):
        """Search for corners in image and put them in the queue"""
        print("{} Searching".format(idx))
        _, corners = cv.FindChessboardCorners(inImg, patternSize)
        print("{} found {} corners".format(idx, len(corners)))
        saveimg(inImg, name="Chessboard_Search_{}".format(idx))
        cornersQueue.put(corners)

    for i in range(0, 12, 3):
        img = cv.CloneMat(grayImg)
        cv.Erode(img, img, iterations=i)
        cv.Dilate(img, img, iterations=i)

        # Pass arguments explicitly: a lambda would capture i and img late,
        # and lambdas cannot be pickled for a child process anyway.
        p = multiprocessing.Process(target=getCorners,
                                    args=(i, img, cornerListQueue))
        p.daemon = True
        p.start()

    corners = []
    while len(corners) != 49 and i > 0:
        corners = cornerListQueue.get(True)
        print("Got Result {}".format(i))
        i -= 1

    if len(corners) == 49:
        # Debug Image
        debugImg = cv.CreateMat(grayImg.rows, grayImg.cols, cv.CV_8UC3)
        cv.CvtColor(grayImg, debugImg, cv.CV_GRAY2RGB)
        for pt in corners:
            pt = (int(pt[0]), int(pt[1]))
            cv.Circle(debugImg, pt, 4, (255, 0, 0))
        saveimg(debugImg, name="Corners_Found")
        # //Debug Image

        # Figure out the correct corner mapping
        points = sorted([corners[42], corners[0], corners[6], corners[48]],
                        key=lambda pt: pt[0] + pt[1])
        if points[1][0] < points[2][0]:
            points[1], points[2] = points[2], points[1]  # swap tr/bl as needed
        (tl, tr, bl, br) = points
        warpCorners = [tl, tr, br, bl]
    else:
        print("Could not find corners")
        warpCorners = []
    return warpCorners
def yk_monitor(self, mon_l):
    # forming command to run parallel monitoring processes
    mon_cmd = ' & '.join(["xinput test {}".format(y_id) for y_id in mon_l])
    monitor = subprocess.Popen(mon_cmd, shell=True, stdout=subprocess.PIPE)
    stdout_queue = Queue()
    stdout_reader = AsynchronousFileReader(monitor.stdout, stdout_queue)
    stdout_reader.start()

    triggered = False
    timestamp = time.time()
    while not stdout_reader.eof and time.time() - timestamp < TIMEOUT:
        while stdout_queue.qsize() > 0:
            stdout_queue.get()  # emptying queue
            triggered = True
            time.sleep(.01)
        if triggered:
            print('YubiKey triggered. Now disabling.')
            break
        time.sleep(.001)

    if not triggered:
        print('No YubiKey triggered. Timeout.')
def _centrallogger_worker(logging_q: mpq.Queue):
    """Entry for a thread providing central logging from subprocesses via a shared queue"""
    threadname = threading.currentThread().getName()
    thread_logger = LOGGER.getChild(threadname)
    thread_logger.info('{} started'.format(threadname))
    while True:  # only calling process terminates - keeping logging alive
        record = logging_q.get(block=True)
        if record == 'TER':
            thread_logger.info('received {}'.format(record))
            break
        LOGGER.handle(record)
        qsize = logging_q.qsize()
        if qsize > 10:
            thread_logger.warning(f"logging_q has size {qsize}")
        if logging_q.full():
            thread_logger.warning("logging_q is full")
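The standard library ships this exact pattern as `logging.handlers.QueueHandler` / `QueueListener`: producers enqueue records, a background thread hands them to real handlers. A minimal sketch (the `ListHandler` capture class and logger name are illustrative):

```python
import logging
import logging.handlers
import queue

records = []

class ListHandler(logging.Handler):
    """Collects formatted messages so the result is easy to inspect."""
    def emit(self, record):
        records.append(record.getMessage())

log_q = queue.Queue()
logger = logging.getLogger("central_demo")
logger.setLevel(logging.INFO)
# Producer side: the handler only enqueues records.
logger.addHandler(logging.handlers.QueueHandler(log_q))

# Consumer side: a background thread forwards records to real handlers.
listener = logging.handlers.QueueListener(log_q, ListHandler())
listener.start()
logger.info("hello from the queue")
listener.stop()  # flushes the queue and joins the worker thread
print(records)  # → ['hello from the queue']
```

`QueueListener.stop()` plays the role of the `'TER'` sentinel above: it enqueues its own sentinel and joins the worker thread.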
def embedding_writer(queue: Queue):
    meta = get_connection().Paragraphs_Meta
    while True:
        batch_result = queue.get()
        if batch_result is None:
            break
        meta_ids, _, hidden_states = batch_result[:3]
        hidden_states = hidden_states[:, 0]
        for meta_id, hs in zip(meta_ids, hidden_states):
            meta.update_one(
                {'_id': meta_id},
                {'$set': {
                    ('paragraph_embedding_' + version): hs.tolist(),
                }}
            )
            print(meta_id, hs.shape)
class AsynchronousSolver(threading.Thread):
    r"""
    @summary: Manages a suite of solver workers via queues; we could manage
    via a queue-like interface (PyRQ!).
    """
    tId = itertools.count()

    def __init__(self, solvers, lock):
        self._solvers = solvers
        self._lock = lock
        self._q = Queue()
        self._qDistributor = Queue()
        self._queues = {}
        self._type = SolverImplType.THREADED
        super(AsynchronousSolver, self).__init__()
        self._go()

    def _go(self):
        model = self._solvers._getModel()
        for _ in range(self._solvers._count):
            q = Queue()
            tId = next(AsynchronousSolver.tId)
            asyncSolverImpl = AsyncSolverFactory(self._type, tId,
                                                 self._qDistributor, q,
                                                 model.clone())
            self._queues[tId] = (asyncSolverImpl, q)
            asyncSolverImpl.start()
        self.setName("AsynchronousSolver")
        self.setDaemon(True)
        self.start()

    def _terminate(self):
        self._informAbort()
        for context in self._queues.values():
            (thread, q) = context
            try:
                q.put(Abort())
            except Exception:
                pass
            try:
                thread.terminate()
            except Exception:
                pass
        self._queues = {}

    def get(self, *args, **kwargs):
        try:
            data = self._q.get(*args, **kwargs)
        except Empty:
            raise
        except Exception as e:
            print("AsynchronousSolver get: error:\n%(T)s"
                  % {"T": traceback.format_exc()})
            raise e
        else:
            return data
def get(self):
    '''
    Get the element in the queue

    Raises an exception if it's empty or if too many errors are encountered
    '''
    dt = 1e-3
    while dt < 1:
        try:
            element = Queue.get(self)
            return element
        except IOError:
            logger.warning('IOError encountered in SafeQueue get()')
            try:
                time.sleep(dt)
            except Exception:
                pass
            dt *= 2
    raise IOError('Unrecoverable error')
def wav_worker(q: Queue, uid: str):
    root = os.path.join(os.path.dirname(__file__), 'upload_waves')
    os.makedirs(root, exist_ok=True)
    filename = os.path.join(root, f'{uid}_{time.time()}.wav')
    wav = None
    try:
        wav = wave.open(filename, mode='wb')
        wav.setframerate(16000)
        wav.setnchannels(1)
        wav.setsampwidth(2)
        while True:
            data_bytes = q.get()
            wav.writeframes(data_bytes)
            print(f'q.get {len(data_bytes)}')
    except Exception as e:
        logging.debug(e)
    finally:
        if wav is not None:  # wave.open itself may have failed
            wav.close()
        logging.info('leave wav_worker')
def parallel_sort(bam, out, n_workers):
    lb = BGZFReader(bam)
    mem = lb.uncompressed_size
    buf = RawArray(ctypes.c_char, mem)
    q = Queue()
    procs = []

    block_allocs = chunk(lb.blocks, n_workers)
    offsets = [0] + list(accumulate(sum(b.offset for b in blocks)
                                    for blocks in block_allocs))[:-1]
    ary_szs = [sum(b.size_u for b in blocks) for blocks in block_allocs]
    bufs = [RawArray(ctypes.c_char, mem) for mem in ary_szs]

    z = zip(chunk(lb.blocks, n_workers), offsets, bufs)
    for i, (blocks, off, buf) in enumerate(z):
        args = (i, bam, blocks, off, buf, q)
        p = Process(target=sort_read_ary, args=args)
        procs.append(p)
        p.start()

    combined = []
    for _ in procs:
        combined += q.get(True)

    logging.debug("Starting combined sort on %i reads" % len(combined))
    combined.sort()
    logging.debug("Finished combined sort")

    for p in procs:
        p.join()
        logging.debug("Returned from " + str(p))

    hdr = RawBAM(gzip.GzipFile(bam), header=True).rawheader
    with open(out, 'wb') as f:
        write_bgzf_block(f, hdr)
        for creads in grouper(READS_PER_BLOCK, combined):
            data = ""
            for i, cr in enumerate(creads):
                data += bufs[cr.worker_num][cr.ptr:(cr.ptr + cr.bs + 4)]
            write_bgzf_block(f, data)
        write_bam_eof(f)
def _monitoring_worker(self, inbox_q: mpq.Queue):
    """Entry for a thread collecting data and signals for monitoring sent from different processes"""
    threadname = threading.currentThread().getName()
    logger = LOGGER.getChild(threadname)
    logger.setLevel(logging.DEBUG)
    logger.info(f"{threadname} started")
    while True:
        data = inbox_q.get(block=True)
        if data == 'TER':
            logger.info(f"received {data}")
            break
        else:
            logger.debug(f"received {data}")
            timestamp, sender, label, content = data
            if label == gymdata.MonitoringLabel.LIVESIGNAL:
                filepath = f"{self.path.monitoring_folder}/livesignal.stats"
                with open(filepath, 'a+') as f:
                    f.write(f"{timestamp}|{json.dumps(content)}\n")
            if label == gymdata.MonitoringLabel.SELFPLAYSTATS:
                filepath = f"{self.path.monitoring_folder}/{sender}_stats.stats"
                with open(filepath, 'a+') as f:
                    f.write(f"{timestamp}|{json.dumps(content)}\n")
def read_queue(queue_: Queue) -> List[RunValue]:
    stack: List[RunValue] = []
    while True:
        try:
            rval: RunValue = queue_.get(timeout=1)
        except queue.Empty:
            break

        # Check if there is a special placeholder value which tells us that
        # we don't have to wait until the queue times out in order to
        # retrieve the final value!
        if 'final_queue_element' in rval:
            del rval['final_queue_element']
            do_break = True
        else:
            do_break = False
        stack.append(rval)
        if do_break:
            break

    if len(stack) == 0:
        raise queue.Empty
    else:
        return stack
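The sentinel-key trick above lets the reader stop immediately instead of waiting for the 1-second timeout. A self-contained sketch of the same protocol, using a plain `queue.Queue` and an illustrative function name:

```python
import queue

def read_until_final(q):
    """Drain q until a dict carrying the 'final_queue_element' marker arrives."""
    stack = []
    while True:
        rval = q.get(timeout=1)
        if 'final_queue_element' in rval:
            rval = dict(rval)               # don't mutate the caller's dict
            del rval['final_queue_element']  # strip the marker, keep the payload
            if rval:
                stack.append(rval)
            break
        stack.append(rval)
    return stack

q = queue.Queue()
q.put({'loss': 0.5})
q.put({'loss': 0.4, 'final_queue_element': True})
print(read_until_final(q))  # → [{'loss': 0.5}, {'loss': 0.4}]
```

The producer tags only its last message; every earlier message passes through untouched.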
def mp_factorizer(nums, nprocs):
    def worker(nums, out_q):
        """ The worker function, invoked in a process. 'nums' is a
            list of numbers to factor. The results are placed in
            a dictionary that's pushed to a queue.
        """
        outdict = {}
        for n in nums:
            outdict[n] = factorize_naive(n)
        out_q.put(outdict)

    # Each process will get 'chunksize' nums and a queue to put its out
    # dict into
    out_q = Queue()
    chunksize = int(math.ceil(len(nums) / float(nprocs)))
    procs = []

    for i in range(nprocs):
        p = multiprocessing.Process(
            target=worker,
            args=(nums[chunksize * i:chunksize * (i + 1)], out_q))
        procs.append(p)
        p.start()

    # Collect all results into a single result dict. We know how many dicts
    # with results to expect.
    resultdict = {}
    for i in range(nprocs):
        resultdict.update(out_q.get())

    # Wait for all worker processes to finish
    for p in procs:
        p.join()

    return resultdict
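Unlike the round-robin split used elsewhere in this collection, `mp_factorizer` hands each process one contiguous ceil-sized slice. That slicing can be isolated into a tiny helper (the name is illustrative):

```python
import math

def chunks(nums, nprocs):
    """Contiguous ceil-sized slices, as used by mp_factorizer above."""
    chunksize = int(math.ceil(len(nums) / float(nprocs)))
    return [nums[chunksize * i:chunksize * (i + 1)] for i in range(nprocs)]

print(chunks(list(range(10)), 3))  # → [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Note the last slices may be short or even empty when `len(nums)` is small relative to `nprocs`; the empty-slice workers simply push empty dicts.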
def main():
    result_queue = Queue()
    crawler = CrawlerWorker(chnausSpider(), result_queue)
    crawler.start()
    for item in result_queue.get():
        print(datetime.datetime.now(), item)
spider = AuctionViewItemPageSpider(startUrl=auctionUrl, itemno=itemno,
                                   kindOf="auction")
if "gmarket.co.kr" in startUrl:
    itemno = re.search(r"goodscode=[0-9]+", startUrl.lower()).group().replace("goodscode=", "")
    gmarketUrl = "http://mitem.gmarket.co.kr/Item?goodsCode=" + itemno
    spider = GmarketViewItemPageSpider(startUrl=gmarketUrl, itemno=itemno,
                                       kindOf="gmarket")
if "g9.co.kr" in startUrl:
    itemno = re.search(r"[0-9]+", startUrl).group()
    spider = G9ViewItemPageSpider(startUrl=startUrl.encode('utf-8'),
                                  itemno=itemno, kindOf="g9")
if "coupang.com" in startUrl:
    itemno = re.search(r"[0-9]+", startUrl).group()
    spider = CoupangViewItemPageSpider(startUrl=startUrl.encode('utf-8'),
                                       itemno=itemno, kindOf="coupang")
if "ticketmonster.co.kr" in startUrl:
    itemno = re.search(r"[0-9]+", startUrl).group()
    spider = TmonViewItemPageSpider(startUrl=startUrl.encode('utf-8'),
                                    itemno=itemno, kindOf="tmon")

resultQueue = Queue()
crawler = CrawlerWorker(spider, resultQueue)
crawler.start()
items = resultQueue.get()

body = {}
if len(items) > 0:
    body = json.dumps(items[0].__dict__.get('_values'))

print("Content-Type: application/json")
print("Length:", len(body))
print("")
print(body)
class SmtpMessageServer(object):
    """
    This class can start an SMTP debugging server, configure LinOTP to talk
    to it and read the results back to the parent tester.

    On open, an SMTP server is set up to listen locally. Derived classes
    can define a hook to set the LinOTP configuration to point to this
    server.

    Example usage:

    with SmtpMessageServer(testcase) as smtp:
        get_otp()
    """

    def __init__(self, testcase, message_timeout):
        self.testcase = testcase

        # We need a minimum version of 2.9.2 to set the SMTP port number, so
        # skip if testing an earlier version
        self.testcase.need_linotp_version('2.9.2')

        self.timeout = message_timeout

        self.set_config = SetConfig(testcase.http_protocol,
                                    testcase.http_host,
                                    testcase.http_port,
                                    testcase.http_username,
                                    testcase.http_password)

        # We advertise the local SMTP server hostname
        # using the IP address that connects to LinOTP
        self.addr = self._get_local_ip()
        self.msg_payload = None

    def __enter__(self):
        self.smtp_process_queue = Queue()
        self.smtp_process = Process(target=get_otp_mail,
                                    args=(self.smtp_process_queue, self.timeout))
        self.smtp_process.start()
        self.port = self.smtp_process_queue.get(True, 5)
        self._do_lintop_config()
        return self

    def _do_lintop_config(self):
        parameters = self.get_config_parameters()
        logger.debug("Configuration parameters: %s", parameters)
        result = self.set_config.setConfig(parameters)
        assert result, "It was not possible to set the config. Result:%s" % result

    def get_config_parameters(self):
        # This function can be overridden to provide configuration parameters
        # to configure specific parts of LinOTP
        assert False, "This function should be overridden"

    def get_otp(self):
        messagestr = self.smtp_process_queue.get(True, 10)
        msg = email.message_from_string(messagestr)
        otp = msg.get_payload()
        logger.debug("Received email message payload:%s", otp)
        return otp

    def __exit__(self, *args):
        self.smtp_process_queue.close()
        self.smtp_process.terminate()
        self.smtp_process.join(5)

    def _get_local_ip(self):
        """
        Get the IP address of the interface that connects to LinOTP
        """
        with closing(socket.create_connection(
                (self.testcase.http_host, int(self.testcase.http_port)),
                10)) as s:
            addr = s.getsockname()[0]
        return addr
class NonBlockSubprocess(object):
    """
    NonBlockSubprocess supports reading and writing data via a Queue.

    Parameters
    ----------
    process: Subprocess
        Sub process, still alive
    chunk_size: int
        Size of each chunk for the reader thread.
        (Default) CHUNK_SIZE_DEFAULT=4096

    Raises
    ------
    ValueError: chunk_size <= 0 (the read would block), or process not alive.
    TypeError: process has the wrong type.
    RuntimeError: process doesn't have any IO.
    """
    CHUNK_SIZE_DEFAULT = 4096
    STOP_THREAD = b"STOP_THREAD_SYNTAX"

    def __init__(self, process: Subprocess, chunk_size=CHUNK_SIZE_DEFAULT):
        if not isinstance(process, Subprocess):
            raise TypeError("process must be Subprocess")
        if not process.is_alive():
            raise ValueError("Process wasn't working.")
        if chunk_size <= 0:
            raise ValueError("Chunk size must be > 0.")
        # Check the argument, not self.process, which isn't assigned yet.
        if process.stdout is None and process.stdin is None:
            raise RuntimeError("Process IO is unavailable.")

        self.process = process
        self.chunk_size = chunk_size
        self.read_buffer_cache = b""

        if self.process.stdin is not None:
            self.queue_write = Queue()
            self.thread_write = Thread(target=self._write)
            self.thread_write.start()
        else:
            self.queue_write = None
            self.thread_write = None

        if self.process.stdout is not None:
            self.queue_read = Queue()
            self.thread_read = Thread(target=self._read)
            self.thread_read.start()
        else:
            self.queue_read = None
            self.thread_read = None

    def _write(self):
        if self.queue_write is None:
            return
        while 1:
            data = self.queue_write.get()
            if data == self.STOP_THREAD:
                break
            self.process.write(data)

    def write(self, data):
        if self.queue_write is None:
            raise AttributeError("Write data unavailable!")
        self.queue_write.put(data)

    def _read(self):
        if self.queue_read is None:
            return
        while 1:
            data = self.process.read(self.chunk_size)
            self.queue_read.put(data)

    def read(self, chunk_size=-1, timeout=None):
        if self.queue_read is None:
            raise AttributeError("Read data unavailable!")
        chunk = self.read_buffer_cache
        while len(chunk) < chunk_size:
            chunk += self.queue_read.get(timeout=timeout)
        if len(chunk) > chunk_size:
            self.read_buffer_cache = chunk[chunk_size:]
            chunk = chunk[:chunk_size]  # was: chunk_size[:chunk_size]
        return chunk

    def stop(self):
        """
        Stop read/write via queue. Does not handle the process itself.
        """
        if self.queue_read is not None:
            self.queue_read.put(self.STOP_THREAD)
        if self.queue_write is not None:
            self.queue_write.put(self.STOP_THREAD)  # was sent to queue_read
        self.process.terminate()
class PmakeManager(Manager):
    """
    Specialization of Manager for local multiprocessing, using
    an ad-hoc implementation of "pool" because of bugs of the
    Python 2.7 implementation of pool multiprocessing.
    """

    queues = {}

    @contract(num_processes='int')
    def __init__(self, context, cq, num_processes, recurse=False,
                 new_process=False, show_output=False):
        Manager.__init__(self, context=context, cq=cq, recurse=recurse)
        self.num_processes = num_processes
        self.last_accepted = 0
        self.new_process = new_process
        self.show_output = show_output

        if new_process and show_output:
            msg = ('Compmake does not yet support echoing stdout/stderr '
                   'when jobs are run in a new process.')
            warning(msg)
        self.cleaned = False

    def process_init(self):
        self.event_queue = Queue(1000)
        self.event_queue_name = str(id(self))
        PmakeManager.queues[self.event_queue_name] = self.event_queue

        # info('Starting %d processes' % self.num_processes)

        self.subs = {}  # name -> sub
        # available + processing + aborted = subs.keys
        self.sub_available = set()
        self.sub_processing = set()
        self.sub_aborted = set()

        db = self.context.get_compmake_db()
        storage = db.basepath  # XXX:
        logs = os.path.join(storage, 'logs')

        # self.signal_queue = Queue()

        for i in range(self.num_processes):
            name = 'parmake_sub_%02d' % i
            write_log = os.path.join(logs, '%s.log' % name)
            make_sure_dir_exists(write_log)
            signal_token = name
            self.subs[name] = PmakeSub(name=name,
                                       signal_queue=None,
                                       signal_token=signal_token,
                                       write_log=write_log)
        self.job2subname = {}
        # all are available
        self.sub_available.update(self.subs)

        self.max_num_processing = self.num_processes

    # XXX: boilerplate
    def get_resources_status(self):
        resource_available = {}

        assert len(self.sub_processing) == len(self.processing)

        if not self.sub_available:
            msg = 'already %d nproc' % len(self.sub_processing)
            if self.sub_aborted:
                msg += ' (%d workers aborted)' % len(self.sub_aborted)
            resource_available['nproc'] = (False, msg)
            # this is enough to continue
            return resource_available
        else:
            resource_available['nproc'] = (True, '')

        return resource_available

    @contract(reasons_why_not=dict)
    def can_accept_job(self, reasons_why_not):
        if len(self.sub_available) == 0 and len(self.sub_processing) == 0:
            # all have failed
            msg = 'All workers have aborted.'
            raise MakeHostFailed(msg)

        resources = self.get_resources_status()
        some_missing = False
        for k, v in resources.items():
            if not v[0]:
                some_missing = True
                reasons_why_not[k] = v[1]
        if some_missing:
            return False
        return True

    def instance_job(self, job_id):
        publish(self.context, 'worker-status', job_id=job_id,
                status='apply_async')
        assert len(self.sub_available) > 0
        name = sorted(self.sub_available)[0]
        self.sub_available.remove(name)
        assert name not in self.sub_processing
        self.sub_processing.add(name)
        sub = self.subs[name]

        self.job2subname[job_id] = name

        if self.new_process:
            f = parmake_job2_new_process
            args = (job_id, self.context)
        else:
            f = parmake_job2
            args = (job_id, self.context,
                    self.event_queue_name, self.show_output)

        async_result = sub.apply_async(f, args)
        return async_result

    def event_check(self):
        if not self.show_output:
            return
        while True:
            try:
                event = self.event_queue.get(block=False)  # @UndefinedVariable
                event.kwargs['remote'] = True
                broadcast_event(self.context, event)
            except Empty:
                break

    def process_finished(self):
        if self.cleaned:
            return
        self.cleaned = True
        # print('process_finished()')

        for name in self.sub_processing:
            self.subs[name].proc.terminate()

        for name in self.sub_available:
            self.subs[name].terminate()

        # XXX: in practice this never works well
        if False:
            # print('joining')
            timeout = 1
            for name in self.sub_available:
                self.subs[name].proc.join(timeout)

        # XXX: ... so we just kill them mercilessly
        if True:
            # print('killing')
            for name in self.sub_processing:
                pid = self.subs[name].proc.pid
                os.kill(pid, signal.SIGKILL)
            # print('process_finished() finished')

        self.event_queue.close()
        del PmakeManager.queues[self.event_queue_name]

    # Normal outcomes
    def job_failed(self, job_id, deleted_jobs):
        Manager.job_failed(self, job_id, deleted_jobs)
        self._clear(job_id)

    def job_succeeded(self, job_id):
        Manager.job_succeeded(self, job_id)
        self._clear(job_id)

    def _clear(self, job_id):
        assert job_id in self.job2subname
        name = self.job2subname[job_id]
        del self.job2subname[job_id]
        assert name in self.sub_processing
        assert name not in self.sub_available
        self.sub_processing.remove(name)
        self.sub_available.add(name)

    def host_failed(self, job_id):
        Manager.host_failed(self, job_id)

        assert job_id in self.job2subname
        name = self.job2subname[job_id]
        del self.job2subname[job_id]
        assert name in self.sub_processing
        assert name not in self.sub_available

        self.sub_processing.remove(name)
        # put in sub_aborted
        self.sub_aborted.add(name)

    def cleanup(self):
        self.process_finished()
class Agent(object):
    def __init__(self, pull_interval=5):
        self.input = None
        self.filter = None
        self.output = None
        # for input write and filter read
        self.iqueue = Queue()
        # for filter write and output read
        self.oqueue = Queue()
        self.pull_interval = pull_interval
        self.__init_all()

    def __init_all(self):
        self.__set_filter()
        self.__set_output()
        self.__set_input()

    def __set_input(self):
        input_ins.set_res_queue(self.iqueue)

        def target():
            while True:
                # pull_data must be realized in input handler
                task_list = Task.create_from_conf(input_ins, AGENT_INPUT,
                                                  'pull_data')
                if not task_list:
                    time.sleep(self.pull_interval)
                    continue
                list_task = []
                for task in task_list:
                    t = threading.Thread(target=task)
                    t.daemon = True
                    t.start()
                    list_task.append(t)
                for task in list_task:
                    task.join()
                time.sleep(self.pull_interval)

        p = multiprocessing.Process(target=target)
        p.daemon = True
        p.start()
        logging.debug('{0} start input handlers ...'.format(
            self.__class__.__name__))

    def __set_output(self):
        def target():
            while True:
                data = self.oqueue.get()
                # push_data must be realized in output handler
                task_list = Task.create_from_conf(output_ins, AGENT_OUTPUT,
                                                  'push_data')
                if not task_list:
                    continue
                list_task = []
                for task in task_list:
                    t = threading.Thread(target=task, args=(data, ))
                    t.daemon = True
                    t.start()
                    # was missing, so the join loop below was a no-op
                    list_task.append(t)
                for task in list_task:
                    task.join()

        p = multiprocessing.Process(target=target)
        p.daemon = True
        p.start()
        logging.debug('{0} start output handlers ...'.format(
            self.__class__.__name__))

    def __set_filter(self):
        def target():
            while True:
                data = self.iqueue.get()
                task_list = Task.create_from_conf(filter_ins, AGENT_FILTER,
                                                  'filter_data')
                filtered_data = data
                for task in task_list:
                    filtered_data = task(filtered_data)
                self.oqueue.put(filtered_data)

        p = multiprocessing.Process(target=target)
        p.daemon = True
        p.start()
        logging.debug('{0} start filter handlers ...'.format(
            self.__class__.__name__))

    def loop(self):
        logging.debug('{0} start successfully!'.format(
            self.__class__.__name__))
        # keep the main process alive
        while True:
            time.sleep(1)
class CrawlerWorker(multiprocessing.Process):
    def __init__(self, spider, result_queue):
        multiprocessing.Process.__init__(self)
        self.result_queue = result_queue
        self.crawler = CrawlerProcess(settings)
        if not hasattr(project, 'crawler'):
            self.crawler.install()
        self.crawler.configure()
        self.items = []
        self.spider = spider
        dispatcher.connect(self._item_passed, signals.item_passed)

    def _item_passed(self, item):
        self.items.append(item)

    def run(self):
        self.crawler.crawl(self.spider)
        self.crawler.start()
        self.crawler.stop()
        self.result_queue.put(self.items)


result_queue = Queue()
crawler = CrawlerWorker(HBRSpider, result_queue)
crawler.start()
for item in result_queue.get():
    print(item)
class MVacManager(Manager): """ Multyvac backend. """ @contract(num_processes='int') def __init__(self, context, cq, num_processes, recurse=False, show_output=False, new_process=False, volumes=[], rdb=False, rdb_vol=None, rdb_db=None): Manager.__init__(self, context=context, cq=cq, recurse=recurse) self.num_processes = num_processes self.last_accepted = 0 self.cleaned = False self.show_output = show_output self.new_process = new_process self.volumes = volumes self.rdb = rdb self.rdb_db = rdb_db self.rdb_vol = rdb_vol def process_init(self): self.event_queue = Queue() self.event_queue_name = str(id(self)) from compmake.plugins.backend_pmake.pmake_manager import PmakeManager PmakeManager.queues[self.event_queue_name] = self.event_queue # info('Starting %d processes' % self.num_processes) self.subs = {} # name -> sub # available + processing + aborted = subs.keys self.sub_available = set() self.sub_processing = set() self.sub_aborted = set() self.signal_queue = Queue() db = self.context.get_compmake_db() storage = db.basepath # XXX: logs = os.path.join(storage, 'logs') for i in range(self.num_processes): name = 'w%02d' % i write_log = os.path.join(logs, '%s.log' % name) make_sure_dir_exists(write_log) signal_token = name self.subs[name] = PmakeSub(name, signal_queue=self.signal_queue, signal_token=signal_token, write_log=write_log) self.job2subname = {} self.subname2job = {} # all are available at the beginning self.sub_available.update(self.subs) self.max_num_processing = self.num_processes def check_any_finished(self): # We make a copy because processing is updated during the loop try: token = self.signal_queue.get(block=False) except Empty: return False #print('received %r' % token) job_id = self.subname2job[token] self.subs[token].last self.check_job_finished(job_id, assume_ready=True) return True # XXX: boiler plate def get_resources_status(self): resource_available = {} assert len(self.sub_processing) == len(self.processing) if not self.sub_available: msg = 
'already %d nproc' % len(self.sub_processing) if self.sub_aborted: msg += ' (%d workers aborted)' % len(self.sub_aborted) resource_available['nproc'] = (False, msg) # this is enough to continue return resource_available else: resource_available['nproc'] = (True, '') return resource_available @contract(reasons_why_not=dict) def can_accept_job(self, reasons_why_not): if len(self.sub_available) == 0 and len(self.sub_processing) == 0: # all have failed msg = 'All workers have aborted.' raise MakeHostFailed(msg) resources = self.get_resources_status() some_missing = False for k, v in resources.items(): if not v[0]: some_missing = True reasons_why_not[k] = v[1] if some_missing: return False return True def instance_job(self, job_id): publish(self.context, 'worker-status', job_id=job_id, status='apply_async') assert len(self.sub_available) > 0 name = sorted(self.sub_available)[0] self.sub_available.remove(name) assert not name in self.sub_processing self.sub_processing.add(name) sub = self.subs[name] self.job2subname[job_id] = name self.subname2job[name] = job_id job = get_job(job_id, self.db) if self.rdb: f = mvac_job_rdb args = (job_id, self.context, self.event_queue_name, self.show_output, self.volumes, self.rdb_vol.name, self.rdb_db, os.getcwd()) else: if job.needs_context: # if self.new_process: # f = parmake_job2_new_process # args = (job_id, self.context) # # else: f = parmake_job2 args = (job_id, self.context, self.event_queue_name, self.show_output) else: f = mvac_job args = (job_id, self.context, self.event_queue_name, self.show_output, self.volumes, os.getcwd()) if True: async_result = sub.apply_async(f, args) else: warnings.warn('Debugging synchronously') async_result = f(args) return async_result def event_check(self): if not self.show_output: return while True: try: event = self.event_queue.get(block=False) # @UndefinedVariable event.kwargs['remote'] = True broadcast_event(self.context, event) except Empty: break def process_finished(self): if self.cleaned: 
return self.cleaned = True #print('Clean up...') for name in self.sub_processing: self.subs[name].proc.terminate() for name in self.sub_available: self.subs[name].terminate() elegant = False # XXX: in practice this never works well if elegant: timeout = 1 for name in self.sub_available: self.subs[name].proc.join(timeout) # XXX: ... so we just kill them mercilessly else: # print('killing') for name in self.sub_processing: pid = self.subs[name].proc.pid os.kill(pid, signal.SIGKILL) self.event_queue.close() self.signal_queue.close() from compmake.plugins.backend_pmake.pmake_manager import PmakeManager del PmakeManager.queues[self.event_queue_name] # Normal outcomes def job_failed(self, job_id, deleted_jobs): Manager.job_failed(self, job_id, deleted_jobs) self._clear(job_id) def job_succeeded(self, job_id): Manager.job_succeeded(self, job_id) self._clear(job_id) def _clear(self, job_id): assert job_id in self.job2subname name = self.job2subname[job_id] del self.job2subname[job_id] del self.subname2job[name] assert name in self.sub_processing assert name not in self.sub_available self.sub_processing.remove(name) self.sub_available.add(name) def host_failed(self, job_id): Manager.host_failed(self, job_id) assert job_id in self.job2subname name = self.job2subname[job_id] del self.job2subname[job_id] del self.subname2job[name] assert name in self.sub_processing assert name not in self.sub_available self.sub_processing.remove(name) # put in sub_aborted self.sub_aborted.add(name) def cleanup(self): self.process_finished()
def startSpider(self, start_position=0):
    print("start scraping!")
    spider = Spider()
    result_queue = Queue()
    md5_queue = []  # stores the md5 codes of articles already crawled
    URL_Xpath_queue = XpathXmlParse().parseXML("Configuration.xml",
                                               start_position)
    keywordtrainFile = open(self.TRANING_PATH, 'a')
    print("loading md5 codes!")
    self.__getMd5HashMap()
    print("finished loading md5 codes!")
    while not URL_Xpath_queue.empty():
        url_xpath = URL_Xpath_queue.get()
        spider.set_url_xpath(url_xpath)
        crawler = CrawlerWorker(spider, result_queue)
        crawler.start()
        title = ""
        content = " "
        judger = False  # whether an article is already buffered (False on first run)
        needToTrain = False
        md5Code = " "  # article fingerprint, used for duplicate detection
        classification = ""  # article category
        image_url = ""
        address = ""
        for item in result_queue.get():
            if 'title' in item:
                if judger and content.strip() != "":
                    md5Code = self.__getMd5Code(content)
                    if md5Code in self.md5_hashMap:
                        print("Duplicate article, skipping!")
                        content = " "
                        title = ""
                        classification = ""
                        image_url = ""
                        judger = False
                        continue
                    tags = jieba.analyse.extract_tags(content, 10)  # extract keywords
                    classification = url_xpath.classification
                    if url_xpath.classification is not None:
                        # train the article classifier
                        print("writing training document!")
                        self.__writeTrainingDocument(title, tags,
                                                     keywordtrainFile,
                                                     url_xpath)
                        needToTrain = True
                    else:
                        print("doing classification!")
                        classification = self.__getClassification(tags)
                    artical = Artical()
                    artical.set_title(title)
                    artical.set_content(content)
                    artical.set_address(address)
                    artical.set_md_5_code(md5Code)
                    if image_url.strip() == "":
                        image_url = "None"
                    artical.set_image_url(image_url)
                    # -----------Bug-Patch----------#
                    mykeywords = artical.getKeyWord()
                    if mykeywords.strip() == "":
                        # exclude articles containing only punctuation
                        continue
                    # ------------------------------#
                    md5_queue.append([md5Code, mykeywords.strip(),
                                      classification])
                    artical.set_classification(classification)
                    print("writing " + artical.Title + " to database!")
                    # decide here whether to insert or update
                    artical.insertNewArtical()
                    content = " "
                    image_url = ""
                judger = True
                title = item['title']
            elif 'content' in item:
                content += " " + item['content'] + '\n'
            elif 'image_url' in item:
                image_url += '$' + str(item['image_url'])
            elif 'address' in item:
                address = item['address']
    keywordtrainFile.close()
    if needToTrain:
        print("start training!")
        # train the classifier
        ArticleCategoriesTrain().training(self.TRANING_PATH, self.XML_PATH)
        print("finished training!")
    print("matching related articles")
    for md5 in md5_queue:
        RelatedReading().insertRelatedReading(md5[0], md5[1], md5[2])
class YubiGuard: def __init__(self, scrlck_mode=False): self.scrlck_mode = scrlck_mode self.id_q = Queue() self.on_q = Queue() self.pi_q = Queue() # init processes gi_proc = Process(target=self.get_ids) gi_proc.daemon = True cs_proc = Process(target=self.change_state) # no daemon, or main program will terminate before Keys can be unlocked cs_proc.daemon = False zmq_lis = ZmqListener( self.on_q) # somehow works ony with threads not processes zmq_lis_thr = Thread(target=zmq_lis.start_listener) zmq_lis_thr.setDaemon(True) pi = PanelIndicator(self.pi_q, self.on_q) # starting processes and catching exceptions: try: gi_proc.start() cs_proc.start() zmq_lis_thr.start() pi.run_pi() # main loop of root process except (KeyboardInterrupt, SystemExit): print('Caught exit event.') finally: # send exit signal, will reactivate YubiKey slots print('Sending EXIT_SIGNAL') self.on_q.put(EXIT_SIGNAL) def get_ids(self): old_id_l = [] no_key = True pat = re.compile(r"(?:Yubikey.*?id=)(\d+)", re.IGNORECASE) while True: new_id_l = [] # get list of xinput device ids and extract those of YubiKeys: xinput = shell_this('xinput list') matches = re.findall(pat, xinput) new_id_l.extend(matches) new_id_l.sort() if not new_id_l and not no_key: self.pi_q.put(NOKEY_SIGNAL) print('No YubiKey(s) detected.') no_key = True elif new_id_l and no_key: self.pi_q.put(OFF_SIGNAL) print('YubiKey(s) detected.') no_key = False # notify: msg_cmd = """notify-send --expire-time=2000 \ 'YubiKey(s) detected.'""" shell_this(msg_cmd) if new_id_l != old_id_l: print('Change in YubiKey ids detected. 
From {} to {}.'.format( old_id_l, new_id_l)) self.id_q.put(new_id_l) # lock screen if screenlock and YubiKey is removed: if self.scrlck_mode and len(new_id_l) < len(old_id_l): print('Locking screen.') shell_this(get_scrlck_cmd()) # execute screen lock command old_id_l = new_id_l time.sleep(.1) def turn_keys(self, id_l, lock=True ): # problem of value loss of cs_id_l found in this function tk_id_l = id_l if lock: print('Locking YubiKey(s).') state_flag = '0' self.pi_q.put(OFF_SIGNAL) else: print('Unlocking YubiKey(s).') state_flag = '1' self.pi_q.put(ON_SIGNAL) shell_this('; '.join(["xinput set-int-prop {} \"Device Enabled\" 8 {}". format(tk_id, state_flag) for tk_id in tk_id_l])) def check_state(self, check_id_l): # check if all states have indeed changed: pat = re.compile(r"(?:Device Enabled.+?:).?([01])", re.IGNORECASE) # check if state has indeed changed: for tk_id in check_id_l: sh_out = shell_this('xinput list-props {}'.format(tk_id)) match = re.search(pat, sh_out) if match: if match.group(1) != '0': return False def change_state(self): cs_id_l = [] cs_signal = '' while True: # retrieve input from queues while self.id_q.qsize() > 0: cs_id_l = self.id_q.get() while self.on_q.qsize() > 0: cs_signal = self.on_q.get() # not accepting any more signals if cs_signal == EXIT_SIGNAL: self.turn_keys(cs_id_l, lock=False) sys.exit(0) # lock/unlock if cs_id_l: if cs_signal == ON_SIGNAL: self.turn_keys(cs_id_l, lock=False) mon_thread = Thread( target=self.yk_monitor, args=(cs_id_l, )) mon_thread.start() mon_thread.join() # putting in separator, nullifying all preceding ON_SIGNALS # to prevent possible over-triggering: self.on_q.put('') elif self.check_state( cs_id_l) is False: # lock keys if they are unlocked self.turn_keys(cs_id_l, lock=True) # reset state to prevent continued unlocking/locking cs_signal = '' time.sleep(.01) def yk_monitor(self, mon_l): # forming command to run parallel monitoring processes mon_cmd = ' & '.join(["xinput test {}".format(y_id) for y_id in 
mon_l]) monitor = subprocess.Popen(mon_cmd, shell=True, stdout=subprocess.PIPE) stdout_queue = Queue() stdout_reader = AsynchronousFileReader(monitor.stdout, stdout_queue) stdout_reader.start() triggered = False timestamp = time.time() while not stdout_reader.eof and time.time() - timestamp < TIMEOUT: while stdout_queue.qsize() > 0: stdout_queue.get() # emptying queue triggered = True time.sleep(.01) if triggered: print('YubiKey triggered. Now disabling.') break time.sleep(.001) if not triggered: print('No YubiKey triggered. Timeout.')
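`yk_monitor` (and `change_state` above) drain queues with `while q.qsize() > 0: q.get()`. `qsize()` is documented as approximate, and on multiprocessing queues it is not implemented on all platforms (it raises `NotImplementedError` on macOS). A sketch of an equivalent drain that relies only on `get_nowait()` and the `Empty` exception; `drain_latest` is a hypothetical helper name:

```python
from queue import Empty, Queue

def drain_latest(q, default=None):
    """Empty the queue and return the most recent item, or default."""
    latest = default
    while True:
        try:
            latest = q.get_nowait()
        except Empty:
            return latest

q = Queue()
for sig in ("ON", "OFF", "ON"):
    q.put(sig)
print(drain_latest(q))  # ON
```

The same loop works for `multiprocessing.Queue`, which raises the same `queue.Empty` exception.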
class SmtpMessageServer(object):
    """
    This class can start an SMTP debugging server,
    configure LinOTP to talk to it
    and read the results back to the parent tester.

    On open, an SMTP server is set up to listen locally.
    Derived classes can define a hook to set the LinOTP
    configuration to point to this server.

    Example usage:

        with SmtpMessageServer(testcase) as smtp:
            get_otp()
    """

    def __init__(self, testcase, message_timeout):
        self.testcase = testcase

        # We need a minimum version of 2.9.2 to set the SMTP port number, so
        # skip if testing an earlier version
        self.testcase.need_linotp_version('2.9.2')

        self.timeout = message_timeout

        self.set_config = SetConfig(testcase.http_protocol,
                                    testcase.http_host,
                                    testcase.http_port,
                                    testcase.http_username,
                                    testcase.http_password)

        # We advertise the local SMTP server hostname
        # using the IP address that connects to LinOTP
        self.addr = self._get_local_ip()
        self.msg_payload = None

    def __enter__(self):
        self.smtp_process_queue = Queue()
        self.smtp_process = Process(
            target=get_otp_mail,
            args=(self.smtp_process_queue, self.timeout))
        self.smtp_process.start()
        self.port = self.smtp_process_queue.get(True, 5)
        self._do_linotp_config()
        return self

    def _do_linotp_config(self):
        parameters = self.get_config_parameters()
        logger.debug("Configuration parameters: %s", parameters)
        result = self.set_config.setConfig(parameters)
        assert result, "It was not possible to set the config. Result:%s" % result

    def get_config_parameters(self):
        # This function can be overridden to provide configuration
        # parameters to configure specific parts of LinOTP
        assert False, "This function should be overridden"

    def get_otp(self):
        messagestr = self.smtp_process_queue.get(True, 10)
        msg = email.message_from_string(messagestr)
        otp = msg.get_payload()
        logger.debug("Received email message payload:%s", otp)
        return otp

    def __exit__(self, *args):
        self.smtp_process_queue.close()
        self.smtp_process.terminate()
        self.smtp_process.join(5)

    def _get_local_ip(self):
        """
        Get the IP address of the interface that connects to LinOTP
        """
        with closing(socket.create_connection(
                (self.testcase.http_host, int(self.testcase.http_port)),
                10)) as s:
            addr = s.getsockname()[0]
        return addr
class MultiCoreEngine():
    _mapred = None
    _out_queue = None
    _in_queue = None
    _log_queue = None
    _processes = None

    def __init__(self, mapred):
        self._mapred = mapred

    def _start(self, name, cpu, module_name, class_name, params):
        fn = None
        self._processes = []
        self._in_queue = Queue()
        self._out_queue = Queue()
        self._log_queue = Queue()
        if name == "mapper":
            fn = q_run_mapper
        elif name == "reducer":
            fn = q_run_reducer
        for i in range(cpu):
            process = Process(target=fn,
                              args=(module_name, class_name, params,
                                    self._in_queue, self._out_queue,
                                    self._log_queue))
            self._processes.append(process)
            process.start()

    def _stop(self):
        for process in self._processes:
            self._in_queue.put("STOP")
        while not self._log_queue.empty():
            print(self._log_queue.get())

    def _get_data_chunks(self):
        chunks = []
        for process in self._processes:
            chunks.append(self._out_queue.get())
        return chunks

    def _set_data_chunks(self, chunks):
        # explicit loop instead of map(): map() is lazy in Python 3
        for chunk in chunks:
            self._in_queue.put(chunk)

    def _send_lines(self, lines, cpu, lines_len):
        # integer division so the slice bounds stay ints
        line_splits = [lines[i * lines_len // cpu:(i + 1) * lines_len // cpu]
                       for i in range(cpu)]
        for i in range(cpu):
            self._in_queue.put(line_splits[i])

    def _terminate(self):
        for process in self._processes:
            process.join()
            process.terminate()
        self._in_queue.close()
        self._out_queue.close()
        self._processes = None

    def _force_terminate(self):
        for process in self._processes:
            process.terminate()

    def _merge_data(self, data):
        self._mapred.data = merge_kv_dict(self._mapred.data, data)

    def _merge_reduced_data(self, data):
        self._mapred.data_reduced = merge_kv_dict(self._mapred.data_reduced,
                                                  data)

    def _split_data(self, num_splits):
        splits = []
        index = 0
        len_data = len(self._mapred.data)
        chunk_len = int(math.ceil(len_data / float(num_splits)))
        if chunk_len == 0:
            splits.append(self._mapred.data)
        else:
            for i in range(int(math.ceil(len_data / float(chunk_len)))):
                splits.append({})
            for (key, value) in self._mapred.data.items():
                i = int(math.floor(index / float(chunk_len)))
                splits[i][key] = value
                index = index + 1
        return splits

    def _run_map(self, cpu, cache_line, input_reader):
        self._start("mapper", cpu,
                    self._mapred.__class__.__module__,
                    self._mapred.__class__.__name__,
                    self._mapred.params)
        try:
            map_len = 0
            lines = []
            lines_len = 0
            f = input_reader.read()
            for line in f:
                if lines_len > 0 and lines_len % cache_line == 0:
                    self._send_lines(lines, cpu, lines_len)
                    lines = []
                    lines_len = 0
                lines.append(line)
                lines_len += 1
                map_len += 1
            input_reader.close()
            self._send_lines(lines, cpu, lines_len)
            self._stop()
            for chunk in self._get_data_chunks():
                self._merge_data(chunk)
            self._terminate()
        except Exception as e:
            print("ERROR: Exception while mapping: %s\n%s"
                  % (e, traceback.format_exc()))
            self._force_terminate()
        return map_len
class FileSink(iMockDebuggerSink):
    def __init__(self, peerName, theTime, filename, quiet):
        # _methods must be set before any other attribute is *read*,
        # because __getattribute__ below consults it on every lookup.
        self._methods = []
        self._peerName = peerName
        self._fp = open(filename, "w")
        # was: self.fp.write(...) - the attribute is _fp
        self._fp.write("File debugger started at: %(T)s for client: %(C)s"
                       % {"T": theTime, "C": peerName})
        self._fp.flush()
        self._methods = iMockDebuggerSink()._getMethods()
        self._terminate = False
        self.quiet = quiet
        self._startMutex = Semaphore(0)
        self._q = Queue()
        self.thread = None

    def start(self):
        t = threading.Thread(target=self.run, args=[self._startMutex])
        t.setName("FileSink.%(P)s" % {"P": self._peerName})
        t.setDaemon(True)
        self.thread = t
        self.thread.start()
        return "file.sink.started"

    def close(self):
        self._terminate = True
        try:
            self.thread.join()
        except Exception:
            pass
        try:
            self._fp.close()
        except Exception:
            pass
        try:
            self._q.close()
        except Exception:
            pass
        self._fp = None
        return "file.sink.closed"

    def waitUntilRunning(self, block=True, timeout=None):
        self._startMutex.acquire(block=block, timeout=timeout)
        return self

    def __getattribute__(self, name):
        if name in object.__getattribute__(self, "_methods"):
            q = self._q

            def wrapper(self, *args, **kwargs):
                q.put((name, args, kwargs))
            return wrapper
        return object.__getattribute__(self, name)

    def run(self, startMutex):
        startMutex.release()
        while not self._terminate:
            try:
                data = self._q.get(block=True, timeout=1)
            except Empty:
                continue
            try:
                (methodName, args, kwargs) = data
                peerName = args[0]
                relativeTime = args[1]
                args = args[2:]
                ss = ["PEER:", peerName, "REL-TIME:", relativeTime,
                      "METHOD", methodName, "ARGS:", str(args),
                      "KWARGS", str(kwargs)]
                s = "\n".join(ss)
            except Exception:
                continue
            try:
                self._fp.write(s)
            except Exception:
                break
md = RRQDebugger()
a = md.finish_end("peerName", "relativeTime")
print(a)

md.start(filename="mock.file")
a = md.setup_start("peerName123", "relativeTime123")
print(a)

# Start a dummy QueueServer:
q = Queue()
m = MarshallerFactory.get(MarshallerFactory.DEFAULT)
QS = QueueServer(port=22334, target=q, quiet=True, marshaller=m)
QS.start()
details = QS.details()

md.start(server=details)
a = md.setup_start("peerName123", "relativeTime123")
print(a)
data = q.get(block=True, timeout=10)
QS.close()
class TestDLTMessageHandler(unittest.TestCase): def setUp(self): if six.PY2: self.filter_queue = Queue() self.message_queue = Queue() else: self.ctx = get_context() self.filter_queue = Queue(ctx=self.ctx) self.message_queue = Queue(ctx=self.ctx) self.client_cfg = { "ip_address": "127.0.0.1", "filename": "/dev/null", "verbose": 0, "port": "1234" } self.stop_event = Event() self.handler = DLTMessageHandler(self.filter_queue, self.message_queue, self.stop_event, self.client_cfg) def test_init(self): self.assertFalse(self.handler.mp_stop_flag.is_set()) self.assertFalse(self.handler.is_alive()) self.assertTrue(self.handler.filter_queue.empty()) self.assertTrue(self.handler.message_queue.empty()) def test_run_basic(self): self.assertFalse(self.handler.is_alive()) self.handler.start() self.assertTrue(self.handler.is_alive()) self.assertNotEqual(self.handler.pid, os.getpid()) self.stop_event.set() self.handler.join() self.assertFalse(self.handler.is_alive()) def test_handle_add_new_filter(self): self.handler.filter_queue.put(("queue_id", [("SYS", "JOUR")], True)) time.sleep(0.01) self.handler.handle(None) self.assertIn(("SYS", "JOUR"), self.handler.context_map) self.assertEqual(self.handler.context_map[("SYS", "JOUR")], ["queue_id"]) def test_handle_remove_filter_single_entry(self): self.handler.filter_queue.put(("queue_id", [("SYS", "JOUR")], True)) time.sleep(0.01) self.handler.handle(None) self.assertIn(("SYS", "JOUR"), self.handler.context_map) self.assertEqual(self.handler.context_map[("SYS", "JOUR")], ["queue_id"]) self.handler.filter_queue.put(("queue_id", [("SYS", "JOUR")], False)) time.sleep(0.01) self.handler.handle(None) self.assertNotIn(("SYS", "JOUR"), self.handler.context_map) def test_handle_remove_filter_multiple_entries(self): self.handler.filter_queue.put(("queue_id1", [("SYS", "JOUR")], True)) self.handler.filter_queue.put(("queue_id2", [("SYS", "JOUR")], True)) time.sleep(0.01) self.handler.handle(None) self.assertIn(("SYS", "JOUR"), 
self.handler.context_map) self.assertEqual(self.handler.context_map[("SYS", "JOUR")], ["queue_id1", "queue_id2"]) self.handler.filter_queue.put(("queue_id1", [("SYS", "JOUR")], False)) time.sleep(0.01) self.handler.handle(None) self.assertIn(("SYS", "JOUR"), self.handler.context_map) self.assertEqual(self.handler.context_map[("SYS", "JOUR")], ["queue_id2"]) def test_handle_multiple_similar_filters(self): self.handler.filter_queue.put(("queue_id0", [("SYS", "JOUR")], True)) self.handler.filter_queue.put(("queue_id1", [("SYS", "JOUR")], True)) time.sleep(0.01) self.handler.handle(None) self.assertIn(("SYS", "JOUR"), self.handler.context_map) self.assertEqual(self.handler.context_map[("SYS", "JOUR")], ["queue_id0", "queue_id1"]) def test_handle_multiple_different_filters(self): self.filter_queue.put(("queue_id0", [("SYS", "JOUR")], True)) self.filter_queue.put(("queue_id1", [("DA1", "DC1")], True)) time.sleep(0.01) self.handler.handle(None) self.assertIn(("SYS", "JOUR"), self.handler.context_map) self.assertIn(("DA1", "DC1"), self.handler.context_map) self.assertEqual(self.handler.context_map[("SYS", "JOUR")], ["queue_id0"]) self.assertEqual(self.handler.context_map[("DA1", "DC1")], ["queue_id1"]) def test_handle_message_tag_and_distribute(self): self.filter_queue.put(("queue_id0", [("SYS", "JOUR")], True)) self.filter_queue.put(("queue_id1", [("DA1", "DC1")], True)) self.filter_queue.put(("queue_id2", [("SYS", None)], True)) self.filter_queue.put(("queue_id3", [(None, "DC1")], True)) self.filter_queue.put(("queue_id4", [(None, None)], True)) time.sleep(0.01) # - simulate receiving of messages for _ in range(10): for message in create_messages(stream_multiple, from_file=True): self.handler.handle(message) self.assertIn(("SYS", "JOUR"), self.handler.context_map) self.assertIn(("DA1", "DC1"), self.handler.context_map) self.assertIn((None, None), self.handler.context_map) self.assertIn(("SYS", None), self.handler.context_map) self.assertIn((None, "DC1"), 
self.handler.context_map) try: # 60 == 10 messages of each for SYS, JOUR and None combinations + # 10 for (None,None) messages = [ self.message_queue.get(timeout=0.01) for _ in range(60) ] # these queues should not get any messages from other queues self.assertEqual( len([msg for qid, msg in messages if qid == 'queue_id0']), 10) self.assertEqual( len([msg for qid, msg in messages if qid == 'queue_id1']), 10) self.assertEqual( len([msg for qid, msg in messages if qid == 'queue_id2']), 10) self.assertEqual( len([msg for qid, msg in messages if qid == 'queue_id3']), 10) # this queue should get all messages self.assertEqual( len([msg for qid, msg in messages if qid == 'queue_id4']), 20) except Empty: # - we should not get an Empty for at least 40 messages self.fail()
class SSHClient(Process): TIMEOUT = 10 PING_RECEIVED = re.compile("1 received") def __init__(self, username, password, host, cmdsToSend, port = 22, exitCmd = "exit", timeout = None): Process.__init__(self, name = "SSHClient") self.logger = LogManager().getLogger('SSHClient-%s' % host) self.username = username self.password = password self.host = host self.port = int(port) self.cmdsToSend = cmdsToSend if isinstance(cmdsToSend, list) else [cmdsToSend] self.exitCmd = exitCmd self.queue = Queue() self.msg = "" self.status = Status.FAILURE self.startTime = Value('d', 0.0) self.endTime = Value('d', 0.0) self.timeout = timeout or SSHClient.TIMEOUT self.cmdsSend = False self.start() def isFinished(self): """ True if the process has finished """ return not self.is_alive() def updateOutput(self): """ Update the msg to include the latest output from the given commands """ try: while True: msg = self.queue.get(timeout = 0.5) self.msg += msg except Empty: pass except IOError: pass if self.isFinished(): self.queue.close() return self.msg def run(self): factory = SSHFactory(self) factory.protocol = ClientTransport reactor.connectTCP(self.host, self.port, factory) self.startTime.value = time.time() check = task.LoopingCall(self.__ping) check.start(2.0) reactor.callLater(self.timeout, self.__timeout) log.defaultObserver.stop() reactor.run() self.endTime.value = time.time() self.queue.close() sys.exit(self.status) def __timeout(self): """ Timeout checker """ if self.status != Status.FAILURE: return self.logger.error('Connection timeout to peer %s:%s' %(self.host, self.port)) reactor.stop() def __ping(self): with open('/dev/null') as null: ping = subprocess.Popen(["ping", "-c1", "-W1", self.host], stdout = null, stderr = null) ping.wait() if ping.returncode != 0 and reactor.running: if self.cmdsSend == False: self.status = Status.FAILURE reactor.stop() def cleanup(self): self.queue.close() def shutdown(self): """ Terminate the SSH process """ self.terminate() self.join() 
self.endTime.value = time.time()
def returnQueue():
    result = Queue()
    crawler = CrawlerWorker(
        result, "http://allrecipes.com/Recipe/Apple-Pie-2/")
    crawler.run()
    for item in result.get():
        print("a")
class PmakeManager(Manager):
    """
    Specialization of Manager for local multiprocessing, using an
    ad hoc implementation of "pool" because of bugs of the Python 2.7
    implementation of pool multiprocessing.
    """

    queues = {}

    @contract(num_processes='int')
    def __init__(self, context, cq, num_processes, recurse=False,
                 new_process=False, show_output=False):
        Manager.__init__(self, context=context, cq=cq, recurse=recurse)
        self.num_processes = num_processes
        self.last_accepted = 0
        self.new_process = new_process
        self.show_output = show_output

        if new_process and show_output:
            msg = ('Compmake does not yet support echoing stdout/stderr '
                   'when jobs are run in a new process.')
            warning(msg)
        self.cleaned = False

    def process_init(self):
        self.event_queue = Queue(1000)
        self.event_queue_name = str(id(self))
        PmakeManager.queues[self.event_queue_name] = self.event_queue

        # info('Starting %d processes' % self.num_processes)

        self.subs = {}  # name -> sub
        # available + processing + aborted = subs.keys
        self.sub_available = set()
        self.sub_processing = set()
        self.sub_aborted = set()

        db = self.context.get_compmake_db()
        storage = db.basepath  # XXX:
        logs = os.path.join(storage, 'logs')

        # self.signal_queue = Queue()

        for i in range(self.num_processes):
            name = 'parmake_sub_%02d' % i
            write_log = os.path.join(logs, '%s.log' % name)
            make_sure_dir_exists(write_log)
            signal_token = name
            self.subs[name] = PmakeSub(name=name,
                                       signal_queue=None,
                                       signal_token=signal_token,
                                       write_log=write_log)
        self.job2subname = {}
        # all are available
        self.sub_available.update(self.subs)

        self.max_num_processing = self.num_processes

    # XXX: boilerplate
    def get_resources_status(self):
        resource_available = {}

        assert len(self.sub_processing) == len(self.processing)

        if not self.sub_available:
            msg = 'already %d processing' % len(self.sub_processing)
            if self.sub_aborted:
                msg += ' (%d workers aborted)' % len(self.sub_aborted)
            resource_available['nproc'] = (False, msg)
            # this is enough to continue
            return resource_available
        else:
            resource_available['nproc'] = (True, '')

        return resource_available

    @contract(reasons_why_not=dict)
    def can_accept_job(self, reasons_why_not):
        if len(self.sub_available) == 0 and len(self.sub_processing) == 0:
            # all have failed
            msg = 'All workers have aborted.'
            raise MakeHostFailed(msg)

        resources = self.get_resources_status()
        some_missing = False
        for k, v in resources.items():
            if not v[0]:
                some_missing = True
                reasons_why_not[k] = v[1]
        if some_missing:
            return False
        return True

    def instance_job(self, job_id):
        publish(self.context, 'worker-status', job_id=job_id,
                status='apply_async')
        assert len(self.sub_available) > 0
        name = sorted(self.sub_available)[0]
        self.sub_available.remove(name)
        assert name not in self.sub_processing
        self.sub_processing.add(name)
        sub = self.subs[name]

        self.job2subname[job_id] = name

        if self.new_process:
            f = parmake_job2_new_process
            args = (job_id, self.context)
        else:
            f = parmake_job2
            args = (job_id, self.context,
                    self.event_queue_name, self.show_output)

        async_result = sub.apply_async(f, args)
        return async_result

    def event_check(self):
        if not self.show_output:
            return
        while True:
            try:
                event = self.event_queue.get(block=False)  # @UndefinedVariable
                event.kwargs['remote'] = True
                broadcast_event(self.context, event)
            except Empty:
                break

    def process_finished(self):
        if self.cleaned:
            return
        self.cleaned = True
        # print('process_finished()')

        for name in self.sub_processing:
            self.subs[name].proc.terminate()

        for name in self.sub_available:
            self.subs[name].terminate()

        # XXX: in practice this never works well
        if False:
            # print('joining')
            timeout = 1
            for name in self.sub_available:
                self.subs[name].proc.join(timeout)

        # XXX: ... so we just kill them mercilessly
        if True:
            # print('killing')
            for name in self.sub_processing:
                pid = self.subs[name].proc.pid
                os.kill(pid, signal.SIGKILL)

        # print('process_finished() finished')

        self.event_queue.close()
        del PmakeManager.queues[self.event_queue_name]

    # Normal outcomes
    def job_failed(self, job_id, deleted_jobs):
        Manager.job_failed(self, job_id, deleted_jobs)
        self._clear(job_id)

    def job_succeeded(self, job_id):
        Manager.job_succeeded(self, job_id)
        self._clear(job_id)

    def _clear(self, job_id):
        assert job_id in self.job2subname
        name = self.job2subname[job_id]
        del self.job2subname[job_id]
        assert name in self.sub_processing
        assert name not in self.sub_available
        self.sub_processing.remove(name)
        self.sub_available.add(name)

    def host_failed(self, job_id):
        Manager.host_failed(self, job_id)

        assert job_id in self.job2subname
        name = self.job2subname[job_id]
        del self.job2subname[job_id]
        assert name in self.sub_processing
        assert name not in self.sub_available

        self.sub_processing.remove(name)
        # put in sub_aborted
        self.sub_aborted.add(name)

    def cleanup(self):
        self.process_finished()
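The worker bookkeeping in the class above (names move between `sub_available` and `sub_processing`, with `job2subname` mapping jobs to workers) can be sketched in isolation. This is an illustrative toy, not the compmake API; the names `available`, `processing`, `instance_job`, and `clear` are invented for the example.

```python
# Illustrative sketch of the worker-pool bookkeeping pattern: a worker name
# moves from `available` to `processing` when a job starts, and back when
# the job finishes. All names here are hypothetical, not compmake's.
available = {"sub_00", "sub_01"}
processing = set()
job2subname = {}


def instance_job(job_id):
    # Pick a deterministic free worker, mark it busy, remember the mapping.
    name = sorted(available)[0]
    available.remove(name)
    processing.add(name)
    job2subname[job_id] = name
    return name


def clear(job_id):
    # Return the worker that ran job_id to the free pool.
    name = job2subname.pop(job_id)
    processing.remove(name)
    available.add(name)


instance_job("job-a")
print(sorted(processing))  # ['sub_00']
clear("job-a")
print(sorted(available))   # ['sub_00', 'sub_01']
```

Keeping the invariant `available + processing + aborted == all workers` (as the class comment states) is what lets `get_resources_status` answer "can we accept a job?" with a single set check.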
def get_results_from_crawler(spider, **kwargs):
    results = Queue()
    worker = CrawlerWorker(spider, results, **kwargs)
    worker.start()
    return results.get()
def get(self, *args, **kwargs):
    try:
        return Queue.get(self, *args, **kwargs)
    except KeyboardInterrupt:
        return None
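The method above swallows Ctrl-C during a blocking `get()` and returns `None` instead. A minimal self-contained sketch of that subclass, assuming the standard-library `queue.Queue` as the base (the class name `InterruptSafeQueue` is invented for illustration):

```python
# Sketch of a Queue whose get() returns None on KeyboardInterrupt instead of
# propagating it, matching the method above. Base class assumed: queue.Queue.
from queue import Queue


class InterruptSafeQueue(Queue):
    def get(self, *args, **kwargs):
        try:
            return Queue.get(self, *args, **kwargs)
        except KeyboardInterrupt:
            # A Ctrl-C while blocked in get() yields None rather than a traceback.
            return None


q = InterruptSafeQueue()
q.put(42)
print(q.get())  # 42
```

Note the caveat: callers can no longer distinguish "queue produced `None`" from "user interrupted", so this pattern only fits queues that never carry `None` as a value.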
        # (continuation of CrawlerWorker.run)
        self.crawler.crawl(self.spider, self.query)
        self.crawler.start()
        self.crawler.stop()
        self.results.put(self.items)


def RunSpider(query):
    results = Queue()
    crawler = CrawlerWorker(NewsSpider(''), query, results)
    crawler.start()
    return results.get()


if __name__ == '__main__':
    results = Queue()
    crawler = CrawlerWorker(NewsSpider(''), '%D3%C5%D2%C2%BF%E2', results)
    crawler.start()
    docs = sds.StructuralDocs().GetDocsStruct(results.get())
    docs = rds.RelatedDocs().GetDocs(docs)
    tr4s = tr.TextRank4Sentence(stop_words_file='abstract/stopword.txt')
    for doc_simi in docs:
        abstracts = u''
        for doc in doc_simi:
            print(doc['title'])
            tr4s.train_weight(doc=doc)
            (sentence, weight) = tr4s.get_key_sentences(num=4)
            abstracts += u'\u3002'.join(sentence)
            tr4s.Clear()
        tr4s.train(text=abstracts, speech_tag_filter=True, lower=True,
                   source='all_filters')
        (sentence, weight) = tr4s.get_key_sentences(num=4)
        print(u'\u3002'.join(sentence))
        print('\n\n')
def get(self, *args, **kwargs):
    # If the get fails, the exception will prevent us from
    # decrementing the counter.
    val = Queue.get(self, *args, **kwargs)
    with self._lock:
        self._counter.value -= 1
    return val
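The method above decrements a shared counter only after a successful `get()`. A hedged, self-contained sketch of the whole pattern, using a `threading.Lock` and a plain integer in place of the original's multiprocessing counter (the class name `CountedQueue` and the `_count` attribute are invented for the example):

```python
# Sketch of a size-tracking queue: put() increments and get() decrements a
# counter under a lock. If get() raises, the counter is intentionally left
# untouched, mirroring the comment in the method above.
import threading
from queue import Queue


class CountedQueue(Queue):
    def __init__(self):
        Queue.__init__(self)
        self._lock = threading.Lock()
        self._count = 0

    def put(self, item, *args, **kwargs):
        Queue.put(self, item, *args, **kwargs)
        with self._lock:
            self._count += 1

    def get(self, *args, **kwargs):
        # An exception here skips the decrement, keeping the count honest.
        val = Queue.get(self, *args, **kwargs)
        with self._lock:
            self._count -= 1
        return val


q = CountedQueue()
q.put("a")
q.put("b")
print(q.get(), q._count)  # a 1
```

Ordering matters in `get()`: decrementing before the `Queue.get` call would leave the counter wrong whenever the call times out or is interrupted.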