Python Extraction usage examples

Programming language: Python

Namespace/Package: app.extraction.models

Class/Type: Extraction

Examples on hotexamples.com: 3

Python Extraction - 3 examples found. These are the top-rated Python code examples for app.extraction.models.Extraction, taken from open source projects.

Main methods

get_extraction_by_page_id(2)

add_extraction(1)

get_extraction(1)
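
The three methods listed above are called on the class itself in the examples that follow. As a rough illustration of how they fit together, here is a minimal in-memory sketch. It is not the project's implementation: the real model presumably sits on a database, and the `_store` dictionary, the sequential integer ids, and the `get_or_add` helper are assumptions made for this example. The `update` method mirrors the `extraction.update(bp_content=...)` call that appears in Example #2.

```python
import itertools


class Extraction(object):
    """Hypothetical in-memory stand-in for app.extraction.models.Extraction."""

    _store = {}                  # assumption: maps extraction id -> Extraction
    _ids = itertools.count(1)    # assumption: ids are sequential integers

    def __init__(self, page_id):
        self.id = next(Extraction._ids)
        self.page_id = page_id
        self.bp_content = None   # set later via update(bp_content=...)

    @classmethod
    def add_extraction(cls, page_id):
        # Create and register a new, empty extraction for the page.
        extraction = cls(page_id)
        cls._store[extraction.id] = extraction
        return extraction

    @classmethod
    def get_extraction(cls, ext_id):
        # Return the extraction with this id, or None if it is unknown.
        return cls._store.get(ext_id)

    @classmethod
    def get_extraction_by_page_id(cls, page_id):
        # Return the first extraction recorded for the page, or None.
        for extraction in cls._store.values():
            if extraction.page_id == page_id:
                return extraction
        return None

    def update(self, **fields):
        # Set the given attributes on this extraction.
        for name, value in fields.items():
            setattr(self, name, value)


def get_or_add(page_id):
    """Look up the extraction for a page, creating one if none exists."""
    extraction = Extraction.get_extraction_by_page_id(page_id)
    if extraction is None:
        extraction = Extraction.add_extraction(page_id)
    return extraction
```

Calling `get_or_add` twice with the same page id returns the same object, which is the lookup-then-create pattern the first example relies on.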

Example #1

File: tasks.py Project: prabuuce/corpusbuilder

from app.extraction.models import Extraction
# `app` (the Flask application) and `extractionnotfound` come from the project;
# their import paths are not shown in this excerpt.

def retrieve_extraction(page_id, add_if_not_found=True):
    print("retrieving Extraction for ....%s" % page_id)
    # Celery tasks run outside a Flask request, so push a test request context
    # and run the app's before-request handlers.
    with app.test_request_context('/'):
        app.preprocess_request()
    extraction = Extraction.get_extraction_by_page_id(page_id)
    if extraction is None:
        if add_if_not_found:  # add an extraction record for the page
            extraction = Extraction.add_extraction(page_id)
        else:
            return extractionnotfound
    # Boilerpipe extraction is not done here; this function only creates the
    # extraction record. A separate process runs boilerpipe, given page.id and
    # extraction.id.
    # boilerpipe_extract_and_populate.delay(page_id, extraction.id)

    # Using the REST API (the excerpt is cut off inside this commented-out block)
    # rExt = requests.get("http://127.0.0.1:5000/extractions", params={"page_id": page_id})

Example #2

File: boilerpipe_wrapper.py Project: prabuuce/corpusbuilder

import jpype
from boilerpipe.extract import Extractor
from app.extraction.models import Extraction
# Page and the response helpers (badrequest, documentnotfound, nocontent,
# success) come from the project; their import paths are not shown here.

def extract_content(page_id, ext_id, htmlReturn=False):
    # htmlReturn=False: return plain-text content by default
    if page_id in (None, "") or ext_id in (None, ""):
        return badrequest()
    page = Page.get_page(page_id)
    if page is None:
        return documentnotfound()
    extraction = Extraction.get_extraction(ext_id)
    if extraction is None:
        return documentnotfound()
    original_content = page.content
    if original_content is None or original_content == "":
        return nocontent()

    # boilerpipe runs on the JVM; attach this thread before using it
    if not jpype.isThreadAttachedToJVM():
        jpype.attachThreadToJVM()
    extractor = Extractor(extractor='DefaultExtractor', html=original_content)
    if not htmlReturn:
        bp_content = extractor.getText()
    else:
        bp_content = extractor.getHTML()
    if bp_content is None:
        return nocontent()

    extraction.update(bp_content=bp_content)
    return success()
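
Guards on missing identifiers, like the one at the top of `extract_content`, hinge on how `is` and `or` combine. In Python, `x is None or ""` parses as `(x is None) or ""`, so it is truthy only when `x` is `None` and never catches an empty string. The sketch below contrasts that form with an explicit check. Both function names are hypothetical helpers written for this illustration, not part of the project.

```python
def is_missing_precedence_trap(value):
    # Parsed as (value is None) or "": the "" operand is always falsy,
    # so only None is detected.
    return bool(value is None or "")


def is_missing(value):
    # Explicit comparison: treats both None and "" as missing.
    return value is None or value == ""
```

With these definitions, an empty string slips past the first helper but is caught by the second, while ordinary values such as `"42"` or `0` are treated as present by both.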

Example #3

File: tasks.py Project: prabuuce/corpusbuilder

import requests
from app.extraction.models import Extraction
# `app` (the Flask application), Page and BoilerpipeExtraction come from the
# project; their import paths are not shown in this excerpt.

def boilerpipe_extract_and_populate(page_id=None, ext_id=None):
    print("extracting using boilerpipe...")

    # For some reason, calling the static method directly does not work:
    # with app.test_request_context('/'):
    #     app.preprocess_request()
    # BoilerpipeExtraction.extract_content(page_id, ext_id)

    # Therefore, switch to calling the REST API, which seems to work:
    # return requests.get("http://127.0.0.1:5000/extractions/bp/%s,%s" % (page_id, ext_id))

    # Approach 2: process every page (page_id and ext_id are not used here).
    # Celery tasks run outside a Flask request, so push a test request context.
    with app.test_request_context('/'):
        app.preprocess_request()
    for page in Page.get_all_pages():
        if page is None:
            continue
        extraction = Extraction.get_extraction_by_page_id(page.id)
        if extraction is None:
            continue  # no extraction record for this page; nothing to populate
        requests.get("http://127.0.0.1:5000/extractions/bp/%s,%s" % (page.id, extraction.id))
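
The loop in `boilerpipe_extract_and_populate` builds one `/extractions/bp/<page_id>,<ext_id>` URL per page. A minimal sketch of that URL-building step, separated from the HTTP call so it can be exercised without a running server, might look like the following. The `bp_urls` function, the `lookup` parameter, and the choice to skip pages that have no extraction are assumptions made for illustration; only the URL pattern comes from the example.

```python
# URL pattern taken from the example above
BASE_URL = "http://127.0.0.1:5000/extractions/bp/%s,%s"


def bp_urls(pages, lookup):
    """Yield one boilerpipe URL per page that has an extraction.

    pages  -- iterable of objects with an `id` attribute (may contain None)
    lookup -- callable mapping a page id to an extraction or None
              (hypothetical stand-in for Extraction.get_extraction_by_page_id)
    """
    for page in pages:
        if page is None:
            continue
        extraction = lookup(page.id)
        if extraction is None:
            continue  # assumption: pages without an extraction are skipped
        yield BASE_URL % (page.id, extraction.id)
```

Each yielded URL could then be passed to `requests.get`, as the example does.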