Python Document.select Examples

Programming language: Python

Namespace/package name: osp.corpus.models

Class/type: Document

Method/function: select

Examples on hotexamples.com: 9

Python Document.select: 9 examples found. These are top-rated real-world Python examples of osp.corpus.models.Document.select, extracted from open source projects.

Frequently used methods

create(7)

select(4)

get(2)

insert_documents(1)

page_cursor(1)

Example #1

File: test_insert_documents.py Project: tollycoast/open-syllabus-project

def test_insert_documents(mock_osp):

    """
    Document.insert_documents() should create a row for each syllabus.
    """

    # 10 segments x 10 files.
    for s in segment_range(10):
        for i in range(10):
            mock_osp.add_file(segment=s, name=s + "-" + str(i))

    # Insert document rows.
    Document.insert_documents()

    # Should create 100 rows.
    assert Document.select().count() == 100

    # All docs should have rows.
    for s in segment_range(10):
        for i in range(10):

            # Path is [segment]/[file]
            path = s + "/" + s + "-" + str(i)

            # Query for the document path.
            query = Document.select().where(Document.path == path)
            assert query.count() == 1
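
The two assertions above amount to a total row count and a per-path lookup. As a rough illustration, the same checks can be sketched with the standard-library `sqlite3` module standing in for the project's `Document` model. The `document` table layout and the `segment_range` helper below are hypothetical stand-ins, and the three-character zero-padded segment names are an assumption based on the `000` and `001` names used in the tests on this page.

```python
import sqlite3


def segment_range(n):
    """Hypothetical stand-in: yield n zero-padded segment names ('000', '001', ...)."""
    for i in range(n):
        yield "{:03x}".format(i)


# In-memory table standing in for the Document model (assumed layout).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document (id INTEGER PRIMARY KEY, path TEXT UNIQUE)")

# 10 segments x 10 files, each stored as [segment]/[file].
paths = [
    "{0}/{0}-{1}".format(s, i)
    for s in segment_range(10)
    for i in range(10)
]
conn.executemany("INSERT INTO document (path) VALUES (?)", [(p,) for p in paths])

# Rough equivalent of Document.select().count().
total = conn.execute("SELECT COUNT(*) FROM document").fetchone()[0]


def count_by_path(path):
    """Rough equivalent of Document.select().where(Document.path == path).count()."""
    row = conn.execute(
        "SELECT COUNT(*) FROM document WHERE path = ?", (path,)
    ).fetchone()
    return row[0]
```

Here `total` plays the role of the 100-row assertion, and `count_by_path(p)` plays the role of the per-document check inside the nested loop.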

Example #2

def queue_text():
    """
    Queue text extraction tasks in the worker.
    """

    for doc in query_bar(Document.select()):
        config.rq.enqueue(ext_text, doc.id)

Example #3

File: corpus.py Project: MichaelEdage/open-syllabus-project

def queue_text():

    """
    Queue text extraction tasks in the worker.
    """

    for doc in query_bar(Document.select()):
        config.rq.enqueue(ext_text, doc.id)
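
The page does not show `query_bar`, so the following is only a guess at its role: a wrapper that yields each row while reporting progress. In this minimal sketch, `query_bar`, `queue_all`, and the `enqueue` callable are all hypothetical names, with a plain callable taking the place of `config.rq.enqueue`.

```python
import sys


def query_bar(rows, total=None, every=1000, out=sys.stderr):
    """Hypothetical stand-in: yield each row, reporting progress periodically."""
    for n, row in enumerate(rows, start=1):
        if n % every == 0:
            suffix = "/{}".format(total) if total is not None else ""
            out.write("processed {}{}\n".format(n, suffix))
        yield row


def queue_all(doc_ids, enqueue):
    """Pass every document id to `enqueue` and return how many were queued."""
    queued = 0
    for doc_id in query_bar(doc_ids, every=2):
        enqueue(doc_id)
        queued += 1
    return queued
```

Passing the enqueue function in as an argument keeps the loop testable without a running worker queue.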

Example #4

File: fields.py Project: MichaelEdage/open-syllabus-project

def run_doc_to_fields():

    """
    Match documents -> fields.
    """

    for doc in query_bar(Document.select()):
        try:
            doc_to_fields(doc.id)
        except Exception:
            pass  # Best effort: skip documents that fail.
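
The loop above discards every error, which makes it hard to tell afterwards which documents were skipped. One possible alternative, sketched here with the standard `logging` module, is to record each failure and collect the affected ids. `run_for_all` and its `task` parameter are hypothetical names introduced for illustration and are not part of the project.

```python
import logging

logger = logging.getLogger(__name__)


def run_for_all(doc_ids, task):
    """Run `task` on each id; log failures and return the ids that failed."""
    failed = []
    for doc_id in doc_ids:
        try:
            task(doc_id)
        except Exception:
            # Record the traceback instead of silently discarding the error.
            logger.exception("task failed for document %s", doc_id)
            failed.append(doc_id)
    return failed
```

Catching `Exception` rather than using a bare `except` lets `KeyboardInterrupt` and `SystemExit` still stop a long-running job.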

Example #5

File: inst.py Project: MichaelEdage/open-syllabus-project

def run_doc_to_inst():

    """
    Match documents -> institutions.
    """

    for doc in query_bar(Document.select()):
        try:
            doc_to_inst(doc.id)
        except Exception:
            pass  # Best effort: skip documents that fail.

Example #6

def run_doc_to_inst():
    """
    Match documents -> institutions.
    """

    for doc in query_bar(Document.select()):
        try:
            doc_to_inst(doc.id)
        except Exception:
            pass

Example #7

File: fields.py Project: project-renard-survey/open-syllabus-project

def run_doc_to_fields():
    """
    Match documents -> fields.
    """

    for doc in query_bar(Document.select()):
        try:
            doc_to_fields(doc.id)
        except Exception:
            pass

Example #8

File: institution_document.py Project: project-renard-survey/open-syllabus-project

    def link(cls):

        """
        Link documents -> institutions.
        """

        domain_to_inst = defaultdict(list)

        # Map domain -> [(regex, inst), ...]
        for inst in ServerSide(Institution.select()):

            domain = parse_domain(inst.url)

            regex = seed_to_regex(inst.url)

            domain_to_inst[domain].append((regex, inst))

        for doc in query_bar(Document.select()):

            try:

                # TODO: Get rid of @property.
                url = doc.syllabus.url

                domain = parse_domain(url)

                # Find institutions with matching URLs.
                matches = []
                for pattern, inst in domain_to_inst[domain]:

                    match = pattern.search(url)

                    if match:
                        matches.append((match.group(), inst))

                if matches:

                    # Sort by length of match, descending.
                    matches = sorted(
                        matches,
                        key=lambda x: len(x[0]),
                        reverse=True,
                    )

                    # Link to the institution with the longest match.
                    cls.create(
                        institution=matches[0][1],
                        document=doc,
                    )

            except Exception as e:
                print(e)
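
The linking logic above groups institution patterns by domain, then picks the institution whose pattern matches the longest part of the document URL. The following self-contained sketch isolates that selection step. The page does not show `parse_domain` or `seed_to_regex`, so the versions here are simplified, hypothetical stand-ins, and `best_institution` is a name introduced only for this illustration.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse


def parse_domain(url):
    """Hypothetical stand-in: return the host part of a URL."""
    return urlparse(url).netloc.lower()


def seed_to_regex(seed_url):
    """Hypothetical stand-in: match the seed's host and path prefix literally."""
    parsed = urlparse(seed_url)
    prefix = parsed.netloc.lower() + parsed.path.rstrip("/")
    return re.compile(re.escape(prefix))


def best_institution(url, institutions):
    """Return the name whose seed URL gives the longest match in `url`, or None."""
    # Map domain -> [(regex, name), ...]
    domain_to_inst = defaultdict(list)
    for name, seed in institutions:
        domain_to_inst[parse_domain(seed)].append((seed_to_regex(seed), name))

    matches = []
    for pattern, name in domain_to_inst.get(parse_domain(url), []):
        match = pattern.search(url)
        if match:
            matches.append((match.group(), name))

    if not matches:
        return None
    # max() with a length key picks the same winner as sorting descending
    # and taking the first element.
    return max(matches, key=lambda m: len(m[0]))[1]
```

With seeds `http://example.edu` and `http://example.edu/law`, a syllabus under `/law/` resolves to the second, more specific seed because its match is longer.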

Example #9

File: test_insert_documents.py Project: tollycoast/open-syllabus-project

def test_insert_new_documents(mock_osp):

    """
    When new documents are added to the corpus, just the new documents should
    be registered in the database.
    """

    # 10 files in `000`.
    for i in range(10):
        mock_osp.add_file(segment="000", name="000-" + str(i))

    # Should add 10 docs.
    Document.insert_documents()
    assert Document.select().count() == 10

    # 10 new files in `001`.
    for i in range(10):
        mock_osp.add_file(segment="001", name="001-" + str(i))

    # Should add 10 docs.
    Document.insert_documents()
    assert Document.select().count() == 20
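
The test above expects a second call to register only the files that do not have rows yet. The page does not show how `insert_documents` achieves this, so the sketch below illustrates just one way to get that behavior, using `sqlite3` with a unique constraint and `INSERT OR IGNORE`. The table layout and the `insert_new_documents` helper are assumptions made for this example.

```python
import sqlite3


def insert_new_documents(conn, paths):
    """Insert only paths without a row yet; return how many rows were added."""
    before = conn.execute("SELECT COUNT(*) FROM document").fetchone()[0]
    # The UNIQUE constraint on `path` makes OR IGNORE skip existing rows.
    conn.executemany(
        "INSERT OR IGNORE INTO document (path) VALUES (?)",
        [(p,) for p in paths],
    )
    after = conn.execute("SELECT COUNT(*) FROM document").fetchone()[0]
    return after - before


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document (id INTEGER PRIMARY KEY, path TEXT UNIQUE)")

first = ["000/000-{}".format(i) for i in range(10)]
second = ["001/001-{}".format(i) for i in range(10)]

added_first = insert_new_documents(conn, first)
# The second pass sees all 20 files but should add only the 10 new ones.
added_second = insert_new_documents(conn, first + second)
```

Relying on a database constraint keeps the operation safe to repeat, since re-running it over the full corpus never duplicates rows.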