Python TextMap.split_spans Examples

Programming Language: Python

Namespace/Package Name: analyser.documents

Class/Type: TextMap

Method/Function: split_spans

Examples at hotexamples.com: 2

Two examples of analyser.documents.TextMap.split_spans, extracted from open source projects.

Frequently Used Methods


TextMap(28)

text_range(10)

slice(4)

token_index_by_char(4)

sentence_at_index(2)

split_spans(2)

tokens_by_range(2)

char_range(1)

finditer(1)

split(1)

token_indices_by_char_range(1)

Example #1


File: test_tokenization.py Project: nemoware/analyser

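  # Assumes: from analyser.documents import TextMap
  def test_split_span_add_delimiters(self):
    # Mixed-script input: ASCII, Cyrillic and Kannada lines separated by '\n'.
    text = '1 2 3\nмама\nಶ್ರೀರಾಮ'
    tm = TextMap(text)

    # With add_delimiter=True each span keeps its trailing delimiter.
    spans = list(tm.split_spans('\n', add_delimiter=True))
    for span in spans:
      print(tm.text_range(span))

    self.assertEqual('1 2 3\n', tm.text_range(spans[0]))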
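
The assertion above shows that, with add_delimiter=True, the first span covers '1 2 3\n' including the newline. A minimal stand-alone sketch of that behaviour on a plain string follows. It is not the TextMap implementation: the function name split_char_spans is hypothetical, and it returns character offsets, whereas the spans TextMap returns may well be expressed in token indices.

```python
from typing import Iterator, Tuple


def split_char_spans(text: str, delimiter: str,
                     add_delimiter: bool = False) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) character offsets of the pieces of `text`.

    Hypothetical helper for illustration only; not part of analyser.documents.
    """
    if not delimiter:
        raise ValueError('delimiter must be non-empty')
    start = 0
    while True:
        found = text.find(delimiter, start)
        if found == -1:
            break
        # Optionally extend the span so it covers the delimiter itself.
        stop = found + len(delimiter) if add_delimiter else found
        yield start, stop
        start = found + len(delimiter)
    # Trailing piece after the last delimiter, if any text remains.
    if start < len(text):
        yield start, len(text)


text = '1 2 3\nмама\nಶ್ರೀರಾಮ'
spans = list(split_char_spans(text, '\n', add_delimiter=True))
pieces = [text[a:b] for a, b in spans]
```

Because every delimiter stays attached to the piece before it, joining the pieces reproduces the original text, which is convenient when spans are later mapped back to the document.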

Example #2


File: headers_detector.py Project: nemoware/analyser

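import numpy as np
import pandas as pd

from analyser.documents import TextMap

# PARAGRAPH_DELIMITER and line_features are not shown in this excerpt.


def doc_features(tokens_map: TextMap):
    # One span per paragraph; add_delimiter=True keeps the delimiter in each span.
    body_lines_ranges = tokens_map.split_spans(PARAGRAPH_DELIMITER,
                                               add_delimiter=True)

    _doc_features = []
    _line_spans = []
    _prev_features = None
    for ln, line_span in enumerate(body_lines_ranges):
        _line_spans.append(line_span)

        # Each line's features receive the previous line's features as context.
        _features = line_features(tokens_map, line_span, ln, _prev_features)
        _doc_features.append(_features)
        _prev_features = _features

    # Collect the per-line records into a DataFrame, then convert to an array.
    doc_features_data = np.array(pd.DataFrame.from_records(_doc_features))

    return doc_features_data, _line_spans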
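
The loop in Example #2 passes each line's features to the next call, so a line can be described relative to the one before it. The sketch below isolates that pattern using only the standard library. The names toy_line_features and toy_doc_features and the features themselves (length, length_delta) are hypothetical stand-ins, not the project's line_features, and the spans are hard-coded character offsets rather than values returned by TextMap.

```python
from typing import Dict, List, Optional, Tuple

Span = Tuple[int, int]


def toy_line_features(text: str, span: Span, line_number: int,
                      prev: Optional[Dict[str, int]]) -> Dict[str, int]:
    """Hypothetical stand-in for line_features; not the project's feature set."""
    line = text[span[0]:span[1]]
    length = len(line.strip())
    return {
        'line_number': line_number,
        'length': length,
        # Change in length relative to the previous line (0 for the first line).
        'length_delta': length - prev['length'] if prev is not None else 0,
    }


def toy_doc_features(text: str, spans: List[Span]) -> List[Dict[str, int]]:
    """Compute features per span, threading the previous result forward."""
    features = []
    prev = None
    for line_number, span in enumerate(spans):
        current = toy_line_features(text, span, line_number, prev)
        features.append(current)
        prev = current
    return features


text = 'Title\nfirst paragraph\nend'
spans = [(0, 6), (6, 22), (22, 25)]  # each span includes its trailing '\n'
rows = toy_doc_features(text, spans)
```

A list of dicts like rows is the kind of input pd.DataFrame.from_records accepts, which is how the original function builds its feature matrix.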