Python BeautifulSoup.get_text usage examples

Programming language: Python

Namespace/Package: BeautifulSoup

Class/Type: BeautifulSoup

Method/Function: get_text

Examples on hotexamples.com: 2

Python BeautifulSoup.get_text - 2 examples found. These are the top Python code examples for BeautifulSoup.BeautifulSoup.get_text, taken from open source projects.
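
Before the project examples, a minimal sketch of what get_text returns may help. It assumes the bs4 package and its built-in "html.parser" backend; the sample markup and variable names are hypothetical and not taken from either project.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, not taken from any project on this page.
SAMPLE_HTML = "<div><p> Hello </p><p>world</p></div>"

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Default call: text nodes are concatenated as-is, whitespace included.
raw_text = soup.get_text()

# separator is placed between text nodes; strip=True trims each node
# and skips nodes that are empty after trimming.
clean_text = soup.get_text(separator=" ", strip=True)

print(repr(raw_text))    # ' Hello world'
print(repr(clean_text))  # 'Hello world'
```

The strip=True form is often enough when only short fragments are needed; longer pages usually call for the line-by-line cleanup shown in Example #2.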

Main methods

BeautifulSoup(30)

decompose(30)

first(30)

find_all(30)

findAll(30)

find(30)

fetch(30)

feed(30)

getText(29)

insert(20)

findChildren(19)

body(12)

close(11)

__str__(11)

encode(8)

new_tag(6)

findChild(5)

append(4)

prettify(4)

findSelect(4)

decode(4)

get(4)

__unicode__(3)

goahead(3)

lower(3)

div(3)

findall(3)

pretify(3)

__init__(3)

firstText(2)

pop(2)

data(2)

findNext(2)

read(2)

index(1)

html(1)

query(1)

json(1)

load(1)

re_left(1)

noscript(1)

orig_url(1)

partition(1)

popTag(1)

pretiffy(1)

head(1)

findNextSiblings(1)

group(1)

encodeContents(1)

attrs(1)

Example #1

File: beautiful-a.py Project: youarewinner/g2b

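#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# On Unix, to run this file as a CGI script, mark it executable with chmod +x
# and give the interpreter path in a shebang line like the one above.
# On Windows none of this is needed.
# The original targeted Python 2.4.3 and Beautiful Soup 2.1.1; get_text() is the
# bs4 name (Beautiful Soup 3 spells it getText), so this version uses Python 3 and bs4.
from urllib.request import urlopen
from bs4 import BeautifulSoup

html_source = urlopen('http://www.naver.com').read()
soup = BeautifulSoup(html_source, 'html.parser', from_encoding='utf-8')

for link in soup.find_all('a'):
#    print(link.get('href'))
    # Print each link's own text; the original printed the whole page once per link.
    print(link.get_text())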
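
The loop in Example #1 depends on a live page, so here is a hedged offline sketch of the same idea: calling get_text() on each tag rather than on the whole document. The markup and the `links` name are hypothetical, and bs4 with "html.parser" is assumed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded page.
SAMPLE_HTML = """
<html><body>
<a href="/news">News</a>
<a href="/mail">Mail <b>(3)</b></a>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# get_text() on a tag returns only that tag's text,
# including the text of nested tags such as <b>.
links = [(a.get("href"), a.get_text()) for a in soup.find_all("a")]

for href, text in links:
    print(href, text)
```

Collecting (href, text) pairs first keeps the extraction separate from the printing, which makes the result easy to inspect or reuse.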

Example #2

File: search.py Project: Khushboojk/Local_project

	################## Pre-Processing Of Text ###############################

	# data = soup.findAll(text=True)
	# [s.extract() for s in soup(['style', 'script', '[document]', 'head', 'title'])]
	# visible_text = soup.getText()

	# kill all script and style elements
	for script in soup(["script", "style", "title", "head", "[document]"]):
	    script.extract()    # rip it out

	# get text
	text = soup.get_text()

	# break into lines and remove leading and trailing space on each
	lines = (line.strip() for line in text.splitlines())
	# break multi-headlines into a line each
	chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
	# drop blank lines
	text = '\n'.join(chunk for chunk in chunks if chunk)

	visible_text = text.encode('utf-8')

	# take a 500-byte excerpt; slicing encoded bytes can cut a multi-byte character
	FewText = visible_text[2500:3000]


	for  words in SearchWords :
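
The fragment above is cut off and relies on a `soup` object created elsewhere. As a rough, self-contained sketch of the same preprocessing steps (remove non-visible elements, call get_text, normalise lines), something like the following could be used. The function name `visible_text`, the sample markup, and the choice of "html.parser" are assumptions, not part of the original project.

```python
from bs4 import BeautifulSoup


def visible_text(html):
    """Return the page text with script, style, title and head content removed."""
    soup = BeautifulSoup(html, "html.parser")

    # Detach elements whose content is not rendered as page text.
    for element in soup(["script", "style", "title", "head"]):
        element.extract()

    text = soup.get_text()
    # Trim each line, split on double spaces, and drop empty chunks.
    lines = (line.strip() for line in text.splitlines())
    chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
    return "\n".join(chunk for chunk in chunks if chunk)


# Hypothetical sample page used only for illustration.
SAMPLE_HTML = """<html><head><title>T</title><style>p {color: red}</style></head>
<body><script>var x = 1;</script>
<p>First line</p>
<p>Headline one  Headline two</p>
</body></html>"""

result = visible_text(SAMPLE_HTML)
print(result)
```

Returning a str and leaving any encoding or slicing to the caller avoids cutting multi-byte characters when an excerpt is taken.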