Esempi in Python per Parse.extract_parts

Linguaggio di programmazione: Python

Spazio dei nomi/nome del pacchetto: parse

Classe/tipologia: Parse

Metodo/funzione: extract_parts

Esempi su hotexamples.com: 1

Parse.extract_parts in Python: 1 esempio trovato. Questo è il miglior esempio reale in Python per parse.Parse.extract_parts, estratto da progetti open source. Lo puoi valutare, per aiutarci a migliorare la qualità dei nostri esempi.

Metodi utilizzati di frequente

Mostra Nascondi

Parse(30)

fixed(23)

setEc(10)

evaluate(9)

parse(9)

printStaff(8)

addVariable(7)

parsePage(4)

json(4)

crime(3)

get_specific_crime(3)

dynamic(3)

array(3)

grep(3)

key_values(3)

location(3)

parse_main_page_get_total_pagenum(2)

getRiemannIntegrals(2)

getDifferenceQuotion(2)

model(2)

parseCorpus(2)

hasMoreCommands(2)

main(2)

progress(2)

advance(2)

cmdType(2)

addVarFromList(2)

arg2(2)

arg1(2)

set_timespan(1)

parseData(1)

next(1)

nl_command(1)

parseToJSON(1)

ner(1)

moves(1)

shutdown(1)

modifyNote(1)

start(1)

set_file_path(1)

parse_content(1)

parse_Img(1)

parse_school(1)

prepare(1)

parser(1)

parse_title(1)

regex(1)

regexes(1)

request(1)

parse_request(1)

Esempio n. 1

Mostra file

File: parse-dailynews.py Progetto: nydailynews/vendor-nav

def main(args):
    """ Example usage:
        Describe what we do in this file, then give an example of a command you
        might run on the command line.
        $ python parse-dailynews.py
        """
    fn = 'dailynews.new'
    fh = open(fn, 'rb')
    markup = fh.read()
    if markup[:3] != '   ':
        # We have a gunzip'ed file we have to extract.
        # We know this because the first three characters of www.nydailynews.com urls are always '   '.
        # Always. They are always '   '.
        markup = gzip.GzipFile(fn, 'r').read()

    # Results of this parsing is stored in p.content
    regexes = {
        'header':
        '<header\ id="rh">.*<div\ id="header-container"\ data-reg-role="header-container"></div>\ </header>',
        'footer': '<footer\ id="rf">.*</footer>'
    }
    p = Parse()
    p.regexes = regexes
    p.regex = 'header'
    p.extract_parts(markup)
    p.regex = 'footer'
    p.extract_parts(markup)

    # Turn the nav markup into actionable javascript
    fh = open('html/template-dailynews.js', 'rb')
    js = fh.read()
    # When life was simple:
    #js = js.replace('{{header}}', " ".join(p.content['header'].replace("\n", "\\n").replace("'", "\\'").replace('/', '\/').splitlines()))
    # Life now:
    js = js.replace(
        '{{header}}',
        " ".join(p.content['header'].replace(
            'href="/', 'href="http://www.nydailynews.com/').replace(
                "http://assets.nydaily",
                "//www.nydaily").replace("'", "\\'").replace(
                    '/', '\/').replace("\n", "\\n").replace(
                        'join("\\n")',
                        'join("\\\\n")').replace('/\\n+$', '/\\\\n+$').replace(
                            'rh-app.jpg"', 'rh-app.jpg" alt=""').replace(
                                'rh-subscribe.jpg"',
                                'rh-subscribe.jpg" alt=""').replace(
                                    'notification.png"',
                                    'notification.png" alt=""').
                 replace("search_action();'></button>",
                         "search_action();'>SEARCH</button>").splitlines()))
    js = js.replace(
        '{{footer}}', p.content['footer'].replace(
            'href="/', 'href="http://www.nydailynews.com/').replace(
                'article_750', 'article_250').replace(
                    "http://assets.nydaily", "https://www.nydaily").replace(
                        "'",
                        "\\'").replace('/', '\/').replace("\n", "\\n").replace(
                            'join("\\n")', 'join("\\\\n")').replace(
                                '/\\n+$', '/\\\\n+$').replace(
                                    '7.2945742 -->   <style>\n',
                                    '7.2945742 -->   <style>').replace(
                                        '\r', ''))

    fh = open('html/head.html', 'rb')
    head_markup = fh.read()

    # Write the file
    if p.content['header'] != '':
        f = FileWrapper('output/header.html')
        f.write(p.content['header'])
        f = FileWrapper('output/header-iframeable.html')
        f.write('%s%s' % (head_markup, p.content['header']))
    if p.content['footer'] != '':
        f = FileWrapper('output/footer.html')
        f.write(p.content['footer'])
        f = FileWrapper('output/footer-iframeable.html')
        f.write('%s%s' % (head_markup, p.content['footer']))
    if p.content['footer'] != '' and p.content['header'] != '':
        f = FileWrapper('output/vendor-include.js')
        f.write(js)