Python examples for Cleaner.remove_punct

Programming language: Python

Namespace/package name: Cleaner

Class/type: Cleaner

Method/function: remove_punct

Examples on hotexamples.com: 1

Cleaner.remove_punct in Python: 1 example found. This is the top-rated real-world Python example of Cleaner.Cleaner.remove_punct, extracted from open source projects.

Frequently used methods

Cleaner(30)

clean_bmi(6)

Clean_Birthday(5)

Clean_Age(4)

clean_text(4)

clean_gender(3)

clean(3)

preprocess_text(2)

n_gram(2)

text_header_remover(2)

clean_file(2)

clean_empid(2)

__init__(2)

stop(1)

run(1)

replace(1)

remove_punct(1)

remove_non_marked(1)

remove_nan(1)

remove_multiple_method_comments(1)

preprocess_danmu(1)

case_fold(1)

get_df(1)

get_data_category_count(1)

get_clean(1)

getDF(1)

cleanSubtitles(1)

extractDate(1)

edit_bulk_comments(1)

delete_tags(1)

clean_df(1)

getContent(1)

Example #1

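import re

from bs4 import BeautifulSoup as BS

# Cleaner and GL are project-local names; their imports are not part of this excerpt.


class QCleaner: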
    
    def __init__(self, queryFile, queryJSON):
        #Initialize the cleaner object
        self._cleaner = Cleaner(" ", " ")
        #txt file in which all queries are stored
        self._qFile = queryFile
        #json file to store the queries after cleaning
        self._qJson = queryJSON
        #list to store raw queries
        self._queryList = list()
        #list to store refined queries
        self._queryDict = dict()
        #stopList
        self._stopList = list()
        #QueryID initialized to 1
        self._qID = 1

    
    def cleanQueries(self):
        
        # Flags for [case-folding, punctuation removal, stopping]; 1 = enabled
        choices = [0, 0, 0]

        # Ask which cleaning steps to apply (answer Y or y to enable a step)
        choice = input("Perform case-folding? ")
        if choice in ('Y', 'y'):
            choices[0] = 1

        choice = input("Remove Punctuation? ")
        if choice in ('Y', 'y'):
            choices[1] = 1

        choice = input("Perform stopping? ")
        if choice in ('Y', 'y'):
            choices[2] = 1
    
        self.getQueries()
        
        for query in self._queryList:
            # Split on newline characters so getContent receives the query's lines
            refinedQuery = self._cleaner.getContent(query.split("\n"))
            
            if choices[0] == 1:
                refinedQuery = self._cleaner.case_fold(refinedQuery)
            if choices[1] == 1:
                refinedQuery = self._cleaner.remove_punct(refinedQuery)
            if choices[2] == 1:
                refinedQuery = self._cleaner.stop(refinedQuery)
        
            # Join the tokens, collapse runs of whitespace, and trim both ends
            rQuery = re.sub(r'\s+', ' ', " ".join(refinedQuery)).strip()

            # Store the cleaned query under its sequential ID
            self._queryDict[self._qID] = rQuery
            self._qID += 1
        GL.dictToJson(self._qJson, self._queryDict)
            
        
    def getQueries(self):
        
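        # Parse the query file; the with block closes it once parsing is done
        with open(self._qFile) as handle:
            soup = BS(handle, "html.parser")
        # Drop the <docno> tags so only the query text remains
        for doc in soup.find_all("docno"):
            doc.extract()
        # Collect the stripped text of each <doc> element as a raw query
        for query in soup.find_all("doc"):
            self._queryList.append(query.text.strip())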
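
The Cleaner class itself is not part of this excerpt, so what remove_punct actually does to the token list is not visible here. As a minimal sketch, assuming it receives a list of tokens and returns that list with ASCII punctuation stripped from each token (and tokens that end up empty dropped), a stand-in might look like the following. The name remove_punct_sketch and the choice of string.punctuation are assumptions for illustration, not the project's code.

```python
import string

# Hypothetical stand-in: the real Cleaner.remove_punct is not shown in the
# excerpt, so the behavior below is an assumption.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def remove_punct_sketch(tokens):
    """Strip ASCII punctuation from each token and drop tokens left empty."""
    stripped = (token.translate(_PUNCT_TABLE) for token in tokens)
    return [token for token in stripped if token]


if __name__ == "__main__":
    # "time-sharing" loses its hyphen and the lone "?" disappears entirely
    print(remove_punct_sketch(["what", "is", "time-sharing", "?"]))
```

A function of this shape could be dropped into the loop in cleanQueries, since that loop only requires that the result stays an iterable of string tokens.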