Python NearDuplicatesIndex.NearDuplicatesIndex Examples

Programming language: Python

Namespace/package name: ndindex

Class/type: NearDuplicatesIndex

Method/function: NearDuplicatesIndex

Examples at hotexamples.com: 3

Python NearDuplicatesIndex.NearDuplicatesIndex - 3 examples found. These are the top-rated real-world Python examples of ndindex.NearDuplicatesIndex.NearDuplicatesIndex, extracted from open source projects.

Frequently used methods

NearDuplicatesIndex(3)

append(3)

get_jaccard(3)

appendif(1)

Example #1

    def __init__(self, test_docs_dir="./test"):
        self.test_docs_dir = test_docs_dir
        self.files = []

        self.index = NearDuplicatesIndex()

        # Connect to the database (requires `import psycopg2` at module level).
        # Re-raise on failure: nothing below can run without a connection.
        try:
            conn = psycopg2.connect("dbname='djangology' user='******' password=''")
        except psycopg2.Error:
            print("I am unable to connect to the database.")
            raise

        cur = conn.cursor()
        try:
            cur.execute("SELECT * FROM dj_document")
        except psycopg2.Error:
            print("I can't SELECT from dj_document")
            raise

        # Calculate near-duplicates index: row[1] is passed as the document,
        # row[0] as its name.
        for row in cur.fetchall():
            self.index.append(row[1], row[0])
            self.files.append(row[0])
        cur.close()
        conn.close()

Example #2

File: detector.py Project: parkr/near-dup-detection

    def __init__(self, test_docs_dir="./test"):
        self.test_docs_dir = test_docs_dir
        self.files = [
            d for d in os.listdir(test_docs_dir)
            if os.path.isfile(os.path.join(test_docs_dir, d)) and d[0] != "."
        ]

        self.index = NearDuplicatesIndex()

        # Calculate near-duplicates index
        for file in self.files:
            filename = self.filename(file)
            with open(filename) as f:
                doc = f.read().strip().strip(
                    ",.!|&-_()[]<>{}/\"'").strip().split(" ")
                self.index.append(doc, filename)
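
The tokenization used in the detector.py constructor above can be tried in isolation. The sketch below is not part of that project: `tokenize_like_example` and `tokenize_per_word` are hypothetical helper names introduced only for illustration. It shows that the chained `strip` calls remove punctuation only from the two ends of the whole text, so punctuation attached to inner words survives the split.

```python
PUNCTUATION = ",.!|&-_()[]<>{}/\"'"


def tokenize_like_example(text):
    """Mirror the expression from detector.py: trim the ends, then split on spaces."""
    return text.strip().strip(PUNCTUATION).strip().split(" ")


def tokenize_per_word(text):
    """Hypothetical alternative: strip punctuation from every word, drop empties."""
    words = (w.strip(PUNCTUATION) for w in text.split())
    return [w for w in words if w]


sample = "Hello, world! (This is a test.)\n"
print(tokenize_like_example(sample))  # inner punctuation is kept: 'Hello,', 'world!', '(This'
print(tokenize_per_word(sample))      # every token is cleaned individually
```

Whether the per-word variant is preferable depends on how the index compares tokens; with the original expression, "Hello," and "Hello" count as different words.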

Example #3

File: ndindex_spec.py Project: mcavdar/aicomp

    def setUp(self):
        # Two four-word documents that differ in a single word.
        self.docs = []
        self.docs.append(['this', 'is', 'a', 'document'])
        self.docs.append(['this', 'is', 'b', 'document'])
        self.index = NearDuplicatesIndex()
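
For a sense of what a Jaccard score over the two fixture documents might look like, here is a minimal sketch. It assumes plain Jaccard similarity over sets of words and Python 3 division. `jaccard` is a hypothetical helper, and the index's own `get_jaccard` may work differently (for example on shingles or hashed sketches), so the number is illustrative only.

```python
def jaccard(doc_a, doc_b):
    """Jaccard similarity of two token lists: |A & B| / |A | B|."""
    a, b = set(doc_a), set(doc_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty documents count as identical
    return len(a & b) / len(a | b)


docs = [
    ['this', 'is', 'a', 'document'],
    ['this', 'is', 'b', 'document'],
]
print(jaccard(docs[0], docs[1]))  # 3 shared words out of 5 distinct words -> 0.6
```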