def poolDetail(start, end):
    """Fetch tags and translate titles for pending records with start <= _id < end."""
    getObj = videoDemo()
    dbObj = Dbobj('redio', 're_')
    arr = []
    # if type == 0:
    #     _sort = pymongo.DESCENDING
    # else:
    #     _sort = pymongo.ASCENDING
    curTableObj = dbObj.getTbname('redios')
    base = 'https://www.qq.com/'
    while True:
        # Process pending records (status 0) in batches of 10, newest first.
        cursor = curTableObj.find({
            "status": 0,
            "_id": {
                "$gte": start,
                "$lt": end
            }
        }).sort('_id', pymongo.DESCENDING).limit(10)
        # find() returns a cursor, never None, so materialize the batch
        # and stop once no pending records remain.
        data = list(cursor)
        #pool = Pool(len(data))
        if not data:
            break
        g = GoogleTranslator()
        for v in data:
            curUrl = base + v['rel'].strip('/')
            #curTableObj.run(curUrl, v['id'])
            tags = getObj.vdetail(curUrl, v['id'])
            # Strip apostrophes and ampersands before translating.
            v['title'] = v['title'].replace("'", '').replace('&', '')
            s = g.translate(v['title'])
            _curUpData = {}
            if s == '':
                # Status 3: translation came back empty.
                _curUpData['status'] = 3
            else:
                # Status 1: translated successfully.
                _curUpData['status'] = 1
                _curUpData['ctitle'] = s
            if tags is not False:
                _curUpData['tags'] = tags
            curTableObj.update({"id": v['id']}, {"$set": _curUpData})
            time.sleep(0.5)
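

def deRange(num):
    """Crawl listing pages from `num` down to `num - 10000` and upsert new records."""
    getObj = videoDemo()
    dbObj = Dbobj('redio', 're_')
    arr = []
    curTableObj = dbObj.getTbname('redios')
    # The original started i at num - 10000 and decremented it while
    # num >= i, which never terminates. Assuming a descending walk was
    # intended, count down from num to the lower bound instead.
    stop = num - 10000
    i = num
    while i >= stop:
        url = "https://www.xvideos.com/new/" + str(i)
        for v in getObj.getVedioUrl(url):
            if v['id'] != '':
                v['_id'] = dbObj.getNextValue('redios')
                v['tags'] = v['cates'] = ''
                v['status'] = 0
                # $setOnInsert with upsert=True leaves existing records untouched.
                curTableObj.update({"id": v['id'].strip()}, {"$setOnInsert": v}, upsert=True)
        #p.apply_async(_getList, args=(getObj,curTableObj,url))
        i -= 1
        # Pause briefly after every tenth page.
        if i % 10 == 0:
            time.sleep(0.5)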
        # Strip slashes, apostrophes, and ampersands before translating.
        title = v['title'].strip('/').replace("'", '').replace('&', '')
        #print(title)
        tags = trans(translator, title)
        #print(tags);exit('3')
        if tags == '':
            # Status 3: translation came back empty.
            curTableObj.update({"id": v['id']}, {"$set": {"status": 3}})
        else:
            curTableObj.update({"id": v['id']}, {"$set": {"ctitle": tags}})
        i += 1
        # Pause briefly after every tenth record (the original test,
        # `if i % 10:`, slept on every record except each tenth).
        if i % 10 == 0:
            time.sleep(0.3)


if __name__ == '__main__':
    getObj = videoDemo()
    dbObj = Dbobj('redio', 're_')
    arr = []
    curTableObj = dbObj.getTbname('redios')
    #pTrans(curTableObj,1);exit('5')
    #base = 'https://www.xvideos.com/'
    #_count = len(data)
    # Run pTrans in five worker processes, numbered 5 down to 1.
    p = Pool(processes=5)
    i = 5
    while i > 0:
        p.apply_async(pTrans, args=(curTableObj, i))
        i -= 1
    p.close()
    p.join()
    print('Success')
    # p.apply_async(curTableObj.vdetail, args=(curUrl,v['id']))
    def __init__(self):
        self.dbObj = Dbobj('redio', 're_')