def get_igraph_g(self):
    from smarttypes.model.twitter_user import TwitterUser
    from smarttypes.graphreduce.reduce_graph import get_igraph_graph
    network = {}
    for score, user_id in self.get_members():
        user = TwitterUser.get_by_id(user_id, self.postgres_handle)
        network[user.id] = set(user.following_ids)
    g = get_igraph_graph(network)
    pagerank = g.pagerank(damping=0.65)
    both = zip(pagerank, g.vs['name'])
    for x, y in sorted(both):
        print x
        print TwitterUser.get_by_id(y, self.postgres_handle).screen_name

def get_igraph_g(self):
    from smarttypes.model.twitter_user import TwitterUser
    from smarttypes.graphreduce.reduce_graph import get_igraph_graph
    network = {}
    for score, user_id in self.get_members():
        user = TwitterUser.get_by_id(user_id, self.postgres_handle)
        network[user.id] = set(user.following_ids)
    g = get_igraph_graph(network)
    pagerank = g.pagerank(damping=0.65)
    both = zip(pagerank, g.vs["name"])
    for x, y in sorted(both):
        print x
        print TwitterUser.get_by_id(y, self.postgres_handle).screen_name

def load_user_and_the_people_they_follow(creds, user_id, postgres_handle):
    remaining_hits_threshold = 10
    api_handle = creds.api_handle
    root_user = creds.root_user
    is_root_user = False
    if root_user.id == user_id:
        is_root_user = True
    # if is_root_user and 'root_user.is_fake_user':
    #     return None
    remaining_hits, reset_time = get_rate_limit_status(api_handle)
    if remaining_hits < remaining_hits_threshold:
        raise Exception("%s: remaining_hits less than threshold!" % root_user.screen_name)
    try:
        api_user = api_handle.get_user(user_id=user_id)
    except TweepError, ex:
        print "%s: api_handle.get_user(%s) got a TweepError %s" % (root_user.screen_name, user_id, ex)
        if 'Sorry, that page does not exist' in str(ex) or 'User has been suspended' in str(ex):
            print 'setting caused_an_error'
            model_user = TwitterUser.get_by_id(user_id, postgres_handle)
            if not model_user:
                properties = {'id': user_id, 'screen_name': user_id}
                model_user = TwitterUser(postgres_handle=postgres_handle, **properties)
                model_user.save()
                postgres_handle.connection.commit()
            model_user.caused_an_error = datetime.now()
            model_user.save()
            postgres_handle.connection.commit()
        return None

def index(req, session, postgres_handle):
    root_user = None
    if 'user_id' in req.params:
        root_user = TwitterUser.get_by_id(req.params['user_id'], postgres_handle)
    if not root_user:
        root_user = TwitterUser.by_screen_name('SmartTypes', postgres_handle)
    reduction = TwitterReduction.get_latest_reduction(root_user.id, postgres_handle)
    if not reduction:
        root_user = TwitterUser.by_screen_name('SmartTypes', postgres_handle)
        reduction = TwitterReduction.get_latest_reduction(
            root_user.id, postgres_handle)
    return {
        'active_tab': 'social_map',
        'template_path': 'social_map/index.html',
        'root_user': root_user,
        'reduction': reduction,
        'num_groups': len(TwitterGroup.all_groups(reduction.id, postgres_handle)),
        'users_with_a_reduction': TwitterReduction.get_users_with_a_reduction(postgres_handle),
    }

def get_user_reduction_counts(cls, postgres_handle):
    from smarttypes.model.twitter_user import TwitterUser
    return_users = []
    qry = """
    select root_user_id, count(root_user_id) as reduction_count
    from twitter_reduction
    group by root_user_id;
    """
    for result in postgres_handle.execute_query(qry):
        user = TwitterUser.get_by_id(result['root_user_id'], postgres_handle)
        return_users.append((user, result['reduction_count']))
    return return_users

def get_users_with_a_reduction(cls, postgres_handle):
    from smarttypes.model.twitter_user import TwitterUser
    return_users = []
    qry = """
    select distinct root_user_id
    from twitter_reduction
    order by root_user_id;
    """
    for result in postgres_handle.execute_query(qry):
        user = TwitterUser.get_by_id(result['root_user_id'], postgres_handle)
        return_users.append(user)
    return return_users

def top_users(self, num_users=20, just_ids=False):
    from smarttypes.model.twitter_user import TwitterUser
    return_list = []
    score_user_id_tup_list = self.get_members()
    for score, user_id in heapq.nlargest(num_users, score_user_id_tup_list):
        if score:
            add_this = (score, user_id)
            if not just_ids:
                add_this = (score, TwitterUser.get_by_id(user_id, self.postgres_handle))
            return_list.append(add_this)
        else:
            break
    return return_list

def user(request):
    if 'user_id' in request.params:
        user_id = int(request.params['user_id'])
        twitter_user = TwitterUser.get_by_id(user_id)
    else:
        screen_name = request.params['screen_name']
        twitter_user = TwitterUser.by_screen_name(screen_name)
    return {
        'twitter_user': twitter_user,
    }

def node_details(req, session, postgres_handle):
    twitter_user, in_links, out_links = None, [], []
    if 'node_id' in req.params and 'reduction_id' in req.params:
        reduction = TwitterReduction.get_by_id(req.params['reduction_id'], postgres_handle)
        twitter_user = TwitterUser.get_by_id(req.params['node_id'], postgres_handle)
        if twitter_user:
            in_links, out_links = reduction.get_in_and_out_links_for_user(req.params['node_id'])
    return {
        'template_path': 'social_map/node_details.html',
        'twitter_user': twitter_user,
        'in_links': in_links,
        'out_links': out_links,
    }

def top_users(self, num_users=20, just_ids=False):
    from smarttypes.model.twitter_user import TwitterUser
    return_list = []
    i = 0
    for score, user_id in sorted(self.scores_users, reverse=True):
        # Stop after num_users entries (i is zero-based) or once scores drop to noise.
        if i < num_users and score > .001:
            add_this = (score, user_id)
            if not just_ids:
                add_this = (score, TwitterUser.get_by_id(user_id))
            return_list.append(add_this)
        else:
            break
        i += 1
    return return_list

def pull_some_users(user_id):
    postgres_handle = PostgresHandle(smarttypes.connection_string)
    root_user = TwitterUser.get_by_id(user_id, postgres_handle)
    if not root_user:
        raise Exception('User ID: %s not in our DB!' % user_id)
    if not root_user.credentials:
        raise Exception('%s does not have api credentials!' % root_user.screen_name)
    api_handle = root_user.credentials.api_handle
    root_user = load_user_and_the_people_they_follow(api_handle, root_user.id, postgres_handle, is_root_user=True)
    load_this_user_id = root_user.get_id_of_someone_in_my_network_to_load()
    while load_this_user_id:
        load_user_and_the_people_they_follow(api_handle, load_this_user_id, postgres_handle)
        load_this_user_id = root_user.get_id_of_someone_in_my_network_to_load()
        #load_this_user_id = None
    print "Finished loading all related users for %s!" % root_user.screen_name

def node_details(req, session, postgres_handle):
    twitter_user, in_links, out_links = None, [], []
    if 'node_id' in req.params and 'reduction_id' in req.params:
        reduction = TwitterReduction.get_by_id(req.params['reduction_id'], postgres_handle)
        twitter_user = TwitterUser.get_by_id(req.params['node_id'], postgres_handle)
        if twitter_user:
            in_links, out_links = reduction.get_in_and_out_links_for_user(
                req.params['node_id'])
    return {
        'template_path': 'social_map/node_details.html',
        'twitter_user': twitter_user,
        'in_links': in_links,
        'out_links': out_links,
    }

def index(req, session, postgres_handle):
    root_user = None
    if 'user_id' in req.params:
        root_user = TwitterUser.get_by_id(req.params['user_id'], postgres_handle)
    if not root_user:
        root_user = TwitterUser.by_screen_name('SmartTypes', postgres_handle)
    reduction = TwitterReduction.get_latest_reduction(root_user.id, postgres_handle)
    if not reduction:
        root_user = TwitterUser.by_screen_name('SmartTypes', postgres_handle)
        reduction = TwitterReduction.get_latest_reduction(root_user.id, postgres_handle)
    return {
        'active_tab': 'social_map',
        'template_path': 'social_map/index.html',
        'root_user': root_user,
        'reduction': reduction,
        'num_groups': len(TwitterGroup.all_groups(reduction.id, postgres_handle)),
        'users_with_a_reduction': TwitterReduction.get_users_with_a_reduction(postgres_handle),
    }

def load_user_and_the_people_they_follow(creds, user_id, postgres_handle):
    remaining_hits_threshold = 10
    api_handle = creds.api_handle
    root_user = creds.root_user
    is_root_user = False
    if root_user.id == user_id:
        is_root_user = True
    # if is_root_user and 'root_user.is_fake_user':
    #     return None
    remaining_hits, reset_time = get_rate_limit_status(api_handle)
    if remaining_hits < remaining_hits_threshold:
        raise Exception("%s: remaining_hits less than threshold!" % root_user.screen_name)
    try:
        api_user = api_handle.get_user(user_id=user_id)
    except TweepError, ex:
        print "%s: api_handle.get_user(%s) got a TweepError %s" % (
            root_user.screen_name, user_id, ex)
        if ('Sorry, that page does not exist' in str(ex)
                or 'User has been suspended' in str(ex)):
            print 'setting caused_an_error'
            model_user = TwitterUser.get_by_id(user_id, postgres_handle)
            if not model_user:
                properties = {'id': user_id, 'screen_name': user_id}
                model_user = TwitterUser(postgres_handle=postgres_handle, **properties)
                model_user.save()
                postgres_handle.connection.commit()
            model_user.caused_an_error = datetime.now()
            model_user.save()
            postgres_handle.connection.commit()
        return None

        membership_scores.append((A[i][j] * A[j][i], j))
    group_adjacency.append(membership_scores)
index_to_twitter_id_dict = pickle.load(open('index_to_twitter_id.pickle', 'r'))
user_group_map = {}
TwitterGroup.bulk_delete('all')
for i in range(num_features):
    membership_scores = []
    for j in range(num_items):
        user_id = index_to_twitter_id_dict[j]
        follower_score = users_data[i][j]
        following_score = items_data[i][j]
        membership_score = following_score * following_score
        if membership_score > .001:
            membership_scores.append((membership_score, user_id))
            if user_id not in user_group_map:
                user_group_map[user_id] = [(membership_score, i)]
            else:
                user_group_map[user_id].append((membership_score, i))
    TwitterGroup.upsert_group(i, membership_scores, group_adjacency[i])
print "Done creating groups."
TwitterUser.bulk_update('all', {'scores_groups': None})
i = 0
for user_id, scores_groups in user_group_map.items():
    twitter_user = TwitterUser.get_by_id(user_id)
    twitter_user.scores_groups = scores_groups
    twitter_user.save()
    if i % 1000 == 0:
        print "Done with %s users." % i
    i += 1

TwitterGroup.bulk_delete('all')
for i in range(num_features):
    membership_scores = []
    for j in range(num_items):
        user_id = index_to_twitter_id_dict[j]
        follower_score = users_data[i][j]
        following_score = items_data[i][j]
        membership_score = following_score * following_score
        if membership_score > .001:
            membership_scores.append((membership_score, user_id))
            if user_id not in user_group_map:
                user_group_map[user_id] = [(membership_score, i)]
            else:
                user_group_map[user_id].append((membership_score, i))
    TwitterGroup.upsert_group(i, membership_scores, group_adjacency[i])
print "Done creating groups."
TwitterUser.bulk_update('all', {'scores_groups': None})
i = 0
for user_id, scores_groups in user_group_map.items():
    twitter_user = TwitterUser.get_by_id(user_id)
    twitter_user.scores_groups = scores_groups
    twitter_user.save()
    if i % 1000 == 0:
        print "Done with %s users." % i
    i += 1

def twitter_user(self):
    from smarttypes.model.twitter_user import TwitterUser
    if not self.twitter_id:
        return None
    return TwitterUser.get_by_id(self.twitter_id, self.postgres_handle)

def root_user(self):
    from smarttypes.model.twitter_user import TwitterUser
    return TwitterUser.get_by_id(self.root_user_id, self.postgres_handle)

def root_user(self):
    from smarttypes.model.twitter_user import TwitterUser
    if not self.root_user_id:
        return None
    return TwitterUser.get_by_id(self.root_user_id, self.postgres_handle)

if not len(sys.argv) > 1:
    raise Exception('Need a twitter handle.')
else:
    screen_name = sys.argv[1]
if smarttypes.config.IS_PROD:
    start_here = datetime.now()
else:
    start_here = datetime(2012, 8, 1)
root_user = TwitterUser.by_screen_name(screen_name, postgres_handle)
distance = 45000 / len(root_user.following[:5000])
#distance = 0
network = TwitterUser.get_rooted_network(root_user, postgres_handle, start_here=start_here, distance=distance, go_back_this_many_weeks=15)
print "writing %s nodes to disk" % len(network)
g = reduce_graph.get_igraph_graph(network)
lang_names = []
loc_names = []
for node_id in g.vs['name']:
    user = TwitterUser.get_by_id(node_id, postgres_handle)
    lang_names.append(user.lang.encode('ascii', 'ignore'))
    loc_names.append(user.location_name.encode('ascii', 'ignore'))
g.vs['lang_name'] = lang_names
g.vs['loc_name'] = loc_names
reduce_graph.write_to_graphml_file(root_user, g, network)
# print "mk_user_csv took %s to execute" % (datetime.now() - start_time)