def computeGraphMetrics(GRep, GModel):
    nddDiff = euclidDist(ndDist(GRep), ndDist(GModel))
    knnDiff = euclidDist(knnDist(GRep), knnDist(GModel))
    dkDiff = euclidDist(dkDist(GRep), dkDist(GModel))
    ccDiff = euclidDist(ccDist(GRep), ccDist(GModel))
    ASVals = (ASCoeff(GRep), ASCoeff(GModel))
    MxWccVals = (snap.GetMxWcc(GRep).GetNodes(),
                 snap.GetMxWcc(GModel).GetNodes())
    effDVals = (snap.GetBfsEffDiam(GRep, 1000, False),
                snap.GetBfsEffDiam(GModel, 1000, False))
    return nddDiff, knnDiff, dkDiff, ccDiff, ASVals, MxWccVals, effDVals
def mean_shortest_path_length(self):
    """
    Measurement: mean_shortest_path_length

    Description: Despite the name, return the effective diameter, i.e. the
        90th percentile of the shortest path length distribution
    Input: Graph
    Output: Effective diameter (0 for an empty graph)
    """
    if SNAP_LOADED:
        if self.gUNsn.Empty():
            warnings.warn('Empty graph')
            return 0
        return sn.GetBfsEffDiam(self.gUNsn, self.gUNsn.GetNodes(), False)
    else:
        if ig.Graph.vcount(self.gUNig) == 0:
            warnings.warn('Empty graph')
            return 0
        shortest_paths = self.gUNig.shortest_paths_dijkstra(mode='ALL')
        shortest_paths_cleaned = np.array(shortest_paths).flatten()
        # Drop infinite distances (unreachable node pairs).
        shortest_paths_cleaned = shortest_paths_cleaned[np.isfinite(
            shortest_paths_cleaned)]
        return np.percentile([float(x) for x in shortest_paths_cleaned], 90)
def analyze_diameter_timestamp(graph):
    print("Analyzing effective diameter...")
    # hyperedge_cnt[k] holds the index of the last edge of the k-th timestamp.
    hyperedge_cnt = [0]
    for i in range(1, graph.number_of_edges()):
        if graph.edges[i][0] != graph.edges[i-1][0]:
            hyperedge_cnt.append(hyperedge_cnt[-1])
        hyperedge_cnt[-1] = i
    total_duration = len(hyperedge_cnt)
    timestamps = []
    diams = []
    with open("../results/{}_diameter.txt".format(graph.datatype), "w") as f:
        for i in range(total_duration):
            idx_ub = hyperedge_cnt[i]+1
            pg = project2(graph, idx_ub)
            _x = graph.edges[idx_ub-1][0]
            _y = snap.GetBfsEffDiam(pg, 4000 if i < 50 else 1000, False)
            f.write(f"{_x} {_y}\n")
            timestamps.append(_x)
            diams.append(_y)
            f.flush()
    plt.figure()
    plt.plot(timestamps, diams, 'ro')
    plt.xlabel("Time (year)")
    plt.ylabel("Effective Diameter")
    plt.title("Effective Diameter")
    plt.savefig("../plots/{}_diameter.png".format(graph.datatype), dpi=300)
def getBasicInfo(strPath, net):
    G = snap.LoadEdgeList(snap.PUNGraph, strPath, 0, 1)
    GraphInfo = {}
    GraphInfo['nodes'] = G.GetNodes()
    GraphInfo['edges'] = G.GetEdges()
    GraphInfo['zeroDegNodes'] = snap.CntDegNodes(G, 0)
    GraphInfo['zeroInDegNodes'] = snap.CntInDegNodes(G, 0)
    GraphInfo['zeroOutDegNodes'] = snap.CntOutDegNodes(G, 0)
    GraphInfo['nonZeroIn-OutDegNodes'] = snap.CntNonZNodes(G)
    GraphInfo['uniqueDirectedEdges'] = snap.CntUniqDirEdges(G)
    GraphInfo['uniqueUndirectedEdges'] = snap.CntUniqUndirEdges(G)
    GraphInfo['selfEdges'] = snap.CntSelfEdges(G)
    GraphInfo['biDirEdges'] = snap.CntUniqBiDirEdges(G)
    NTestNodes = 10
    IsDir = False
    # NOTE: despite the key name, this stores the BFS-sampled effective
    # diameter (GetBfsEffDiam), not the full diameter.
    GraphInfo['approxFullDiameter'] = snap.GetBfsEffDiam(G, NTestNodes, IsDir)
    GraphInfo['90effectiveDiameter'] = snap.GetAnfEffDiam(G)
    DegToCntV = snap.TIntPrV()
    snap.GetDegCnt(G, DegToCntV)
    sumofNode = G.GetNodes()
    # Sum of degree * node count, divided by the number of nodes.
    L = [item.GetVal1()*item.GetVal2() for item in DegToCntV]
    GraphInfo['averageDegree'] = float(sum(L))/(sumofNode)
    (DegreeCountMax, Degree, DegreeCount, CluDegree, Clu) = getGraphInfo(G)
    # creatNet(G,net)
    return GraphInfo, DegreeCountMax, Degree, DegreeCount, CluDegree, Clu
def diameter_of_the_network():
    sample = int(len(df) * 0.75)  # use 75% of the data
    D = snap.GetBfsFullDiam(G, sample)
    print("Diameter", D)
    ED = snap.GetBfsEffDiam(G, sample)
    print("Effective Diameter", ED)
    All_dis = snap.GetBfsEffDiamAll(G, sample, False)
    print("Average Diameter:", All_dis[3])
def print_effective_diameter(G):
    """
    Prints the approximate effective diameter by sampling 10, 100, 1000 nodes
    in subgraph G.
    Also prints mean and variance of approximate effective diameters obtained.
    """
    d10 = snap.GetBfsEffDiam(G, 10)
    d100 = snap.GetBfsEffDiam(G, 100)
    d1000 = snap.GetBfsEffDiam(G, 1000)
    array = np.array([d10, d100, d1000])
    mean = round(np.mean(array), 4)
    variance = round(np.var(array), 4)
    print("Approximate effective diameter by sampling 10 nodes:", round(d10, 4))
    print("Approximate effective diameter by sampling 100 nodes:", round(d100, 4))
    print("Approximate effective diameter by sampling 1000 nodes:", round(d1000, 4))
    print(f"Approximate effective diameter (mean and variance): {mean},{variance}")
def effect_diameter(self, n_node=100, isDir=False):
    '''
    Returns the (approximation of the) Effective Diameter (90-th percentile
    of the distribution of shortest path lengths) of a graph

    :param n_node: number of nodes to sample as BFS starting points
    :param isDir: whether to treat the graph as directed
    '''
    snap = self.snap
    # Never request more BFS start nodes than the graph contains.
    n_node = min(self.num_nodes, n_node)
    diam = snap.GetBfsEffDiam(self.graph, n_node, isDir)
    return diam
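# The docstring above describes the effective diameter as the 90th percentile
# of shortest path lengths, estimated from BFS runs started at sampled nodes.
# The block below is a minimal pure-Python sketch of that idea. It is not
# SNAP's implementation: the adjacency-dict input, the helper names, the
# exclusion of zero-length paths and the linear interpolation step are all
# assumptions made for illustration.

```python
from collections import deque
import random


def bfs_distances(adj, source):
    """Return hop counts from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def approx_effective_diameter(adj, n_test_nodes, percentile=0.9, seed=0):
    """Estimate the distance within which `percentile` of reachable pairs lie.

    adj is a hypothetical adjacency dict {node: [neighbors]}.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    starts = rng.sample(nodes, min(n_test_nodes, len(nodes)))

    # Histogram: path length -> number of (start, target) pairs.
    hist = {}
    for s in starts:
        for d in bfs_distances(adj, s).values():
            if d > 0:  # assumption: ignore the zero-length path to itself
                hist[d] = hist.get(d, 0) + 1

    total = sum(hist.values())
    if total == 0:
        return 0.0

    target = percentile * total
    cum_prev = 0
    for d in sorted(hist):
        cum = cum_prev + hist[d]
        if cum >= target:
            # Interpolate linearly between distance d - 1 and d.
            return (d - 1) + (target - cum_prev) / hist[d]
        cum_prev = cum
    return float(max(hist))
```

# Sampling fewer start nodes makes the estimate cheaper but noisier, which is
# why several snippets in this collection compare 10, 100 and 1000 samples.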
def analyze_diameter(nodes, edges):
    G = snap.TUNGraph.New()
    # Map arbitrary node labels to consecutive integer ids.
    idxs = {}
    for i, v in enumerate(nodes):
        idxs[v] = i
    for v in nodes:
        G.AddNode(idxs[v])
    for e in edges:
        G.AddEdge(idxs[e[0]], idxs[e[1]])
        G.AddEdge(idxs[e[1]], idxs[e[0]])
    DegToCCfV = snap.TFltPrV()
    print('Clustering Coefficient: ', snap.GetClustCf(G, -1))
    print('Effective Diameter: ',
          snap.GetBfsEffDiam(G, min(len(nodes), 10000), False))
def solve_shortest_path_based_questions(G, GName):
    Fulldiam1 = snap.GetBfsFullDiam(G, 10, False)
    print "Approximate full diameter in {0} with sampling {1} nodes: {2}".format(
        GName[:-10], 10, Fulldiam1)
    Fulldiam2 = snap.GetBfsFullDiam(G, 100, False)
    print "Approximate full diameter in {0} with sampling {1} nodes: {2}".format(
        GName[:-10], 100, Fulldiam2)
    Fulldiam3 = snap.GetBfsFullDiam(G, 1000, False)
    print "Approximate full diameter in {0} with sampling {1} nodes: {2}".format(
        GName[:-10], 1000, Fulldiam3)
    temp = np.array([Fulldiam1, Fulldiam2, Fulldiam3])
    # Use the same GName[:-10] slice as every other message in this function.
    print "Approximate full diameter in {0} with sampling nodes (mean and variance): {1}, {2}".format(
        GName[:-10], np.mean(temp), np.var(temp))

    effdiam1 = snap.GetBfsEffDiam(G, 10, False)
    print "Approximate Effective diameter in {0} with sampling {1} nodes: {2}".format(
        GName[:-10], 10, effdiam1)
    effdiam2 = snap.GetBfsEffDiam(G, 100, False)
    print "Approximate Effective diameter in {0} with sampling {1} nodes: {2}".format(
        GName[:-10], 100, effdiam2)
    effdiam3 = snap.GetBfsEffDiam(G, 1000, False)
    print "Approximate Effective diameter in {0} with sampling {1} nodes: {2}".format(
        GName[:-10], 1000, effdiam3)
    temp = np.array([effdiam1, effdiam2, effdiam3])
    print "Approximate Effective diameter in {0} with sampling nodes (mean and variance): {1}, {2}".format(
        GName[:-10], np.mean(temp), np.var(temp))

    snap.PlotShortPathDistr(G, GName[:-10], GName[:-10] + " - shortest path")
    filename = "diam." + GName[:-10] + ".png"
    print "Shortest path distribution of {0} is in: {1}".format(
        GName[:-10], filename)
def mean_shortest_path_length(self, node_level=False):
    """
    Measurement: mean_shortest_path_length

    Description: Despite the name, return the effective diameter, i.e. the
        90th percentile of the shortest path length distribution
    Input: Graph
    Output: Effective diameter, or a dict of effective diameters keyed by
        graph when node_level is True (0 for an empty graph)
    """
    if SNAP_LOADED:
        if self.gUNsn.Empty():
            warnings.warn('Empty graph')
            return 0
        return sn.GetBfsEffDiam(self.gUNsn, self.gUNsn.GetNodes(), False)
    else:
        if not node_level:
            graphs = {'all': self.gUNig}
        else:
            graphs = self.gUNigs
        meas = {}
        for key, graph in graphs.items():
            if ig.Graph.vcount(graph) == 0:
                warnings.warn('Empty graph')
                return 0
            shortest_paths = graph.shortest_paths_dijkstra(mode='ALL')
            shortest_paths_cleaned = np.array(shortest_paths).flatten()
            # Drop infinite distances (unreachable node pairs).
            shortest_paths_cleaned = shortest_paths_cleaned[np.isfinite(
                shortest_paths_cleaned)]
            meas[key] = np.percentile(
                [float(x) for x in shortest_paths_cleaned], 90)
        if not node_level:
            return meas['all']
        else:
            return meas
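# The function above is named for the mean shortest path length but takes the
# 90th percentile of the finite path lengths. The sketch below contrasts the
# two statistics using only the standard library. The helper names are
# hypothetical; the percentile uses linear interpolation between order
# statistics, which is also numpy's default for np.percentile.

```python
import math


def finite_lengths(lengths):
    """Keep only finite path lengths (drops unreachable pairs)."""
    return [float(x) for x in lengths if math.isfinite(x)]


def mean_length(lengths):
    """Arithmetic mean of the finite path lengths."""
    vals = finite_lengths(lengths)
    if not vals:
        raise ValueError("no finite path lengths")
    return sum(vals) / len(vals)


def percentile_length(lengths, q=90):
    """q-th percentile of the finite path lengths, linearly interpolated."""
    vals = sorted(finite_lengths(lengths))
    if not vals:
        raise ValueError("no finite path lengths")
    pos = (len(vals) - 1) * q / 100.0
    lower = int(math.floor(pos))
    upper = min(lower + 1, len(vals) - 1)
    frac = pos - lower
    return vals[lower] + frac * (vals[upper] - vals[lower])


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5, float("inf")]
    print("mean:", mean_length(sample))
    print("90th percentile:", percentile_length(sample))
```

# The percentile is dominated by the longest paths, so it is usually larger
# than the mean for the same distribution.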
def analyze_diameter_node(graph, total_duration):
    print("Analyzing effective diameter...")
    nodes = []
    diams = []
    with open("../results/{}_diameter.txt".format(graph.datatype), "w") as f:
        idx_ub = 0
        for i in range(total_duration):
            target_node_cnt = ((i+1) * graph.number_of_nodes()) // total_duration
            target_node_time = graph.nodes[target_node_cnt-1][0]
            # Advance to the first edge that appears after the target node.
            while idx_ub < graph.number_of_edges() and graph.edges[idx_ub][0] <= target_node_time:
                idx_ub += 1
            pg = project2(graph, idx_ub)
            _x = pg.GetNodes()
            _y = snap.GetBfsEffDiam(pg, 4000 if i < 50 else 1000, False)
            f.write(f"{_x} {_y}\n")
            nodes.append(_x)
            diams.append(_y)
            f.flush()
    plt.figure()
    plt.plot(nodes, diams, 'ro')
    plt.xlabel("# of nodes")
    plt.ylabel("Effective Diameter")
    plt.title("Effective Diameter")
    plt.savefig("../plots/{}_diameter.png".format(graph.datatype), dpi=300)
@author: Erik G. Larsson, 2018-2019
"""
import sys
# The SNAP directory must be on the path before snap is imported.
sys.path.append("/courses/tsks11/ht2019/snap-4.1.0-4.1-centos6.5-x64-py2.6/")
import snap

# circle network
# Generates and returns a random small-world graph using the Watts-Strogatz model.
# We assume a circle where each node creates links to NodeOutDeg other nodes.
G1 = snap.GenSmallWorld(Nodes=1000, NodeOutDeg=2, RewireProb=0)
# Returns approximate Effective Diameter (90-th percentile of the distribution of shortest
# path lengths) of a graph (by performing BFS from NTestNodes random starting nodes).
print snap.GetBfsEffDiam(Graph=G1, NTestNodes=50, IsDir=False)

# Watts-Strogatz, 0.01
G2 = snap.GenSmallWorld(Nodes=1000, NodeOutDeg=2, RewireProb=0.01)
print snap.GetBfsEffDiam(Graph=G2, NTestNodes=50, IsDir=False)

# Watts-Strogatz, 0.1
G3 = snap.GenSmallWorld(Nodes=1000, NodeOutDeg=2, RewireProb=0.1)
print snap.GetBfsEffDiam(Graph=G3, NTestNodes=50, IsDir=False)

# Amazon network
# Loads a (directed, undirected or multi) graph from a text file InFNm
# with 1 edge per line (whitespace separated columns, int node ids).
G = snap.LoadEdgeList(
    GraphType=snap.PUNGraph,
    InFNm="/courses/TSKS11/ht2019/data_and_fcns/session4/task1/amazon0302.txt",
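def CalculateAveragePathLength(graph):
    # NOTE: GetBfsEffDiam returns the approximate effective diameter (90th
    # percentile of shortest path lengths), not the average path length.
    Num = 100  # number of BFS start nodes to sample
    Dist = snap.GetBfsEffDiam(graph, Num)
    return Dist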
def main():
    parentDir = os.getcwd()
    os.chdir(parentDir + "/subgraphs")
    sub_graph = snap.LoadEdgeList(snap.PUNGraph, sys.argv[1], 0, 1)
    subGraphName = sys.argv[1].split(".")[0]
    os.chdir(parentDir)

    #### 1 ########
    node_count = 0
    for node in sub_graph.Nodes():
        node_count = node_count + 1
    printWithOutNewLine("Number of nodes:", node_count)
    printWithOutNewLine("Number of edges:", snap.CntUniqBiDirEdges(sub_graph))

    #### 2 ########
    printWithOutNewLine("Number of nodes with degree=7:",
                        snap.CntDegNodes(sub_graph, 7))
    rndMaxDegNId = snap.GetMxDegNId(sub_graph)
    nodeDegPairs = snap.TIntPrV()
    snap.GetNodeInDegV(sub_graph, nodeDegPairs)
    maxDegVal = 0
    for pair in nodeDegPairs:
        if (pair.GetVal1() == rndMaxDegNId):
            maxDegVal = pair.GetVal2()
            break
    maxDegNodes = []
    for pair in nodeDegPairs:
        if (pair.GetVal2() == maxDegVal):
            maxDegNodes.append(pair.GetVal1())
    print("Node id(s) with highest degree:", end=" ")
    print(*maxDegNodes, sep=',')

    #### 3 ########
    sampledFullDiam = []
    sampledFullDiam.append(snap.GetBfsFullDiam(sub_graph, 10, False))
    sampledFullDiam.append(snap.GetBfsFullDiam(sub_graph, 100, False))
    sampledFullDiam.append(snap.GetBfsFullDiam(sub_graph, 1000, False))
    sampledFullDiamStats = []
    sampledFullDiamStats.append(round(statistics.mean(sampledFullDiam), 4))
    sampledFullDiamStats.append(round(statistics.variance(sampledFullDiam), 4))
    printWithOutNewLine("Approximate full diameter by sampling 10 nodes:",
                        sampledFullDiam[0])
    printWithOutNewLine("Approximate full diameter by sampling 100 nodes:",
                        sampledFullDiam[1])
    printWithOutNewLine("Approximate full diameter by sampling 1000 nodes:",
                        sampledFullDiam[2])
    print("Approximate full diameter (mean and variance):", end=" ")
    print(*sampledFullDiamStats, sep=',')

    sampledEffDiam = []
    sampledEffDiam.append(round(snap.GetBfsEffDiam(sub_graph, 10, False), 4))
    sampledEffDiam.append(round(snap.GetBfsEffDiam(sub_graph, 100, False), 4))
    sampledEffDiam.append(round(snap.GetBfsEffDiam(sub_graph, 1000, False), 4))
    sampledEffDiamStats = []
    sampledEffDiamStats.append(round(statistics.mean(sampledEffDiam), 4))
    sampledEffDiamStats.append(round(statistics.variance(sampledEffDiam), 4))
    printWithOutNewLine("Approximate effective diameter by sampling 10 nodes:",
                        sampledEffDiam[0])
    printWithOutNewLine(
        "Approximate effective diameter by sampling 100 nodes:",
        sampledEffDiam[1])
    printWithOutNewLine(
        "Approximate effective diameter by sampling 1000 nodes:",
        sampledEffDiam[2])
    print("Approximate effective diameter (mean and variance):", end=" ")
    print(*sampledEffDiamStats, sep=',')

    #### 4 ########
    printWithOutNewLine("Fraction of nodes in largest connected component:",
                        round(snap.GetMxSccSz(sub_graph), 4))
    bridgeEdges = snap.TIntPrV()
    snap.GetEdgeBridges(sub_graph, bridgeEdges)
    printWithOutNewLine("Number of edge bridges:", len(bridgeEdges))
    articulationPoints = snap.TIntV()
    snap.GetArtPoints(sub_graph, articulationPoints)
    printWithOutNewLine("Number of articulation points:",
                        len(articulationPoints))

    #### 5 ########
    printWithOutNewLine("Average clustering coefficient:",
                        round(snap.GetClustCf(sub_graph, -1), 4))
    printWithOutNewLine("Number of triads:", snap.GetTriads(sub_graph, -1))
    randomNodeId = sub_graph.GetRndNId()
    nodeIdCcfMap = snap.TIntFltH()
    snap.GetNodeClustCf(sub_graph, nodeIdCcfMap)
    print("Clustering coefficient of random node", end=" ")
    print(randomNodeId, end=": ")
    print(round(nodeIdCcfMap[randomNodeId], 4))
    print("Number of triads random node", end=" ")
    print(randomNodeId, end=" participates: ")
    print(snap.GetNodeTriads(sub_graph, randomNodeId))
    printWithOutNewLine(
        "Number of edges that participate in at least one triad:",
        snap.GetTriadEdges(sub_graph, -1))

    #### plots ########
    if not os.path.isdir('plots'):
        os.makedirs('plots')
    os.chdir(parentDir + "/plots")
    plotsDir = os.getcwd()
    snap.PlotOutDegDistr(sub_graph, subGraphName,
                         subGraphName + " Subgraph Degree Distribution")
    snap.PlotShortPathDistr(
        sub_graph, subGraphName,
        subGraphName + " Subgraph Shortest Path Lengths Distribution")
    snap.PlotSccDistr(
        sub_graph, subGraphName,
        subGraphName + " Subgraph Connected Components Size Distribution")
    snap.PlotClustCf(
        sub_graph, subGraphName,
        subGraphName + " Subgraph Clustering Coefficient Distribution")

    # Keep only the rendered .png plots.
    files = os.listdir(plotsDir)
    for file in files:
        if not file.endswith(".png"):
            os.remove(os.path.join(plotsDir, file))

    # Rename SNAP's default plot names to descriptive prefixes.
    plots = os.listdir(plotsDir)
    filePrefix = "filename"
    for file in plots:
        nameSplit = file.split(".")
        if (len(nameSplit) == 2):
            continue
        if (nameSplit[0] == "ccf"):
            filePrefix = "clustering_coeff_"
        elif (nameSplit[0] == "outDeg"):
            filePrefix = "deg_dist_"
        elif (nameSplit[0] == "diam"):
            filePrefix = "shortest_path_"
        elif (nameSplit[0] == "scc"):
            filePrefix = "connected_comp_"
        os.rename(file, filePrefix + nameSplit[1] + "." + nameSplit[2])
    os.chdir(parentDir)
for item in InDegV:
    if (item.GetVal2() == max_d):
        nodes.append(item.GetVal1())
print(", ".join(list(map(str, sorted(nodes)))))

dFull_10 = snap.GetBfsFullDiam(G, 10, False)
dFull_100 = snap.GetBfsFullDiam(G, 100, False)
dFull_1000 = snap.GetBfsFullDiam(G, 1000, False)
m_dFull, v_dFull = map(float, MeanAndVariance(dFull_10, dFull_100, dFull_1000))
print("Approximate full diameter by sampling 10 nodes:", dFull_10)
print("Approximate full diameter by sampling 100 nodes:", dFull_100)
print("Approximate full diameter by sampling 1000 nodes:", dFull_1000)
print("Approximate full diameter (mean and variance): %.4f, %.4f" %
      (m_dFull, v_dFull))

dEff_10 = snap.GetBfsEffDiam(G, 10, False)
dEff_100 = snap.GetBfsEffDiam(G, 100, False)
dEff_1000 = snap.GetBfsEffDiam(G, 1000, False)
m_dEff, v_dEff = map(float, MeanAndVariance(dEff_10, dEff_100, dEff_1000))
print("Approximate effective diameter by sampling 10 nodes: %.4f" % dEff_10)
print("Approximate effective diameter by sampling 100 nodes: %.4f" % dEff_100)
print("Approximate effective diameter by sampling 1000 nodes: %.4f" % dEff_1000)
print("Approximate effective diameter (mean and variance): %.4f, %.4f" %
      (m_dEff, v_dEff))

print("Fraction of nodes in largest connected component: %.4f" %
      snap.GetMxSccSz(G))
EdgeV = snap.TIntPrV()
snap.GetEdgeBridges(G, EdgeV)
        index]
    index = index + 1
mean = float(sum(diameter) / 3.0)
# Sample variance over the three estimates (divides by n - 1 = 2).
variance = float((pow((diameter[0] - mean), 2) + pow(
    (diameter[1] - mean), 2) + pow((diameter[2] - mean), 2)) / 2.0)
print "Approx. diameter in " + input_file + " (mean and variance): ", round(
    mean, 3), ", ", round(variance, 3)
print ""

diameter = [0, 0, 0]
index = 0
for i in [10, 100, 1000]:
    diameter[index] = snap.GetBfsEffDiam(Graph1, i, False)
    print "Approx. effective diameter in " + input_file + " with sampling ", i, " nodes: ", round(
        diameter[index], 3)
    index = index + 1
mean = float(sum(diameter) / 3.0)
variance = float((pow((diameter[0] - mean), 2) + pow(
    (diameter[1] - mean), 2) + pow((diameter[2] - mean), 2)) / 2.0)
print "Approx. effective diameter in " + input_file + " (mean and variance): ", round(
    mean, 3), ", ", round(variance, 3)

snap.PlotShortPathDistr(Graph1, "shortest_path_plot_" + input_file,
                        "Undirected graph - shortest path", 1000)
print "Shortest path distribution of " + input_file + " is in: diam.shortest_path_plot_" + input_file + ".png"
source: input DBLP .tsv file
destination: file where the coauthorship network should be stored"""
    exit(1)

srcfile = sys.argv[1]
dstfile = sys.argv[2] if len(sys.argv) >= 3 else None

ringo = ringo.Ringo()

t = testutils.Timer(ENABLE_TIMER)
S = [("Key", "string"), ("Author", "string")]
T = ringo.LoadTableTSV(S, srcfile)
t.show("load")

T = ringo.SelfJoin(T, "Key")
t.show("join")

ringo.GenerateProvenance(T, '01-DBLP-provenance.py')

# TODO: use simpler conventions for column renaming
G = ringo.ToGraph(T, "1_1.Author", "1_2.Author")
t.show("graph")

if dstfile is not None:
    G.Save(snap.TFOut(dstfile))
    t.show("save")

diameter = snap.GetBfsEffDiam(G, N_TEST_NODES)
t.show("diameter (%d test nodes)" % N_TEST_NODES)
print "Diameter: {0:.5f}".format(diameter)
plotRemove("outDeg", "deg_dist", name)

# Question 3
numNodes = [10, 100, 1000]

## Full diameter
fullDia = [sn.GetBfsFullDiam(graph, tNodes) for tNodes in numNodes]
for i in range(3):
    print("Approximate full diameter by sampling {} nodes: {}".format(
        numNodes[i], fullDia[i]))
print(
    "Approximate full diameter (mean and variance): {:.4f} {:.4f}".format(
        np.mean(fullDia), np.var(fullDia)))

## Effective Diameter
effDia = [sn.GetBfsEffDiam(graph, tNodes) for tNodes in numNodes]
for i in range(3):
    print("Approximate effective diameter by sampling {} nodes: {:.4f}".
          format(numNodes[i], effDia[i]))
print("Approximate effective diameter (mean and variance): {:.4f} {:.4f}".
      format(np.mean(effDia), np.var(effDia)))

## Plot Shortest Path Distr
sn.PlotShortPathDistr(graph, name, "Shortest Path Distribution")
plotRemove("diam", "shortest_path", name)

#Question 4
## Max Comp Fraction
MxConCompSize = sn.GetMxScc(graph).GetNodes()
print("Fraction of nodes in largest connected component: {:0.4f}".format(
# -*- coding: utf-8 -*-
"""
Created on Sun Aug 19 11:10:05 2018

Small world experiments

@author: Erik G. Larsson, 2018-2019
"""
import sys
sys.path.append("/courses/tsks11/ht2019/snap-4.1.0-4.1-centos6.5-x64-py2.6/")
import snap

# circle network
G1 = snap.GenSmallWorld(1000, 2, 0)
print snap.GetBfsEffDiam(G1, 50, False)

# Watts-Strogatz, 0.01
G2 = snap.GenSmallWorld(1000, 2, 0.01)
print snap.GetBfsEffDiam(G2, 50, False)

# Watts-Strogatz, 0.1
G3 = snap.GenSmallWorld(1000, 2, 0.1)
print snap.GetBfsEffDiam(G3, 50, False)

# Amazon network
G = snap.LoadEdgeList(
    snap.PUNGraph,
    "/courses/TSKS11/ht2019/data_and_fcns/session4/task1/amazon0302.txt",
    0, 1)
# snap.PrintInfo(G, "amazon", "_amazon-info.txt", False)
print snap.GetBfsEffDiam(G, 50, False)
print("Approximate full diameter by sampling 10 nodes:", full_diameter[-1])
full_diameter.append(snap.GetBfsFullDiam(graph, 100))
print("Approximate full diameter by sampling 100 nodes:", full_diameter[-1])
full_diameter.append(snap.GetBfsFullDiam(graph, 1000))
print("Approximate full diameter by sampling 1000 nodes:", full_diameter[-1])
print("Approximate full diameter (mean and variance): ",
      round(get_mean(full_diameter), 4), ',',
      round(get_variance(full_diameter), 4), sep="")

effective_diameter = []
effective_diameter.append(snap.GetBfsEffDiam(graph, 10))
print("Approximate effective diameter by sampling 10 nodes:",
      round(effective_diameter[-1], 4))
effective_diameter.append(snap.GetBfsEffDiam(graph, 100))
print("Approximate effective diameter by sampling 100 nodes:",
      round(effective_diameter[-1], 4))
effective_diameter.append(snap.GetBfsEffDiam(graph, 1000))
print("Approximate effective diameter by sampling 1000 nodes:",
      round(effective_diameter[-1], 4))
# With sep="" the label needs its own trailing space, as in the full-diameter line.
print("Approximate effective diameter (mean and variance): ",
      round(get_mean(effective_diameter), 4), ',',
      round(get_variance(effective_diameter), 4), sep="")
# # 3a
full_diam_list = []
for i in range(1, 4):
    no_nodes = 10**i
    full_diam = snap.GetBfsFullDiam(Fb_graph, no_nodes, False)
    full_diam_list.append(full_diam)
    print("Approximate full diameter by sampling " + str(no_nodes) +
          " nodes: " + str(round(full_diam, 4)))
mean = sum(full_diam_list) / len(full_diam_list)
# Population variance (divides by n).
res = sum((i - mean)**2 for i in full_diam_list) / len(full_diam_list)
print("Approximate full diameter (mean and variance): " +
      str(round(mean, 4)) + "," + str(round(res, 4)))

eff_diam_list = []
for i in range(1, 4):
    no_nodes = 10**i
    eff_diam = snap.GetBfsEffDiam(Fb_graph, no_nodes, False)
    eff_diam_list.append(eff_diam)
    print("Approximate effective diameter by sampling " + str(no_nodes) +
          " nodes: " + str(round(eff_diam, 4)))
mean = sum(eff_diam_list) / len(eff_diam_list)
res = sum((i - mean)**2 for i in eff_diam_list) / len(eff_diam_list)
print("Approximate effective diameter (mean and variance): " +
      str(round(mean, 4)) + "," + str(round(res, 4)))

snap.PlotShortPathDistr(Fb_graph, "exa", "Directed graph - shortest path")
def diameter(G):
    # Restrict to the largest weakly connected component before sampling.
    G_sub = snap.GetMxWcc(G)
    d = snap.GetBfsEffDiam(G_sub, 100, False)
    return d
def graphStructure(elistName, elistPath):
    """
    Calculate properties of the graph as given in the assignment

    Args:
        elistName (str) -> Input elist name
        elistPath (pathlib.Path) -> Input elist using which graph needs to be built

    Return:
        RESULTS (dict) -> Dictionary containing results for different subparts
            of the assignment
    """
    RESULTS = {}
    subGraph = snap.LoadEdgeList(snap.PUNGraph, elistPath, 0, 1)

    # Part 1 (Size of the network)
    RESULTS['nodeCount'] = subGraph.GetNodes()
    RESULTS['edgeCount'] = subGraph.GetEdges()

    # Part 2 (Degree of nodes in the network)
    maxDegree = 0
    maxDegreeNodes = []
    degree7Count = 0
    for node in subGraph.Nodes():
        if node.GetDeg() == 7:
            degree7Count += 1
        maxDegree = max(maxDegree, node.GetDeg())
    for node in subGraph.Nodes():
        if node.GetDeg() == maxDegree:
            maxDegreeNodes.append(node.GetId())
    plotFilename = f"deg_dist_{elistName}"
    # Since it is an undirected graph, in/out degree is unimportant
    snap.PlotOutDegDistr(subGraph, plotFilename)
    RESULTS['maxDegree'] = maxDegree
    RESULTS['maxDegreeNodes'] = ','.join(map(str, maxDegreeNodes))
    RESULTS['degree7Count'] = degree7Count

    # Part 3 (Paths in the network)
    # Full Diameter Calculation
    fullDiameters = {
        10: snap.GetBfsFullDiam(subGraph, 10, False),
        100: snap.GetBfsFullDiam(subGraph, 100, False),
        1000: snap.GetBfsFullDiam(subGraph, 1000, False)
    }
    fullMean, fullVariance = meanVariance(fullDiameters.values())
    fullDiameters['mean'] = fullMean
    fullDiameters['variance'] = fullVariance
    RESULTS['fullDiameters'] = fullDiameters

    # Effective Diameter Calculation
    effDiameters = {
        10: snap.GetBfsEffDiam(subGraph, 10, False),
        100: snap.GetBfsEffDiam(subGraph, 100, False),
        1000: snap.GetBfsEffDiam(subGraph, 1000, False),
    }
    effMean, effVariance = meanVariance(effDiameters.values())
    effDiameters['mean'] = effMean
    effDiameters['variance'] = effVariance
    RESULTS['effDiameters'] = effDiameters
    plotFilename = f"shortest_path_{elistName}"
    snap.PlotShortPathDistr(subGraph, plotFilename)

    # Part 4 (Components of the network)
    edgeBridges = snap.TIntPrV()
    articulationPoints = snap.TIntV()
    RESULTS['fractionLargestConnected'] = snap.GetMxSccSz(subGraph)
    snap.GetEdgeBridges(subGraph, edgeBridges)
    snap.GetArtPoints(subGraph, articulationPoints)
    RESULTS['edgeBridges'] = len(edgeBridges)
    RESULTS['articulationPoints'] = len(articulationPoints)
    plotFilename = f"connected_comp_{elistName}"
    snap.PlotSccDistr(subGraph, plotFilename)

    # Part 5 (Connectivity and clustering in the network)
    RESULTS['avgClusterCoefficient'] = snap.GetClustCf(subGraph, -1)
    RESULTS['triadCount'] = snap.GetTriadsAll(subGraph, -1)[0]
    nodeX = subGraph.GetRndNId(Rnd)
    nodeY = subGraph.GetRndNId(Rnd)
    RESULTS['randomClusterCoefficient'] = (nodeX, snap.GetNodeClustCf(
        subGraph, nodeX))
    RESULTS['randomNodeTriads'] = (nodeY, snap.GetNodeTriads(subGraph, nodeY))
    RESULTS['edgesTriads'] = snap.GetTriadEdges(subGraph)
    plotFilename = f"clustering_coeff_{elistName}"
    snap.PlotClustCf(subGraph, plotFilename)

    return RESULTS
diam10 = snap.GetBfsFullDiam(UGraph, 10, False)
diam100 = snap.GetBfsFullDiam(UGraph, 100, False)
diam1000 = snap.GetBfsFullDiam(UGraph, 1000, False)
data = [diam10, diam100, diam1000]
m = mean(data)
v = variance(data)
print "Approx. diameter in %s with sampling 10 nodes: %d" % (file, diam10)
print "Approx. diameter in %s with sampling 100 nodes: %d" % (file, diam100)
print "Approx. diameter in %s with sampling 1000 nodes: %d" % (file, diam1000)
# Mean and variance are floats, so %d would truncate them.
print "Approx. diameter in %s (mean and variance): %.4f, %.4f\n" % (file, m, v)

# b) approximate effective diameter
effDiam10 = snap.GetBfsEffDiam(UGraph, 10, False)
effDiam100 = snap.GetBfsEffDiam(UGraph, 100, False)
effDiam1000 = snap.GetBfsEffDiam(UGraph, 1000, False)
effData = [effDiam10, effDiam100, effDiam1000]
em = mean(effData)
ev = variance(effData)
# The effective diameter is a float, so print it with %.4f rather than %d.
print "Approx. effective diameter in %s with sampling 10 nodes: %.4f" % (
    file, effDiam10)
print "Approx. effective diameter in %s with sampling 100 nodes: %.4f" % (
    file, effDiam100)
print "Approx. effective diameter in %s with sampling 1000 nodes: %.4f" % (
    file, effDiam1000)
print "Approx. effective diameter in %s (mean and variance): %.4f, %.4f\n" % (
    file, em, ev)
# Self-join
# >>> table.selfjoin(['Key'])
table = table.SelfJoin("Key")
t.show("join", table)

# Save final table
# >>> table.save('table.tsv')
if dstDir is not None:
    table.SaveSS(os.path.join(dstDir, OUTPUT_TABLE_FILENAME))
    t.show("save edge table", table)

# Create network
# >>> graph = table.graph('Author_1', 'Author_2', directed=False)
# TODO: use simpler conventions for column renaming
table.SetSrcCol("1_2_1.1.Author")
table.SetDstCol("1_2_2.1.Author")
graph = snap.ToGraph(table, snap.aaFirst)
t.show("graph", graph)

if dstDir is not None:
    graph.Save(snap.TFOut(os.path.join(dstDir, OUTPUT_GRAPH_FILENAME)))
    t.show("save graph", graph)

# Print diameter
# >>> print graph.diameter(10000)
diameter = snap.GetBfsEffDiam(graph, N_TEST_NODES)
t.show("diameter (%d test nodes)" % N_TEST_NODES)
print "Diameter: {0:.5f}".format(diameter)
def mean_shortest_path_length(self):
    if self.gUNsn.Empty():
        warnings.warn('Empty graph')
        return None
    return sn.GetBfsEffDiam(self.gUNsn, 500, False)
snap.PlotInDegDistr(Graph1, str, "Degree Distribution")

#3.Paths in the Network
full1 = snap.GetBfsFullDiam(Graph1, 10, False)
full2 = snap.GetBfsFullDiam(Graph1, 100, False)
full3 = snap.GetBfsFullDiam(Graph1, 1000, False)
print("Approximate full diameter by sampling ", 10, " nodes: ", full1)
print("Approximate full diameter by sampling ", 100, " nodes: ", full2)
print("Approximate full diameter by sampling ", 1000, " nodes: ", full3)
fmean = (full1 + full2 + full3) / 3.0
# Population variance computed as E[x^2] - (E[x])^2.
fvar = (((full1 * full1) + (full2 * full2) +
         (full3 * full3)) / 3.0) - (fmean * fmean)
print("Approximate full diameter (mean and variance): %0.4f,%0.4f" %
      (fmean, fvar))

eff1 = snap.GetBfsEffDiam(Graph1, 10, False)
eff2 = snap.GetBfsEffDiam(Graph1, 100, False)
eff3 = snap.GetBfsEffDiam(Graph1, 1000, False)
print("Approximate effective diameter by sampling ", 10,
      " nodes: %0.4f" % eff1)
print("Approximate effective diameter by sampling ", 100,
      " nodes: %0.4f" % eff2)
print("Approximate effective diameter by sampling ", 1000,
      " nodes: %0.4f" % eff3)
effmean = (eff1 + eff2 + eff3) / 3.0
effvar = (((eff1 * eff1) + (eff2 * eff2) +
           (eff3 * eff3)) / 3.0) - (effmean * effmean)
print("Approximate effective diameter (mean and variance): %0.4f,%0.4f" %
      (effmean, effvar))
str1 = 'shortest_path_' + file_name
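# The snippet above computes the variance as E[x^2] - (E[x])^2, which is the
# population variance (divide by n). Other snippets in this collection call
# statistics.variance or divide by 2 for three values, which is the sample
# variance (divide by n - 1). The sketch below shows that the two differ for
# the same three estimates. The function names are hypothetical and the input
# values are made up for illustration, not measured diameters.

```python
import statistics


def population_variance(values):
    """Variance as E[x^2] - (E[x])^2, dividing by n."""
    n = float(len(values))
    mean = sum(values) / n
    return sum(v * v for v in values) / n - mean * mean


def sample_variance(values):
    """Unbiased variance estimate, dividing by n - 1."""
    n = len(values)
    mean = sum(values) / float(n)
    return sum((v - mean) ** 2 for v in values) / (n - 1)


if __name__ == "__main__":
    estimates = [4, 5, 6]  # made-up diameter estimates
    print("population:", population_variance(estimates))
    print("sample:", sample_variance(estimates))
    # The standard library offers both conventions as well.
    print("statistics.pvariance:", statistics.pvariance(estimates))
    print("statistics.variance:", statistics.variance(estimates))
```

# With only three estimates the sample variance is 1.5 times the population
# variance, so results from snippets using different conventions are not
# directly comparable.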
# [3] Paths in the network
full_diam = []
for num_test_nodes in [10, 100, 1000]:
    d = snap.GetBfsFullDiam(G, num_test_nodes, False)
    full_diam.append(d)
    print("Approximate full diameter by sampling {} nodes: {}".format(
        num_test_nodes, d))
print("Approximate full diameter (mean and variance): {}, {}".format(
    round(statistics.mean(full_diam), 4),
    round(statistics.variance(full_diam), 4)))

eff_diam = []
for num_test_nodes in [10, 100, 1000]:
    d = snap.GetBfsEffDiam(G, num_test_nodes, False)
    eff_diam.append(d)
    print("Approximate effective diameter by sampling {} nodes: {}".format(
        num_test_nodes, round(d, 4)))
print("Approximate effective diameter (mean and variance): {}, {}".format(
    round(statistics.mean(eff_diam), 4),
    round(statistics.variance(eff_diam), 4)))

# Get Shortest Path Distribution
shortest_path_dist = {}
for NI in G.Nodes():
    NIdToDist = snap.TIntH()
    shortestPath = snap.GetShortPath(G, NI.GetId(), NIdToDist)
    for item in NIdToDist:
        if NIdToDist[item] in shortest_path_dist:
while (i <= 1000):
    diam = snap.GetBfsFullDiam(fbsgel, i, False)
    print("Approximate full diameter by sampling", i, "nodes:", round(diam, 4))
    i *= 10
    average += diam
    variance += (diam * diam)
average /= 3
# Population variance computed as E[x^2] - (E[x])^2.
variance = (variance / 3) - average * average
print("Approximate full diameter(mean and variance): %0.4f,%0.4f" %
      (average, variance))

#b
i = 10
average = 0.0
variance = 0.0
while (i <= 1000):
    diam = snap.GetBfsEffDiam(fbsgel, i, False)
    print("Approximate effective diameter by sampling", i, "nodes:",
          round(diam, 4))
    i *= 10
    average += diam
    variance += (diam * diam)
average /= 3
variance = (variance / 3) - average * average
print("Approximate effective diameter(mean and variance): %0.4f,%0.4f" %
      (average, variance))

#c Plot
snap.PlotShortPathDistr(fbsgel, "shortest_path_" + str(subgraph_name),
                        "shortest_path_" + str(subgraph_name))

#Q4
#a
        snap.GetBfsFullDiam(p2p_gnutella04_subgraph, 10),
        snap.GetBfsFullDiam(p2p_gnutella04_subgraph, 100),
        snap.GetBfsFullDiam(p2p_gnutella04_subgraph, 1000)
    ]
    print "Approximate full diameter in p2p-Gnutella04-subgraph with sampling nodes(mean and variance):" + str(
        round(statistics.mean(value), 2)) + "," + str(
            round(statistics.variance(value), 2))

# Task 1.2.3.2
if (sub_graph_name == "soc-Epinions1-subgraph"):
    # Calculating the effective diameter
    print "Approximate Effective diameter in soc-Epinions1-subgraph with sampling 10 nodes: " + str(
        round(snap.GetBfsEffDiam(soc_epinions1_subgraph, 10, v1, True)[0], 3))
    print "Approximate Effective diameter in soc-Epinions1-subgraph with sampling 100 nodes: " + str(
        round(snap.GetBfsEffDiam(soc_epinions1_subgraph, 100, v1, True)[0], 3))
    print "Approximate Effective diameter in soc-Epinions1-subgraph with sampling 1000 nodes: " + str(
        round(
            snap.GetBfsEffDiam(soc_epinions1_subgraph, 1000, v1, True)[0], 3))
    value_new = [
        snap.GetBfsEffDiam(soc_epinions1_subgraph, 10, v1, True)[0],
        snap.GetBfsEffDiam(soc_epinions1_subgraph, 100, v1, True)[0],
        snap.GetBfsEffDiam(soc_epinions1_subgraph, 1000, v1, True)[0]
    ]
    print "Approximate Effective diameter in soc-Epinions1-subgraph with sampling nodes(mean and variance):" + str(
        round(statistics.mean(value_new), 3)) + "," + str(
            round(statistics.variance(value_new), 4))