[ACE] Graphs

Graph


Definition of a graph

A graph is a set of nodes, or vertices, connected by edges.
Graphs can be
  • undirected - edges don’t have direction
  • directed - edges have direction
Undirected graphs can be represented as directed graphs where for each edge (X,Y) there is a corresponding edge (Y,X).
Graphs can also be
  • unweighted
  • weighted - edges have weights
  • enter image description here

Notation

  • Set V of vertices (nodes)
  • Set E of edges (E⊂V*V)
  • Node B is adjacent to A if there is an edge from A to B.(A —–>B)
  • enter image description here

Paths and reachability

A path from A to B is a sequence of vertices A1,…,An such that there is an edge from A to A1, from A1 to A2,… from An to B
Vertex B is reachable from A if there is a path from A to B

Other Terminology

  • A cycle is a path from a vertex to itself
  • Graph is acyclic if it does not have cycles
  • Graph is connected if there is a path between every pair of vertices
  • Graph is strongly connected if there is a path in both directions between every pair of vertices

how to implement a graph

As with lists, there are several approaches, but most common options are:
  • static indexed data structure –Adjacency matrix
  • dynamic data structure – Adjacency lists

    Adjacency matrix

  • Store node in an array: each node is associated with an integer(array index)
  • Represent information about the edges using a two dimensional array, where array[i][j] == 1 , iff there is an edge from node with index i to the node with index j.
  • For weighted graphs, place weights in the matrix(if there is no edge we use a value which can’t be confused with a weight, e.g. -1 or Integer.MAX_VALUE

Disadvantages of adjacency matrices

  • sparse graphs with few edges for number that are possible result in many zero entries in adjacency matrix
  • This wastes space and makes many algorithms less efficient
  • Also, if the number of nodes in the graph may change, matrix representation is too inflexible, especially if we don’t know the maximal size of the graph

Adjacency list

  • For every vertex, keep a list of adjacent vertices
  • Keep a list of vertices, or keep vertices in a Map (e.g. HashMap) as keys and lists of vertices as values
  • enter image description here

BFS and DFS

Graph traversals

Generally have three sets of nodes
1. Nodes that have not yet been discovered
2. “Working Set” - nodes we are currently processing in some way
3. Nodes that we have finished with

BFS

enter image description here

DFS

enter image description here
code here

Space complexity of BFS and DFS

For a general graph
  • Need a queue/stack of size |V|(the number of vertices)
  • Space complexity O(V)

If the graph is a tree

  • In DFS the stack will be the height, this can be as good as O(logn)
  • In BFS we still need to store all nodes of a level, hence is still O(n)
  • Hence in trees, DFS can be a lot more space efficient than BFS

Time complexity of BFS and DFS

  • In terms of the number of vertices V: two nested loops over V, hence at worst O(V^2).
  • More useful complexity estimate is in terms of the number of edges(usually, the number of edges is much less than V^2)
  • BFS time complexity : Gives a O(|V|+|E|) time complexity
  • Each vertex is enqueued and dequeued at most once. Scanning for all adjacent vertices takes O(|E|) time, since sum of lengths of adjacency lists is |E|.
  • DFS time complexity : Each vertex is pushed on the stack and popped at most once, For every vertex we check what the next unvisited neighbour is, so it is O(|V|+|E|) again

detect cycles in directed graph by DFS

  • Instead of visited and unvisited, use three “colors”
  • white = unvisited (not done)
  • grey = on the stack (half done)
  • black = finished
    enter image description here

评论

此博客中的热门博文

[MLE] W2 Multivariate linear regression

[MLE] W1 Introduction

[AIM] MetaHeuristics