Answer by Alex Kritchevsky:

It's the discrete analogue of the Laplacian operator [math] \nabla^{2} [/math] (the multivariable second derivative, if you like). If you took a function and approximated it with a graph whose nodes are denser where the function's second derivative is greater (that is, where it has bumps that put more "surface area" over the same part of its domain), those denser regions would have more edges in the discrete graph.

Suppose you had a long line of nodes, each connected to its neighbors, along with its Laplacian matrix, and you took the density of nodes to infinity – that is, you squeezed more and more nodes in until there was one at every x value on the line and the Laplacian matrix had become "infinitely long", whatever that means. The Laplacian matrix is then identical everywhere: it has a 2 for every diagonal element and a -1 for every element next to the diagonal on either side.
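Here's a minimal numpy sketch of that line of nodes, built as [math] D - A [/math] on a path graph (the node count n is an arbitrary choice for illustration):

```python
import numpy as np

# A path graph on n nodes, each node connected to its immediate neighbors.
n = 6
A = np.zeros((n, n))
for k in range(n - 1):
    A[k, k + 1] = A[k + 1, k] = 1   # edge between neighbors k and k+1

D = np.diag(A.sum(axis=1))          # degree matrix
L = D - A                           # the Laplacian

print(L)
# Interior rows look like ... -1  2 -1 ..., exactly the tridiagonal
# pattern described above (the two endpoints have degree 1, not 2).
```

The endpoints break the pattern because a finite line has a boundary; on the "infinitely long" version every row looks like an interior row.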

If you had a map of "values" associated with every node on this line, [math] f_{k} [/math], then when you made it infinitely dense you'd have turned it into a function [math] f(x) [/math]. But imagine it instead as an infinitely long vector [math] ( \ldots , f_{k-1}, f_{k} , f_{k+1}, \ldots)^{T} [/math] – note that the indices aren't really well defined when the line is infinitely dense, but this sort of works anyway (it "can be made rigorous", as they say, but that doesn't help with what I'm trying to show). Now if you multiplied the infinite Laplacian matrix by this infinite vector you would get a long list of terms of the form [math] v_{k} = 2f_{k} - f_{k-1} - f_{k+1} [/math].
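You can check that matrix-times-vector picture numerically; a small sketch, assuming numpy (the sampled values here are an arbitrary choice):

```python
import numpy as np

# Build the path-graph Laplacian, multiply it by a vector of node values f,
# and check that each interior entry is v_k = 2*f[k] - f[k-1] - f[k+1].
n = 8
A = np.zeros((n, n))
for k in range(n - 1):
    A[k, k + 1] = A[k + 1, k] = 1
L = np.diag(A.sum(axis=1)) - A

f = np.sin(np.linspace(0, 3, n))      # any values on the nodes will do
v = L @ f

k = 4                                  # an interior node
print(v[k], 2*f[k] - f[k-1] - f[k+1])  # the two agree
```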

This is the (negative of the) (numerator of the) "finite difference approximation" of the second derivative. I'll show that from the other direction. If you have a function [math] f(x) [/math] sampled at discrete points [math] f(x + n\Delta x) [/math] (that is, at evenly spaced intervals) you can approximate the first derivative with [math] f'(x) \approx \frac{f(x+\Delta x)-f(x)}{\Delta x} [/math] or [math] f'(x) \approx \frac{f(x)-f(x-\Delta x)}{\Delta x} [/math]. Both of those are "valid" first derivative approximations – they're sort of derivatives on each side of our point [math]x[/math], and they would be "next to" each other in the derivative vector. If you let [math] \Delta x \rightarrow 0 [/math] you would get the limit definition of the derivative that you learned in single-variable calculus out of either of them (one's the limit from the left and one's from the right).
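A quick numerical sketch of the two one-sided approximations (sin is an arbitrary test function; its derivative at 1 is cos(1)):

```python
import numpy as np

# Both one-sided difference quotients approximate f'(x),
# and both converge to the true derivative as dx shrinks.
f = np.sin
x = 1.0
for dx in (0.1, 0.01, 0.001):
    forward = (f(x + dx) - f(x)) / dx
    backward = (f(x) - f(x - dx)) / dx
    print(dx, forward, backward)   # both approach cos(1)
```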

If you took the derivative of that series – by subtracting adjacent elements – you get [math] f''(x) \approx \frac{ \frac{f(x+\Delta x)-f(x)}{\Delta x} - \frac{f(x)-f(x-\Delta x)}{\Delta x} }{\Delta x } [/math], which is [math] \frac{f(x+\Delta x) - 2f(x) + f(x-\Delta x)}{(\Delta x)^{2}}[/math].

If you labeled [math] f(x) = f_{k} [/math] and [math] f_{k+1} = f(x+\Delta x) [/math], etc., then its numerator is the negative of the Laplacian operator's version: [math] f''(x) \approx \frac{-v_{k}}{(\Delta x)^{2}} [/math]. Remember, [math] v_{k} [/math] came from the matrix version above, not from the approximation I just used. But even still: if you take [math] \Delta x [/math] to zero it becomes the "second derivative" – really, actually, the second derivative from 1-D calculus.
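A numerical sketch of that limit (f = sin is an arbitrary test function; its second derivative is -sin):

```python
import numpy as np

# The central difference (f(x+dx) - 2 f(x) + f(x-dx)) / dx^2 converges
# to f''(x); its numerator is -v_k from the matrix picture above.
f = np.sin                      # f''(x) = -sin(x)
x = 1.0
for dx in (0.1, 0.01, 0.001):
    approx = (f(x + dx) - 2*f(x) + f(x - dx)) / dx**2
    print(dx, approx)           # approaches -sin(1)
```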

So, the "general" Laplacian matrix is the negative numerator of the discretify'd second derivative operation.

Wikipedia's article Laplacian matrix mentions "normalizing" the Laplacian matrix by dividing every matrix element by [math]\sqrt{ d_{i}}\sqrt{d_{j}}[/math]. Note that this applies to the diagonal too: you divide each diagonal entry [math]d_{i}[/math] by [math]\sqrt{d_{i}}\sqrt{d_{i}} = d_{i}[/math], which gives 1. This is sorta like dividing by [math] (\Delta x)^{2} [/math], I believe, which means the normalized matrix is the "negative second derivative operator", numerator and denominator both. That means the "degree" is something like the "distance from nearby nodes", which makes sense – we might imagine nodes with higher degrees accounting for a larger area of the "infinite surface" version of the graph, so that their neighbors are further away and the distances between them are larger. Though that's definitely hand-waving on my part.
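A sketch of that normalization on a small arbitrary graph, assuming numpy:

```python
import numpy as np

# Symmetric normalization: divide entry (i, j) by sqrt(d_i) * sqrt(d_j).
# Diagonal entries become d_i / (sqrt(d_i) * sqrt(d_i)) = 1
# (for nodes with at least one edge).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d = A.sum(axis=1)
L = np.diag(d) - A
L_norm = L / np.sqrt(np.outer(d, d))

print(np.diag(L_norm))   # all ones
```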

Also, the notation I used at the top, [math] \nabla^{2} [/math], is how you write the "coordinate-invariant second derivative" in any number of dimensions, without going into too much detail (take a multivariable calculus class! – actually, no, take a Riemannian geometry class, but it should be in multivariable calculus classes). It's the "divergence of the gradient", which is, of course, useless as an explanation. I only showed this for one-dimensional functions, but I think it's credible, once you get that, that it applies to higher-dimensional functions too. Note that in regular calculus your 2-D coordinates form an infinite grid, etc. But graphs are much more general – they can be connected in ways that can't necessarily be laid out flat without loops. In this sense the matrix is really approximating the Laplacian on a knotted manifold (gods!) rather than a plane, but for your purposes it's the same thing.
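To see the grid case concretely, here's a sketch that builds the Laplacian of a small square grid graph; the interior rows reproduce the classic five-point stencil for [math] \nabla^{2} [/math] (the grid size is an arbitrary choice):

```python
import numpy as np

# An n x n grid graph: an interior node has degree 4, and its Laplacian
# row is the five-point stencil: 4 on the node, -1 on each neighbor.
n = 5
N = n * n
idx = lambda i, j: i * n + j          # flatten (row, col) to one index
A = np.zeros((N, N))
for i in range(n):
    for j in range(n):
        if i + 1 < n:                 # edge to the node below
            A[idx(i, j), idx(i + 1, j)] = A[idx(i + 1, j), idx(i, j)] = 1
        if j + 1 < n:                 # edge to the node to the right
            A[idx(i, j), idx(i, j + 1)] = A[idx(i, j + 1), idx(i, j)] = 1
L = np.diag(A.sum(axis=1)) - A

center = idx(2, 2)                    # an interior node
print(L[center, center])              # 4, with -1 at the four neighbors
```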

Thinking of the Laplacian matrix as the "second derivative operator" means: if you multiply it by a vector (which is basically a function defined at every point on the graph) it gives you something like the "curvature" (remember that your second derivative is how much the first derivative is changing, so whether it's 'concave up' or 'concave down' rather than sloping up or down).
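A sketch of that curvature reading: sample [math] f(x) = x^{2} [/math], whose second derivative is the constant 2, on a path graph with unit spacing; the interior entries of [math] Lf [/math] come out to the constant -2, i.e. [math] -f'' \cdot (\Delta x)^{2} [/math]:

```python
import numpy as np

# Path-graph Laplacian applied to a sampled parabola: the result is the
# (negated, scaled) second derivative at each interior node.
n = 7
A = np.zeros((n, n))
for k in range(n - 1):
    A[k, k + 1] = A[k + 1, k] = 1
L = np.diag(A.sum(axis=1)) - A

x = np.arange(n, dtype=float)
f = x**2                 # f'' = 2 everywhere
print((L @ f)[1:-1])     # [-2. -2. -2. -2. -2.]
```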

Writing this has left me with a ton of things I want to explore about this analogy. I couldn't find a good source for this stuff so I'm sure I missed some important interpretation but maybe this helps.

—————-

Oh! Oh! More comments. In response to this part of your question:

Both of these I can easily understand. But they seem to express such different aspects of a graph, how is it meaningful to look at their difference?

The answer is that they do express the same thing in a graph – the same units, so to speak – it's just non-obvious that they do. Note that the matrix's elements add up to zero. That is, the "2" or whatever, on the degree, is counterbalanced by the "-1"s for each of its outgoing edges. Remember how we got to it, though: we first took the difference of adjacent elements and then we took the difference of those. You could imagine an edge between two nodes as a pair of arrows, one pointing each way, which together make a closed loop. Then the second derivative is a loop that goes out-in in both directions. So the "+2" in the middle is the two "out" arrows and the "-1"s are the "in" arrows. (By the way, all derivatives are like this: an x-y derivative is 4 arrows going around a square.) So, the degree of a node is a count of out arrows; the adjacency elements are a count of in arrows; and the Laplacian adds them together to get the total count of second-order arrows, which might satisfy your objection to [math] D - A[/math].

Also, this interpretation of the Laplacian matrix as a finite derivative can be used to understand the properties of the Laplacian better.
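The elements-add-to-zero observation is easy to check; a sketch on an arbitrary small graph:

```python
import numpy as np

# The degree on the diagonal is exactly counterbalanced by the -1's
# for the node's edges, so every row of L sums to zero, and
# L @ (1, 1, ..., 1)^T is the zero vector.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [1, 1, 0, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

print(L.sum(axis=1))          # all zeros
print(L @ np.ones(4))         # all zeros
```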

The Wikipedia article gives a list of properties.

- The Laplacian is positive-semidefinite, that is, all of its eigenvalues satisfy [math] \lambda_{i} \geq 0[/math], with the least one 0. This corresponds to the negative second derivative having nonnegative eigenvalues – so [math] \nabla^{2} f = -\lambda f [/math] holds. If you know differential equations, this means its eigenfunctions are sines and cosines, aka a Fourier series. Off the top of my head, this probably means that you can take a sort of "dual graph" of a finite graph which is something like its Fourier transform, whose nodes are linear combinations of the nodes in your original graph – the eigenvectors of the graph's Laplacian. That sounds cool. And the statement that the eigenvalues are positive might correspond to the fact that you can't have an unbounded function (an exponential) on a finite graph, while on infinite graphs you can?
- The number of 0 eigenvalues is the number of connected components. This makes sense: each connected component forms a "block" in the Laplacian matrix that only has edges within itself (a "block diagonal form", as in representation theory). Each block is the Laplacian of a small connected graph and it has one zero eigenvalue, so the number of zeros is the number of blocks is the number of connected components. The eigenvector for this 0 eigenvalue is constant on all of the nodes in the component: [math] (1,1,1,1,\ldots)^{T}[/math]; each row adds a node's degree to the "-1"s of its edges and those cancel to 0. This is something like the "DC" component in the Fourier transform; the second derivative does nothing because the function is a straight line across all the nodes (that's a handwaving interpretation, though).
- The incidence matrix talked about on the Wiki is like the measure of all the "out arrows", which is why multiplying it by its own transpose gives the Laplacian matrix ([math] M M^{T} = L [/math] for the oriented incidence matrix): for every out arrow there is an "in arrow", so if you pair every arrow up like that you get the Laplacian. Alternatively, the incidence matrix [math] M [/math] is the "first derivative" matrix – each edge computes a difference [math] f_{k+1} - f_{k} [/math] of its endpoints. And its product with its transpose is the Laplacian, which bodes well for our interpretation of the Laplacian as the second derivative.
Etc. I'm sorta making this up as I go but it's neat, eh?
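These properties can be spot-checked numerically. A sketch on a graph with two components, a triangle plus a single edge (the graph itself is an arbitrary choice):

```python
import numpy as np

# A graph with two connected components: a triangle (nodes 0-2)
# and a single edge (nodes 3-4).
edges = [(0, 1), (1, 2), (0, 2), (3, 4)]
A = np.zeros((5, 5))
for i, j in edges:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A

eigvals = np.linalg.eigvalsh(L)
print(eigvals)                          # all >= 0 (up to round-off)
print(np.sum(np.isclose(eigvals, 0)))   # 2 zero eigenvalues, 2 components

# Oriented incidence matrix M (one column per edge, +1/-1 at endpoints);
# M @ M.T reproduces the Laplacian: the "first derivative, squared" idea.
M = np.zeros((5, len(edges)))
for e, (i, j) in enumerate(edges):
    M[i, e], M[j, e] = 1, -1
print(np.allclose(M @ M.T, L))          # True
```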

What's the intuition behind a Laplacian matrix?