February 5, 2006

Tensor Products

The first time I saw the axioms for tensor products of vector spaces I was confused to say the least. I kept thinking about why the particular axioms were chosen. Recently I have developed a better understanding of where the axioms come from.

We begin our considerations with 2 by 2 matrices [tex]A[/tex] and [tex]B[/tex] over a field [tex]\mathbb{F}[/tex], where
[tex] A= \left ( \begin{array}{cc} a_{11} & a_{12} \\ a_{21} & a_{22} \end{array} \right ) \quad \textrm{and} \quad B= \left ( \begin{array}{cc} b_{11} & b_{12} \\ b_{21} & b_{22} \end{array} \right )[/tex]

The tensor (or Kronecker) product of [tex]A[/tex] and [tex]B[/tex] is defined as

[tex] A \otimes B = \left ( \begin{array}{cc} a_{11} B & a_{12} B \\ a_{21} B & a_{22} B \end{array} \right ) [/tex]

[tex] = \left ( \begin{array}{cccc}a_{11} b_{11} & a_{11} b _{12} & a _{12} b_{11} & a _{12}b _{12} \\ a_{11} b_{21} & a_{11} b_{22} & a _{12} b_{21} & a _{12}b_{22} \\ a_{21} b_{11} & a_{21} b _{12} & a_{22} b_{11} & a_{22}b_{12} \\ a_{21} b_{21} & a_{21} b_{22} & a_{22} b_{21} & a_{22} b_{22} \end{array} \right )[/tex].

The tensor product of [tex]B[/tex] with [tex]A [/tex] is
[tex] B \otimes A = \left ( \begin{array}{cc} b_{11} A & b _{12}A \\ b_{21} A & b_{22}A \end{array} \right )[/tex]

[tex] = \left ( \begin{array}{cccc} b_{11} a_{11} & b_{11} a _{12} & b _{12} a_{11} & b _{12} a _{12} \\ b_{11} a_{21} & b_{11} a_{22} & b _{12} a_{21} & b _{12} a_{22} \\ b_{21} a_{11} & b_{21}a _{12} & b_{22} a_{11} & b_{22}a _{12} \\ b_{21}a_{21} & b_{21}a_{22} & b_{22} a_{21} & b_{22}a_{22} \end{array} \right )[/tex] .

In general [tex]A \otimes B \ne B \otimes A[/tex], so the tensor product is non-commutative [1] (it is, however, associative).
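The block definition is easy to check numerically. The following is a minimal sketch in plain Python; the helper `kron` and the sample entries of `A` and `B` are made up for illustration and are not part of the original post.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows.

    The row built from row i of A and row k of B has entries
    a_ij * b_kl, which is exactly the block layout (a_ij B).
    """
    return [
        [a * b for a in row_a for b in row_b]
        for row_a in A
        for row_b in B
    ]

# Illustrative 2 by 2 matrices (arbitrary integer entries).
A = [[1, 2],
     [3, 4]]
B = [[0, 5],
     [6, 7]]

AB = kron(A, B)
BA = kron(B, A)

for row in AB:
    print(row)

# The two products contain the same entries in a different arrangement.
print("A (x) B == B (x) A ?", AB == BA)
```

For these sample matrices the two 4 by 4 results differ, which illustrates the non-commutativity noted above.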

We now consider the effect of multiplication by a scalar [tex] \nu \in \mathbb F[/tex] . We set

[tex] \nu A = \nu \left ( \begin{array}{cc} a_{11} & a_{12} \\ a_{21} & a_{22} \end{array}\right ) = \left ( \begin{array}{cc} \nu a_{11} & \nu a_{12} \\ \nu a_{21} & \nu a_{22} \end{array} \right) [/tex]

Now
[tex] \nu A \otimes B = \left ( \begin{array}{cc} \nu a_{11} & \nu a_{12} \\ \nu a_{21} & \nu a_{22} \end{array}\right) \otimes \left ( \begin{array}{cc} b_{11} & b_{12} \\ b_{21} & b_{22} \end{array}\right) [/tex]
[tex] = \left ( \begin{array}{cc} \nu a_{11} B & \nu a_{12} B \\ \nu a_{21} B & \nu a_{22} B \end{array} \right ) [/tex]

[tex] = \left ( \begin{array}{cccc} \nu a_{11} b_{11} & \nu a_{11} b _{12} & \nu a _{12} b_{11} & \nu a _{12}b _{12} \\ \nu a_{11} b_{21} & \nu a_{11} b_{22} & \nu a _{12} b_{21} & \nu a _{12}b_{22} \\ \nu a_{21} b_{11} & \nu a_{21} b _{12} & \nu a_{22} b_{11} & \nu a_{22}b_{12} \\ \nu a_{21} b_{21} & \nu a_{21} b_{22} & \nu a_{22} b_{21} & \nu a_{22} b_{22} \end{array} \right )[/tex]

[tex] = \nu \left ( \begin{array}{cccc} a_{11} b_{11} & a_{11} b _{12} & a _{12} b_{11} & a _{12}b _{12} \\ a_{11} b_{21} & a_{11} b_{22} & a _{12} b_{21} & a _{12}b_{22} \\ a_{21} b_{11} & a_{21} b _{12} & a_{22} b_{11} & a_{22}b_{12} \\ a_{21} b_{21} & a_{21} b_{22} & a_{22} b_{21} & a_{22} b_{22} \end{array} \right )[/tex]

[tex] = \nu (A \otimes B) = A \otimes \nu B [/tex] .

For an arbitrary [tex]m \times n[/tex] matrix [tex]A = (a_{ij})[/tex] and an arbitrary matrix [tex]B = (b_{kl})[/tex] the tensor product takes the form
[tex]A \otimes B = \left( \begin{array}{ccc} a_{11} B & \cdots & a_{1n} B \\ \vdots & \ddots & \vdots \\ a_{m1}B & \cdots & a_{mn}B\end{array} \right)[/tex]

For column vectors, [tex]u=\left( \begin{array}{c}u_1 \\ \vdots \\ u_j\end{array} \right), v= \left( \begin{array}{c} v_1 \\ \vdots \\ v_k \end{array} \right)[/tex] the tensor product is of the same general form.

A column vector is simply a [tex]k \times 1[/tex] matrix and the tensor product of [tex]u[/tex] with [tex]v[/tex] is

[tex]u \otimes v =\left( \begin{array}{c}u_1 v\\ \vdots \\ u_j v\end{array} \right)[/tex].
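
As a small worked instance (the numbers are chosen purely for illustration), take a vector [tex]u[/tex] with two entries and a vector [tex]v[/tex] with three:

[tex] u = \left( \begin{array}{c} 1 \\ 2 \end{array} \right), \quad v = \left( \begin{array}{c} 3 \\ 4 \\ 5 \end{array} \right), \quad u \otimes v = \left( \begin{array}{c} 1 \cdot v \\ 2 \cdot v \end{array} \right) = \left( \begin{array}{c} 3 \\ 4 \\ 5 \\ 6 \\ 8 \\ 10 \end{array} \right) [/tex].

The result has [tex]2 \cdot 3 = 6[/tex] entries, one for each pair of an entry of [tex]u[/tex] with an entry of [tex]v[/tex].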

We can begin abstracting a bit now. Let [tex]V [/tex] and [tex]W[/tex] be two finite-dimensional vector spaces over [tex]\mathbb{F}[/tex]. We define the tensor product [tex]V \otimes W[/tex] as the vector space spanned by all formal products [tex]v \otimes w[/tex] , [tex]v \in V[/tex] , [tex]w \in W [/tex] where [tex]\otimes [/tex] satisfies

(i) [tex] ( v_1 + v_2 ) \otimes w = v_1 \otimes w + v_2 \otimes w[/tex]

(ii) [tex] v \otimes ( w_1 + w_2 ) = v \otimes w_1 + v \otimes w_2 [/tex]

(iii) [tex] \nu ( v \otimes w ) = ( \nu v ) \otimes w = v \otimes ( \nu w ) [/tex],

for all [tex]v, v_1, v_2 \in V, w,w_1,w_2 \in W, \nu \in \mathbb{F}[/tex].

The first two parts of the definition are straightforward: we want [tex]\otimes[/tex] to be additive in both the first and the second argument. Together with the third axiom this makes [tex]\otimes[/tex] bilinear. The third axiom is the most confusing one. As you can see from the result [tex]\nu A \otimes B = \nu (A \otimes B) = A \otimes \nu B[/tex] derived for the Kronecker product of matrices, the third axiom makes sense if we are going to model tensor products algebraically.
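
Axioms (i) to (iii) can be spot-checked on coordinate vectors. This is only a sketch: the helpers `tensor`, `add` and `scale`, the sample vectors and the scalar `nu` are all assumptions made for illustration, and a check on a few numbers is of course not a proof.

```python
from fractions import Fraction

def tensor(v, w):
    """Coordinates of v (x) w, ordered as in the Kronecker product of column vectors."""
    return [vi * wj for vi in v for wj in w]

def add(x, y):
    """Entrywise sum of two coordinate lists of equal length."""
    return [a + b for a, b in zip(x, y)]

def scale(c, x):
    """Multiply every coordinate by the scalar c."""
    return [c * a for a in x]

# Illustrative vectors in a 2-dimensional V and a 3-dimensional W.
v1, v2 = [1, 2], [3, -1]
w1, w2 = [2, 0, 5], [1, 4, -2]
nu = Fraction(3, 2)  # exact rational scalar, so equality tests are exact

axiom_i = tensor(add(v1, v2), w1) == add(tensor(v1, w1), tensor(v2, w1))
axiom_ii = tensor(v1, add(w1, w2)) == add(tensor(v1, w1), tensor(v1, w2))
axiom_iii = (
    scale(nu, tensor(v1, w1))
    == tensor(scale(nu, v1), w1)
    == tensor(v1, scale(nu, w1))
)

print(axiom_i, axiom_ii, axiom_iii)
```

Exact fractions are used instead of floats so that the three expressions in axiom (iii) compare equal without rounding issues.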

Note that a general element of [tex]V \otimes W[/tex] is of the form [tex]\sum_{i,j} \alpha_{ij} v_i \otimes w_j[/tex] and cannot in general be expressed in the simpler form [tex]v \otimes w [/tex], for some [tex]v \in V, w \in W[/tex].

For example, let [tex]V[/tex] be a 2-dimensional vector space with basis [tex]\{e_1, e_2\}[/tex], and consider the space [tex]V \otimes V[/tex]. A typical element of [tex]V \otimes V[/tex] is
[tex]e_1 \otimes e_2 + e_2 \otimes e_1[/tex].
It cannot be expressed as a tensor product of two vectors, for suppose that there are vectors [tex]v, v' \in V[/tex] such that
[tex] v \otimes v' = e_1 \otimes e_2 + e_2 \otimes e_1 [/tex].
Now since [tex]\{e_1,e_2\}[/tex] is a basis of [tex]V[/tex] we can express [tex]v,v'[/tex] in terms of the basis vectors as
[tex]v = \alpha_1 e_1 + \alpha_2 e_2,\quad v' =\beta_1 e_1 + \beta_2 e_2[/tex]
where [tex]\alpha_1,\alpha_2, \beta_1, \beta_2 \in \mathbb{F}[/tex], so that

[tex]v \otimes v' = (\alpha_1 e_1 + \alpha_2 e_2) \otimes (\beta_1 e_1 + \beta_2 e_2)[/tex]
[tex] =\alpha_1 \beta_1 \ e_1 \otimes e_1 + \alpha_1 \beta_2 \ e_1 \otimes e_2 + \alpha_2 \beta_1 \ e_2 \otimes e_1 + \alpha_2 \beta_2 \ e_2 \otimes e_2 [/tex]
[tex]= e_1 \otimes e_2 + e_2 \otimes e_1[/tex].

Now by equating coefficients we must have [tex]\alpha_1 \beta_1=0[/tex], which implies that either [tex]\alpha_1 = 0[/tex] or [tex]\beta_1 = 0[/tex], and hence that either [tex]\alpha_1 \beta_2 =0[/tex] or [tex]\alpha_2 \beta_1 =0[/tex]. But both of these coefficients must equal 1, so there is no solution. Hence there are no vectors [tex]v,v'[/tex] such that [tex] v \otimes v' = e_1 \otimes e_2 + e_2 \otimes e_1 [/tex].
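
The coefficient comparison can be packaged as a quick test. This is a sketch under the assumption that [tex]V[/tex] is 2-dimensional, as in the example; the symbols [tex]c_{ij}[/tex] are introduced here only for illustration. If [tex]\sum_{i,j} c_{ij} \ e_i \otimes e_j = v \otimes v'[/tex] with [tex]v = \alpha_1 e_1 + \alpha_2 e_2[/tex] and [tex]v' = \beta_1 e_1 + \beta_2 e_2[/tex], then [tex]c_{ij} = \alpha_i \beta_j[/tex], and so

[tex] c_{11} c_{22} = \alpha_1 \beta_1 \alpha_2 \beta_2 = c_{12} c_{21} [/tex].

For [tex]e_1 \otimes e_2 + e_2 \otimes e_1[/tex] we have [tex]c_{11} c_{22} = 0[/tex] while [tex]c_{12} c_{21} = 1[/tex], so the condition fails and no such pair of vectors exists.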

The tensor product comes up a lot in quantum mechanics, for example in the addition of angular momenta.
It is also the basis of entanglement in quantum mechanics, a state such as
[tex] | \uparrow \ \rangle \otimes |\downarrow \ \rangle + |\downarrow \ \rangle\otimes |\uparrow\ \rangle [/tex]
is called an entangled state because it cannot be written as a single tensor product of two states. To see this, just replace [tex]e_1[/tex] with [tex]| \uparrow \ \rangle[/tex] (spin up) and [tex]e_2[/tex] with [tex]| \downarrow \ \rangle[/tex] (spin down) in the previous argument.

A state such as
[tex] |\uparrow\ \rangle \otimes |\uparrow \ \rangle + |\downarrow\ \rangle\otimes |\uparrow \ \rangle [/tex] is not entangled because it can be expressed as [tex] (1|\uparrow \ \rangle + 1|\downarrow \ \rangle) \otimes ( 1|\uparrow \ \rangle + 0|\downarrow \ \rangle )[/tex].
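
Both states can be checked with a few lines of Python. The sketch below identifies spin up with the coordinate vector (1, 0) and spin down with (0, 1). It uses a determinant test on the 2 by 2 array of coefficients, a standard criterion swapped in here rather than the argument used in the post. The names `tensor` and `is_product` are hypothetical helpers.

```python
def tensor(v, w):
    """Coordinates of v (x) w on e1(x)e1, e1(x)e2, e2(x)e1, e2(x)e2."""
    return [vi * wj for vi in v for wj in w]

def is_product(c):
    """True if the two-spin state with coefficients c = [c11, c12, c21, c22]
    can be written as a single product v (x) w.

    For a 2-dimensional space this happens exactly when the coefficient
    matrix [[c11, c12], [c21, c22]] has determinant zero.
    """
    c11, c12, c21, c22 = c
    return c11 * c22 - c12 * c21 == 0

up, down = [1, 0], [0, 1]

# up (x) down + down (x) up
entangled = [a + b for a, b in zip(tensor(up, down), tensor(down, up))]
# up (x) up + down (x) up
separable = [a + b for a, b in zip(tensor(up, up), tensor(down, up))]

print(is_product(entangled))  # determinant is -1, so not a product
print(is_product(separable))  # determinant is 0, so a product

# The explicit factorisation quoted in the text: (up + down) (x) up
print(separable == tensor([1, 1], [1, 0]))
```

The determinant shortcut is specific to two 2-dimensional factors; for larger spaces one would compare the rank of the coefficient matrix with 1 instead.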


[1]. To see how [tex]A \otimes B[/tex] and [tex]B \otimes A[/tex] are related in the 2 by 2 case, we introduce the permutation operator [tex]T[/tex], where
[tex] T = \left ( \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{array} \right )[/tex].

Note that [tex]T^2 =I[/tex], so that [tex]T = T^{-1}[/tex]. A quick calculation shows that
[tex] A \otimes B = T \cdot (B \otimes A ) \cdot T^{-1} [/tex].
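
The "quick calculation" can be reproduced numerically. This is a sketch in plain Python with made-up sample matrices; `kron` and `matmul` are small helpers written for this illustration.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a * b for a in row_a for b in row_b]
        for row_a in A
        for row_b in B
    ]

def matmul(X, Y):
    """Ordinary matrix product of X and Y (lists of rows)."""
    return [
        [sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
        for row in X
    ]

# The permutation matrix T from the footnote.
T = [[1, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]

# Illustrative 2 by 2 matrices (arbitrary integer entries).
A = [[1, 2],
     [3, 4]]
B = [[0, 5],
     [6, 7]]

identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# T is its own inverse, so T^{-1} can be replaced by T.
print(matmul(T, T) == identity)

lhs = kron(A, B)
rhs = matmul(matmul(T, kron(B, A)), T)
print(lhs == rhs)
```

Multiplying by [tex]T[/tex] on the left swaps the two middle rows and multiplying on the right swaps the two middle columns, which is exactly the rearrangement that turns [tex]B \otimes A[/tex] into [tex]A \otimes B[/tex].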

7 Comments »

  1. Hey Tel,

    Hey, the definition of the tensor product of two matrices V and W is dependent on the basis of V \otimes W you choose, isn’t it? Ok, you’re implicitly choosing the basis in writing out V \otimes W. The way I’ve always dealt with it is just to be really clear about which basis I’m using. You know, I never actually learnt that “the definition of V \otimes W is…” I just worked it out myself!

    Cheers,
    Sach

    Comment by Sacha — February 5, 2006 @ 11:25 pm

  2. Sorry, I should have said “is dependent on the ordering of the basis vector of V \otimes W you choose” :-)

    Cheers,
    Sach

    Comment by Sacha Blumen — February 6, 2006 @ 12:56 am

  3. Hi Tel,

    Hey, like you, I write the components of a matrix A as aij where i is the row and j is the column.

    Now, in Penrose’s tome, he writes it as aji. Do you know if this common physicists notation?

    When I invented my own tensor notation, I wrote the action of the matrix A on a basis vector vi as
    ajivj (assuming the summation convention), but in Penrose’s scheme, it is
    aijvj.

    Comment by Sacha — March 26, 2006 @ 7:35 pm

  4. Hi Sach,

    This notation is common in differential geometry and hence relativity (and hence Penrose).

    Depending on what I’m doing I usually write a_{ij} or a^i_j. For a lot of things the latter is more natural. (Especially when working with quantum doubles).

    If you define your matrix as A = a^i_j, then the upper index is indicating the row number and the lower the column number. For a vector, which has n rows but only one column, the index should go in the upstairs location for consistency v = v^j. The transformation equation then becomes

    v’^i = a^i_j v^j,

    where we’ve followed Einstein’s summation convention of summing over dummy indices (provided one index is in the floor and the other index is in the ceiling).

    So, what’s v_j I hear you ask ? Well, it’s an element of the dual space, a co-vector or 1-form. Skipping a lot of details (and precision), the two are related by the Kronecker delta which we use to change vectors into co-vectors and vice versa

    v_i = \delta_{ij} v^j.

    Maybe you’ve had a sneaking suspicion all along about the transpose of a column vector :).

    (Drinfeld’s quantum double combines a Hopf algebra H (actually H^{cop}) with its dual H* to create a new Hopf algebra which is always quasi-triangular.)

    Tel

    Comment by Tel — March 26, 2006 @ 10:23 pm

  5. Yes, writing A vi as

    aji vj

    (assuming the summation convention) seems more straightforward to me. Employing upper and lower indices makes it much easier to keep a grip on summations.

    Comment by Sacha — March 27, 2006 @ 2:55 am

  6. Hi Tel,

    Re: your para ‘If you define your matrix as A = a^i_j, then the upper index is indicating the row number and the lower the column number. For a vector, which has n rows but only one column, the index should go in the upstairs location for consistency v = v^j.’

    I agree - this makes more sense.

    Comment by Sacha — March 27, 2006 @ 3:53 pm

  7. Actually, for consistency I should write a column vector as

    vi

    and the action of the matrix A on vi as

    aji vj (assuming the summation convention).

    Comment by Sacha — March 29, 2006 @ 12:37 am
