A Mini, Fool-Proof Guide to Understanding Eigenvectors and Eigenvalues
When I was still studying for my Bachelor’s degree in engineering, I didn’t particularly enjoy learning linear algebra. Sure, it was relatively easy compared to the other topics covered in the math classes I had to take, maybe even the easiest if I’m being honest. However, it was crammed into the engineering math class (where it was the very first topic covered) by a tired professor who couldn’t care less whether his students understood it or not, so instead of truly understanding the concepts, I ended up finding the whole thing tedious. It was especially bad with the more advanced concepts like eigenvectors and eigenvalues: I practically had no idea how they worked or what they even were in the first place. All I knew was that they had intimidating German-sounding names. As a result, I never bothered to gain a complete understanding of the subject and just brute-forced my way through the class to get it over with.
It wasn’t until I started working as a data scientist that I gained a real appreciation for how versatile linear algebra is as a field. There are so many things you can do using the concepts and techniques of linear algebra alone. Among them, eigenvectors and eigenvalues together are especially powerful to leverage. However, these concepts can be tricky to teach, seeing as they’re relatively advanced, like I mentioned before. So, with that in mind, I tried to come up with a way to explain them intuitively, and why people should pay more attention to them, and I think I’ve found one that can be easily understood even by people with minimal knowledge of linear algebra.
Just as a disclaimer, though: I’m only attempting to explain eigenvectors and eigenvalues at a high level (i.e., not bothering too much with the technicalities of formal math). The aim is to give people an overall understanding of what they are, how they work, and how one can use them to solve various practical problems, particularly in data science. If you want a deeper understanding and the more “math-y” nitty-gritty details, you’ll have to take an actual linear algebra class, since it’s virtually impossible to cover everything comprehensively in a single post.
Linear Algebra, Vectors, and Linear Transformations
Before I begin my explanation of eigenvectors and eigenvalues, let me briefly go over linear algebra, vectors, and linear transformations, because they’re crucial to a better understanding of what eigenvalues and eigenvectors are.
Linear algebra is, simply put, the branch of math that deals with vectors (as opposed to scalars) and anything related to them, like vector spaces and linear transformations. In contrast to a scalar, which only has a magnitude, a vector can be defined as an object that has both magnitude and direction. You can think of it as an arrow, where the direction the arrow points represents the vector’s direction in space and its length represents its magnitude.
Vectors can be manipulated using linear transformations. A linear transformation changes a vector in a systematic way: stretching it (or, conversely, compressing it), rotating it, reflecting it over a specified line (in 2D) or plane (in 3D), and so on. You can even combine several of these manipulations at once. In practice, a linear transformation is represented by a matrix, and applying it to a vector is just matrix multiplication. From this alone, you can probably already tell how much practical value they have as a mathematical tool.
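To make this concrete, here’s a minimal sketch in Python with NumPy (my choice for illustration; the post itself doesn’t prescribe any tooling) showing a few 2D transformations written as matrices. The specific vector and matrices are arbitrary examples:

```python
# A minimal sketch of matrices acting as linear transformations on a 2D vector.
import numpy as np

v = np.array([1.0, 2.0])            # an arbitrary 2D vector

stretch = np.array([[2.0, 0.0],     # stretches the x-component by 2, leaves y alone
                    [0.0, 1.0]])

theta = np.pi / 2                   # a 90-degree rotation
rotate = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])

reflect_x = np.array([[1.0,  0.0],  # reflection over the x-axis
                      [0.0, -1.0]])

print(stretch @ v)                  # [2. 2.]
print(rotate @ v)                   # approximately [-2. 1.]
print(reflect_x @ v)                # [ 1. -2.]
print((rotate @ stretch) @ v)       # a combined transformation: stretch, then rotate
```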
Eigenvalues and Eigenvectors
How do eigenvalues and eigenvectors factor into all of this, then? To answer that, I’d like to use a mirror analogy: imagine there’s a mirror placed at an angle in a room. When you look at the reflections of the objects in the mirror, some objects might appear different from others. In this analogy, the mirror represents a linear transformation, while the objects in the room represent vectors. The reflection in the mirror shows how the transformation affects these vectors.
Now, imagine that you stand in front of the mirror with your hand pointing straight up. In the reflection, your hand might appear longer (or shorter), but it still points in the same direction, which is up. This hand, whose reflected length changes while its direction stays the same, represents an eigenvector of the transformation. The factor by which its length changes represents the eigenvalue. In contrast, if you point your hand at an angle toward the mirror (neither along it nor straight at it), you’ll see that its reflection doesn’t just change in length; it points in a genuinely different direction. That means your hand is no longer an eigenvector; it’s just your good ol’ regular vector.
Since we now have a clear picture of these concepts from the analogy, let’s define them more formally. Essentially, an eigenvector of a linear transformation is a special vector whose direction doesn’t change when that particular transformation is applied to it; the transformation only scales it. (The prefix “eigen” is German for “own” or “characteristic”, so an eigenvector is, roughly, a transformation’s own characteristic vector.) The eigenvalue, on the other hand, refers to the factor by which its eigenvector is scaled.
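In matrix terms, which is the standard way to state this: if A is the matrix representing the transformation, then a nonzero vector v is an eigenvector with eigenvalue λ exactly when A·v = λ·v. Below is a quick sketch of that relation using NumPy’s np.linalg.eig; the matrix A is just an arbitrary example:

```python
# A small illustration of the defining relation A @ v == lam * v.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # an arbitrary example matrix

eigenvalues, eigenvectors = np.linalg.eig(A)   # columns of `eigenvectors` are the eigenvectors

for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    lam = eigenvalues[i]
    # Applying A only scales v by lam; the direction stays the same.
    print(np.allclose(A @ v, lam * v))         # True
```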
Eigenvectors and eigenvalues provide a way to understand linear transformations more deeply. How exactly do they do that? Well, linear transformations can often be complex, involving several manipulations at once. By identifying the eigenvectors and eigenvalues, we can cut through this complexity by breaking the transformation down into its most basic action: scaling along specific directions. Some examples to illustrate:
- If an eigenvalue is greater than 1, the transformation stretches vectors along that eigenvector’s direction.
- If it is between 0 and 1, the transformation compresses them along that direction.
- If it is negative, the transformation flips them along that direction.
So on and so forth. With this understanding, we can do a lot of things involving complex transformations. For instance, you can decompose a vector into a combination of the transformation’s eigenvectors. Another use is simplifying repeated transformations: if you need to apply the same transformation multiple times, its eigenvalues and eigenvectors can simplify the process significantly, as the sketch below shows.
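Here’s a rough sketch of that last point, again with NumPy and an arbitrary example matrix. It relies on the eigendecomposition A = P D P^-1 (with the eigenvectors as the columns of P and the eigenvalues on the diagonal of D), which assumes A is diagonalizable; raising A to a power then only requires raising each eigenvalue to that power:

```python
# Sketch: applying a transformation k times via its eigendecomposition.
# Assumes A is diagonalizable; A and k are arbitrary illustrative choices.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
k = 10

eigenvalues, P = np.linalg.eig(A)      # P's columns are the eigenvectors
D_k = np.diag(eigenvalues ** k)        # raising the diagonal matrix to a power
                                       # just raises each eigenvalue to that power
A_k = P @ D_k @ np.linalg.inv(P)       # A^k = P D^k P^-1

print(np.allclose(A_k, np.linalg.matrix_power(A, k)))   # True
```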
Practical Application of Eigenvalues and Eigenvectors: Dimensionality Reduction
One popular application of eigenvalues and eigenvectors, particularly in the field of data science, is dimensionality reduction. To understand what it is and why it matters, consider this case: imagine you have a dataset with hundreds of features, where the features are the dimensions of the data. Processing the data with its original number of features might not be efficient or effective (e.g., there might be too much noise), let alone manageable, so you might want to reduce the dimensions without losing too much of the relevant information the data contains. This is where dimensionality reduction comes into play: instead of just cutting dimensions away, you find a new representation that captures the data about as well, only with fewer dimensions.
How do eigenvectors and eigenvalues help here? Well, one such dimensionality reduction technique is Principal Component Analysis (commonly shortened to PCA), which works by transforming the data into a new coordinate system such that the greatest variance by any projection of the data comes to lie on the first few coordinates (called the principal components). To perform this technique, you first need the covariance matrix of the data you’re working with; the covariance matrix captures the variance of each feature and the covariance between features. Once you have the covariance matrix, you find its eigenvectors and calculate their eigenvalues. The eigenvectors represent the directions (the principal components) and the eigenvalues represent the magnitude of the variance along those directions. Sort the eigenvectors by their eigenvalues from largest to smallest, and you have the principal components ranked by how much variance they capture. Depending on how many dimensions you want, the top eigenvectors form the basis of the reduced-dimensional space that captures the most significant variance in the data, and you can use them as the new axes to represent your data.
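Putting those steps together, here’s a bare-bones sketch of PCA using nothing but NumPy. The data is randomly generated purely for illustration, and in practice you’d more likely reach for a ready-made implementation such as scikit-learn’s PCA:

```python
# Bare-bones PCA via the eigenvectors of the covariance matrix.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))            # 200 samples, 5 features (illustrative data)

X_centered = X - X.mean(axis=0)          # center each feature
cov = np.cov(X_centered, rowvar=False)   # 5x5 covariance matrix

eigenvalues, eigenvectors = np.linalg.eigh(cov)   # eigh: covariance matrices are symmetric

order = np.argsort(eigenvalues)[::-1]    # sort from largest to smallest eigenvalue
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

n_components = 2                         # how many dimensions we want to keep
components = eigenvectors[:, :n_components]   # top eigenvectors = principal components

X_reduced = X_centered @ components      # project the data onto the new axes
print(X_reduced.shape)                   # (200, 2)
print(eigenvalues / eigenvalues.sum())   # fraction of variance captured by each component
```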
Closing Statement
I hope this mini guide has helped you gain a better, more intuitive understanding of eigenvectors and eigenvalues, along with how they’re used. If you have any suggestions on how to improve this post, please don’t hesitate to let me know. I’d really appreciate it!
I would also like to thank and recommend the Mathematics for Machine Learning course and 3Blue1Brown for providing intuitive explanations of eigenvectors and eigenvalues. Please check them out as well!