Notes from Chaudhury et al. (2024), chapter 6
New highlights added June 23, 2024 at 11:05 AM
- If the inputs all have a common property, the points are not distributed randomly over the input space. Rather, they occupy a region in the input space with a definite shape.
- We then identify a probability distribution whose sample point cloud matches the shape of the (potentially transformed) training data point cloud. We can generate faux input by sampling from this distribution.
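A minimal sketch of that recipe (my illustration, not from the book), assuming the transformed point cloud is roughly Gaussian: fit a multivariate normal to the training points and sample faux inputs from it.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))  # stand-in for the (transformed) training point cloud

# Fit a multivariate Gaussian to the point cloud.
mu = X.mean(axis=0)
cov = np.cov(X, rowvar=False)

# Sample faux inputs from the fitted distribution.
faux = rng.multivariate_normal(mu, cov, size=10)
print(faux.shape)  # (10, 3)
```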
- The concept of entropy attempts to quantify the uncertainty associated with a chancy event.
- prefix coding: no color's codeword is a prefix of another color's codeword.
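A small worked example (mine, not from the book): take four colors with probabilities 1/2, 1/4, 1/8, 1/8 and codewords 0, 10, 110, 111. This is a prefix code, since no codeword is a prefix of another, and its expected code length equals the entropy of the distribution:

$$\mathbb{E}[\text{length}] = \tfrac{1}{2}\cdot 1 + \tfrac{1}{4}\cdot 2 + \tfrac{1}{8}\cdot 3 + \tfrac{1}{8}\cdot 3 = 1.75 \text{ bits} = H(p).$$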
- Entropy measures the overall uncertainty associated with a probability distribution.
- Entropy is a measure that is high if everything is more or less equally probable and low if a few items have a much higher probability than the others.
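The discrete entropy formula behind these statements is $H(p) = -\sum_i p_i \log p_i$. A quick numeric check of the "uniform is high, peaked is low" claim (illustrative code, mine, not from the book):

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy in bits of a discrete distribution p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # treat 0 * log(0) as 0
    return float(-(p * np.log2(p)).sum())

print(entropy_bits([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits: all outcomes equally likely
print(entropy_bits([0.97, 0.01, 0.01, 0.01]))  # ~0.24 bits: one outcome dominates
```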
- (6.6)
- Note: iPad support is horrible — capture eq 6.6 for multidimensional entropy
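Filling in the missing capture with the standard definition (my reconstruction; the book's eq. 6.6 presumably matches this up to notation): the differential entropy of a continuous distribution $p$ over $\mathbb{R}^n$ is

$$H(p) = -\int_{\mathbb{R}^n} p(\vec{x}) \,\ln p(\vec{x}) \, d\vec{x}.$$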
- Geometrically speaking, entropy is a function of how lopsided the PDF is (see figure 6.2).
- (6.7)
- Note: The thing to notice is that the entropy of a Gaussian is proportional only to the log of the variance.
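For reference (standard result, presumably what eq. 6.7 states): a univariate Gaussian $\mathcal{N}(\mu, \sigma^2)$ has differential entropy

$$H = \tfrac{1}{2}\ln\!\left(2\pi e \,\sigma^2\right),$$

which depends on the distribution only through $\ln \sigma^2$ and not on the mean.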
- (6.8)
- Note: Likewise, for the multivariate Gaussian, it depends only on the determinant of the covariance matrix.
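Likewise, the standard result presumably behind eq. 6.8: an $n$-dimensional Gaussian $\mathcal{N}(\vec{\mu}, \Sigma)$ has

$$H = \tfrac{1}{2}\ln\!\left((2\pi e)^n \det\Sigma\right),$$

so the covariance matrix enters only through its determinant.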
New highlights added June 24, 2024 at 11:06 AM
- large contributions can happen only in case 1b, where p_gt(i) is high and p_pred(i) is low, that is, p_gt and p_pred are very dissimilar.
- if there is dissimilarity, the cross-entropy is high.
- We should look at cross-entropy as a dissimilarity with an offset.
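One standard identity that makes "dissimilarity with an offset" concrete (my notation, not a quote from the book):

$$H(p_{gt}, p_{pred}) = -\sum_i p_{gt}(i)\,\log p_{pred}(i) = \underbrace{H(p_{gt})}_{\text{offset, fixed by the data}} + \underbrace{D_{KL}\!\left(p_{gt}\,\|\,p_{pred}\right)}_{\text{dissimilarity}}.$$

Minimizing cross-entropy over $p_{pred}$ therefore minimizes the KL dissimilarity, since the offset $H(p_{gt})$ does not depend on the prediction.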