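OpenCV Documentation Study Notes (Part 2)

DFT: Discrete Fourier Transform

The Fourier transform decomposes an image into its sine and cosine components; in other words, it takes the image from the spatial domain to the frequency domain. The underlying idea is that any function can be approximated by a sum of (in general infinitely many) sine and cosine functions, and the Fourier transform is the tool that produces this decomposition.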

#include "opencv2/core.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/highgui.hpp"
#include <iostream>
using namespace cv;
using namespace std;
static void help(void)
{
 cout << endl
 << "This program demonstrates the use of the discrete Fourier transform (DFT). " << endl
 << "The DFT of an image is taken and its power spectrum is displayed." << endl
 << "Usage:" << endl
 << "./discrete_fourier_transform [image_name -- default ../data/lena.jpg]" << endl;
}
int main(int argc, char ** argv)
{
 help();
 const char* filename = argc >=2 ? argv[1] : "../data/lena.jpg";
 Mat I = imread(filename, IMREAD_GRAYSCALE);
 if( I.empty()){
 cout << "Error opening image" << endl;
 return -1;
 }
 Mat padded; //expand input image to optimal size
 int m = getOptimalDFTSize( I.rows );
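 int n = getOptimalDFTSize( I.cols ); // pad the border with zeros so the image reaches a DFT-friendly size
 copyMakeBorder(I, padded, 0, m - I.rows, 0, n - I.cols, BORDER_CONSTANT, Scalar::all(0));
 // void cv::copyMakeBorder(InputArray src, OutputArray dst, int top, int bottom, int left, int right, int borderType, const Scalar& value = Scalar())
 // Adds a border around the input image: top, bottom, left and right give the border width on each side, borderType selects the border type, and value is the fill value used with BORDER_CONSTANT.
 Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)}; // planes array: holds the real and imaginary parts of the transform
 Mat complexI;
 merge(planes, 2, complexI); // merge the two single-channel matrices into one two-channel matrix: channel 0 holds the real part, channel 1 the imaginary part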
 dft(complexI, complexI); // this way the result may fit in the source matrix
 // compute the magnitude and switch to logarithmic scale
 // => log(1 + sqrt(Re(DFT(I))^2 + Im(DFT(I))^2))
 split(complexI, planes); // planes[0] = Re(DFT(I)), planes[1] = Im(DFT(I))
 magnitude(planes[0], planes[1], planes[0]);// planes[0] = magnitude
 Mat magI = planes[0];
 magI += Scalar::all(1); // switch to logarithmic scale
 log(magI, magI);
 // crop the spectrum, if it has an odd number of rows or columns
 magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
 // rearrange the quadrants of Fourier image so that the origin is at the image center
 int cx = magI.cols/2;
 int cy = magI.rows/2;
 Mat q0(magI, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
 Mat q1(magI, Rect(cx, 0, cx, cy)); // Top-Right
 Mat q2(magI, Rect(0, cy, cx, cy)); // Bottom-Left
 Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right
 Mat tmp; // swap quadrants (Top-Left with Bottom-Right)
 q0.copyTo(tmp);
 q3.copyTo(q0);
 tmp.copyTo(q3);
 q1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
 q2.copyTo(q1);
 tmp.copyTo(q2);
 normalize(magI, magI, 0, 1, NORM_MINMAX); // Transform the matrix with float values into a
 // viewable image form (float between values 0 and 1).
 imshow("Input Image" , I ); // Show the result
 imshow("spectrum magnitude", magI);
 waitKey();
 return 0;
}

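The Fourier transform of a two-dimensional image with M rows and N columns is given by:

$$F(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-2\pi i \left( \frac{ux}{M} + \frac{vy}{N} \right)}, \qquad e^{i\theta} = \cos\theta + i\sin\theta$$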

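Here f is the image value in the spatial domain and F is the value in the frequency domain; the result of the transform is complex. To inspect it, one can view the real and imaginary parts as images, or view the magnitude and phase images. In this walkthrough only the magnitude image is of interest, because it carries the information about the image's geometric structure that we need. If you want to modify the result in the frequency domain, however, you need all of the parts.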
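To make the definition concrete, here is a minimal sketch that evaluates the 2-D DFT sum directly with the standard library. The name naiveDft2d is hypothetical and the code is not how cv::dft works internally; it simply applies the textbook double sum, which is far too slow for real images but fine for checking small cases by hand.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Hypothetical helper: direct evaluation of the 2-D DFT of a real-valued,
// row-major image. Cost grows as (rows * cols)^2, so use tiny inputs only.
std::vector<Complex> naiveDft2d(const std::vector<double>& f,
                                std::size_t rows, std::size_t cols)
{
    assert(f.size() == rows * cols);
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<Complex> F(rows * cols);
    for (std::size_t u = 0; u < rows; ++u) {
        for (std::size_t v = 0; v < cols; ++v) {
            Complex sum(0.0, 0.0);
            for (std::size_t x = 0; x < rows; ++x) {
                for (std::size_t y = 0; y < cols; ++y) {
                    // Phase of the basis function for frequency (u, v) at pixel (x, y).
                    const double angle = -twoPi *
                        (static_cast<double>(u * x) / static_cast<double>(rows) +
                         static_cast<double>(v * y) / static_cast<double>(cols));
                    sum += f[x * cols + y] * Complex(std::cos(angle), std::sin(angle));
                }
            }
            F[u * cols + v] = sum;
        }
    }
    return F;
}
```

For a constant image every coefficient except F[0] comes out as (numerically) zero, and F[0] equals the sum of all pixels, which is why the origin of the spectrum is associated with overall brightness.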

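The program above shows how to compute and display the magnitude image of a Fourier transform. A digital image is discrete, both in its sampling grid and in its pixel values, so the transform used here is the discrete one (DFT). This is the method to reach for when you want to study the structure of an image from a geometric point of view. The main steps of the program are explained in detail below:

Expanding the image to an optimal size

DFT performance depends on the size of the input image: it is fastest when each dimension is a product of powers of 2, 3 and 5 (not merely a multiple of one of them). To get the best performance, the image border is usually padded until the size meets this requirement. getOptimalDFTSize() returns the optimal size, and copyMakeBorder() then expands the image (the added pixels are initialized to 0).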
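The size rule can be illustrated with a small brute-force search. The helpers isFiveSmooth and nextFiveSmooth are hypothetical names; this is only a sketch of the idea "smallest size at or above n whose prime factors are all 2, 3 or 5", and OpenCV's own getOptimalDFTSize() may be implemented quite differently.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical helper: true if n has no prime factor other than 2, 3 and 5.
bool isFiveSmooth(int n)
{
    assert(n >= 1);
    for (int p : {2, 3, 5}) {
        while (n % p == 0) {
            n /= p;
        }
    }
    return n == 1;
}

// Hypothetical helper: smallest 5-smooth integer that is >= n (linear search).
int nextFiveSmooth(int n)
{
    assert(n >= 1);
    while (!isFiveSmooth(n)) {
        ++n;
    }
    return n;
}
```

With this sketch, a dimension of 7 is padded to 8, 13 to 15 and 17 to 18, while 512 is already suitable and stays unchanged.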

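Creating storage for the real and imaginary parts

The result of a Fourier transform is complex, so each image yields two images' worth of values (the real part and the imaginary part). In addition, the range of values in the frequency domain is much larger than in the spatial domain, so the result should be stored at least as float. The input image is therefore converted to that type, and a second channel is added to hold the imaginary part.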


Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
Mat complexI;
merge(planes, 2, complexI); // Add to the expanded another plane with zeros
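As a rough picture of what the conversion and merge accomplish, the following sketch packs an 8-bit grayscale buffer into an interleaved float buffer of (real, imaginary) pairs with the imaginary part set to zero. The function name toInterleavedComplex is hypothetical, and plain std::vector buffers stand in for Mat here; it only mirrors the layout of a two-channel matrix, not OpenCV's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: convert 8-bit pixels to float and interleave them with
// a zero imaginary part, giving the layout re0, im0, re1, im1, ...
std::vector<float> toInterleavedComplex(const std::vector<unsigned char>& gray,
                                        std::size_t rows, std::size_t cols)
{
    assert(gray.size() == rows * cols);
    std::vector<float> out(2 * gray.size(), 0.0f);
    for (std::size_t i = 0; i < gray.size(); ++i) {
        out[2 * i] = static_cast<float>(gray[i]); // real part: the pixel value
        // out[2 * i + 1] stays 0.0f: the input image has no imaginary part
    }
    return out;
}
```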

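Running the DFT and converting the real and imaginary parts to magnitude

The magnitude is computed as:

$$M = \sqrt{\mathrm{Re}(\mathrm{DFT}(I))^2 + \mathrm{Im}(\mathrm{DFT}(I))^2}$$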

dft(complexI, complexI); // this way the result may fit in the source matrix
split(complexI, planes); // planes[0] = Re(DFT(I)), planes[1] = Im(DFT(I))
magnitude(planes[0], planes[1], planes[0]);// planes[0] = magnitude
Mat magI = planes[0];

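dft() writes its result over the input variable; split() then separates the real and imaginary parts into the planes array. magnitude() applies the formula above and stores the result in planes[0].

Switching to a logarithmic scale

The dynamic range of the Fourier coefficients is too large to display directly: on a linear scale the few large values show up as white points and everything else as black, so small and rapidly changing values cannot be seen. To visualize the spectrum in gray levels, convert the linear scale to a logarithmic one, log(1 + M):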
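The per-element computation can be sketched without OpenCV as follows. The function magnitudeOf is a hypothetical stand-in that works on two plain float vectors (real parts and imaginary parts) rather than on Mat planes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: element-wise sqrt(re^2 + im^2) for two equally long planes.
std::vector<float> magnitudeOf(const std::vector<float>& re,
                               const std::vector<float>& im)
{
    assert(re.size() == im.size());
    std::vector<float> mag(re.size());
    for (std::size_t i = 0; i < re.size(); ++i) {
        mag[i] = std::hypot(re[i], im[i]); // avoids overflow in the intermediate squares
    }
    return mag;
}
```

For example, a coefficient with real part 3 and imaginary part 4 has magnitude 5, and the sign of either part does not matter.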

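magI += Scalar::all(1); // switch to logarithmic scale
log(magI, magI);

Cropping and rearranging the image

Earlier the image was padded for the sake of DFT performance. For visualization, the spectrum is now cropped to even dimensions and its quadrants are rearranged so that the origin sits at the image center: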
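A short sketch of the same mapping on a plain vector shows how strongly it compresses the range. The name toLogScale is hypothetical; the function applies log(1 + x) in place, which matches the two statements above in effect but is not OpenCV code.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: replace each non-negative magnitude x by log(1 + x).
void toLogScale(std::vector<float>& mag)
{
    for (float& x : mag) {
        assert(x >= 0.0f); // magnitudes are never negative
        x = std::log(1.0f + x);
    }
}
```

A magnitude of 0 stays at 0, while a magnitude of 999999 maps to roughly 13.8, so values that differ by six orders of magnitude end up within a range that can be shown as gray levels. Adding 1 before taking the logarithm keeps the result non-negative and avoids log(0).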

// crop the spectrum, if it has an odd number of rows or columns
magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
// rearrange the quadrants of Fourier image so that the origin is at the image center
int cx = magI.cols/2;
int cy = magI.rows/2;
Mat q0(magI, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
Mat q1(magI, Rect(cx, 0, cx, cy)); // Top-Right
Mat q2(magI, Rect(0, cy, cx, cy)); // Bottom-Left
Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right
Mat tmp; // swap quadrants (Top-Left with Bottom-Right)
q0.copyTo(tmp);
q3.copyTo(q0);
tmp.copyTo(q3);
q1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
q2.copyTo(q1);
tmp.copyTo(q2);

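The first line, magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2)), deserves a note. It crops the spectrum so that both the number of rows and the number of columns are even, which the quadrant rearrangement that follows relies on. The width and height are each bitwise ANDed with -2. A bitwise AND operates on the binary representations of the two integers, and -2 is stored in two's complement; as an 8-bit value it is 11111110. ANDing with it clears the lowest bit and leaves all other bits unchanged, so an odd size is rounded down to the nearest even number and an even size is left as it is.

Swapping the quadrants brings the four corners of the original spectrum together at the center, that is, it moves the origin of the spectrum to the center of the image, which makes it easier to analyze. The corners hold the low-frequency content of the spectrum, which describes the overall brightness and the slowly varying intensity distribution of the image. The original notes show a few screenshots at this point for reference.

Normalization

After the logarithm the values may still fall outside the range 0 to 1. cv::normalize() rescales them: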
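The bit trick is easy to check in isolation. The function evenFloor below is a hypothetical name for the expression n & -2; the sketch assumes a two's complement representation of int, which C++20 guarantees and which earlier compilers use in practice.

```cpp
#include <cassert>

// Hypothetical helper: round a non-negative size down to the nearest even number
// by clearing its lowest bit.
int evenFloor(int n)
{
    assert(n >= 0);
    return n & -2; // -2 is ...11111110 in two's complement
}
```

So evenFloor(7) gives 6 and evenFloor(8) gives 8, the same result as n - (n % 2) for non-negative n.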
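The quadrant swap can also be written for a plain row-major buffer, which makes the index arithmetic explicit. The function swapQuadrants is a hypothetical stand-in for the ROI-based copy sequence in the program; it assumes even dimensions, which is exactly what the crop step ensures.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: swap top-left with bottom-right and top-right with
// bottom-left in a row-major image whose rows and cols are both even.
void swapQuadrants(std::vector<float>& img, std::size_t rows, std::size_t cols)
{
    assert(img.size() == rows * cols);
    assert(rows % 2 == 0 && cols % 2 == 0);
    const std::size_t cy = rows / 2;
    const std::size_t cx = cols / 2;
    // Visit only the top half; each pixel is exchanged with its diagonal partner.
    for (std::size_t y = 0; y < cy; ++y) {
        for (std::size_t x = 0; x < cols; ++x) {
            const std::size_t ty = y + cy;
            const std::size_t tx = (x + cx) % cols;
            std::swap(img[y * cols + x], img[ty * cols + tx]);
        }
    }
}
```

On a 4x4 buffer the value at the top-left corner (where the DFT places the zero-frequency term) ends up at row 2, column 2, and applying the function twice restores the original layout.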

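normalize(magI, magI, 0, 1, NORM_MINMAX);

Reading and writing XML and YAML files

XML and YAML files can be used to store algorithm parameters or data, such as the training results of an SVM or a neural network. The reading and writing code is identical for both; the format is selected by the file extension alone. Opened in a text editor, though, the YAML file turns out to be more compact. This part uses OpenCV to read and write both formats, covering common data types, Mat, and user-defined types. For user-defined types, adding read and write functions makes them usable through the same interface as the built-in types. The OpenCV types involved are cv::FileStorage, cv::FileNode and cv::FileNodeIterator.

Study notes on the OpenCV documentation and its functions, core module part: the discrete Fourier transform, reading and writing XML and YAML, and parallel execution.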
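Min-max normalization itself is a linear rescaling, sketched here on a plain vector. The function minMaxNormalize is a hypothetical name; mapping a constant input to the lower bound is a choice made for this sketch and is not necessarily what cv::normalize() does in that case.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: linearly map the smallest value to lo and the largest to hi.
void minMaxNormalize(std::vector<float>& values, float lo, float hi)
{
    assert(lo <= hi);
    if (values.empty()) {
        return;
    }
    const auto mm = std::minmax_element(values.begin(), values.end());
    const float vmin = *mm.first;  // copy before the buffer is modified
    const float vmax = *mm.second;
    const float range = vmax - vmin;
    for (float& x : values) {
        // Assumption of this sketch: a constant input maps entirely to lo.
        x = (range > 0.0f) ? lo + (x - vmin) * (hi - lo) / range : lo;
    }
}
```

With lo = 0 and hi = 1, the values 2, 4, 6 become 0, 0.5, 1, which is the form imshow() expects for a floating-point image.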