Friday, September 30, 2005

 

Some Image Processing related Websites

Source: unknown

1. Annotated Computer Vision Bibliography
An annotated bibliography of references for computer vision, along with image processing and other related topics.

http://iris.usc.edu/Vision-Notes/bibliography/contents.html

2. Netpbm
Netpbm is a toolkit for manipulation of graphic images, including
conversion of images between a variety of different formats. There
are over 220 separate tools in the package including converters for
about 100 graphics formats. Examples of the sort of image
manipulation we're talking about are: Shrinking an image by 10%;
Cutting the top half off of an image; Making a mirror image; Creating
a sequence of images that fade from one image to another.

http://netpbm.sourceforge.net/

(Honestly, before taking Digital Image Processing this semester I had never heard of Netpbm. Only after the course did I learn how useful it actually is, and how many people support it.)

3. Moth -A 2D Graphing Application
Moth is an application to graph data in two dimensions. It can read from different data sources (currently text files, rrd files, and mysql databases). It's built on libart and freetype, so the text and drawing are anti-aliased and the colors support alpha levels.

http://moth.sourceforge.net/

(Note: the biggest selling point of this code is that it stores data as XML, which makes plotting scriptable and easy to extend. Personally, I really like this idea.)

4. Gnuplot Central

http://www.gnuplot.info/

gnuplot is a command-driven interactive function plotting program. It can be used to plot functions and data points in both two- and three-dimensional plots in many different formats, and will accommodate many of the needs of today's scientists for graphic data representation. gnuplot is copyrighted, but freely distributable; you don't have to pay for it.

(Note: a classic vector-graphics tool. If you plan to write your own plotting software, it is worth first studying how gnuplot defines its data structures and program framework; you can then improve on its weak spots and build products like Octave (http://www.octave.org/ ) or SoftIntegration (http://www.softintegration.com/ ).)

5. paintlib
paintlib is a portable C class library for image loading, saving and manipulation. Images can be loaded from BMP, GIF, JPEG, PCX, PGM, PICT, PNG, PSD, TGA, TIFF and WMF files and saved in BMP, JPEG, PNG and TIFF formats. Image manipulation can be done either through filters implemented in filter classes or by directly accessing the bitmap bits. Full C source is provided. This is library version 2.50, 26/01/03. The newest version of paintlib can be found via http://www.paintlib.de/paintlib/ .

http://www.paintlib.de/paintlib/

(Note: this library was also recommended during the digital image processing course. It is clean and straightforward, well documented, and supports both VC and GCC well; a fairly mature open-source project. BTW, the several German open-source codebases I have read are all written clearly, without clever tricks, and are comfortable to read, maintain, and extend. That is worth encouraging.)

6. OpenDX
OpenDX is a uniquely powerful, full-featured software package for the visualization of scientific, engineering and analytical data: Its open system design is built on familiar standard interface environments. And its sophisticated data model provides users with great flexibility in creating visualizations.

http://www.opendx.org/index2.php

(Note: another open-source project advertised straight to my inbox: the source-code release of IBM Visualization Data Explorer. My impression is that its website hosts a large gallery of high-quality images (and you can apply to upload your own), which is handy for testing how your own software behaves and comparing results against it.)

7.
The web site of the leading digital image processing books and other educational resources
http://www.imageprocessingbook.com/index_dip2e.htm
http://www.imageprocessingbook.com/index_dipum.htm

Rich image resources and MATLAB function files; a very useful site.

8.
http://www-2.cs.cmu.edu/~cil/vision.html
The homepage of the Computer Vision Group at Carnegie Mellon University. It provides very comprehensive material, from paper downloads to demo programs, test images, common links, related software and hardware, and even a search engine.

9.
http://www.cmis.csiro.au/IAP/zimage.htm
A site focused on image analysis; average overall, but it provides an image-analysis environment: ZIMAGE and SZIMAGE.

10.
http://www.via.cornell.edu/
The computer vision and image analysis group at Cornell University, apparently in the Department of Electrical and Computer Engineering. It focuses on medical research, but hosts quite good resources; since the site is under active construction, you can track new information there.

11.
Statistical Pattern and Image Analysis Area
http://www2.parc.com/istl/groups/did/didoverview.shtml
It has a very interesting project: DID (Document Image Decoding). Several of the company's other projects are also worth a look.

12.
http://www.fmrib.ox.ac.uk/analysis/
Main research: Brain Extraction Tool, nonlinear noise reduction, linear image registration, automated segmentation, structural brain change analysis, motion correction, etc.

13.
http://www.cse.msu.edu/prip/
The Pattern Recognition and Image Processing group in computer and electrical engineering at Michigan State University; its FTP server hosts many papers (NEW).

14.
http://pandora.inf.uni-jena.de/p/e/index.html
A German digital image processing research group; you can find some good link collections on its site.

15.
http://www-staff.it.uts.edu.au/~sean/CVCC.dir/home.html
CVIP (formerly CVCC, for Computer Vision and Cluster Computing) is a research group focusing on cluster-based computer vision within the Spiral Architecture.

16.
http://cfia.gmu.edu/
The mission of the Center for Image Analysis is to foster multi-disciplinary research in image, multimedia and related technologies by establishing links between academic institutes, industry and government agencies, and to transfer key technologies to help industry build next generation commercial and military imaging and multimedia systems.

17.
http://peipa.essex.ac.uk/info/groups.html
Use it to search for well-known computer vision research groups (CV Groups) around the world. Highly recommended.

18.
Image-Based Measurement and Analysis Research Projects
http://www.ph.tn.tudelft.nl/Research/

19.
http://iraf.noao.edu/
Welcome to the IRAF Homepage! IRAF is the Image Reduction and Analysis Facility, a general purpose software system for the reduction and analysis of astronomical data.

20.
http://entropy.brneurosci.org/tnimage.html
A very good image-processing tool for Unix systems; take a look at its screenshots. You could build your own special-purpose image-processing toolkit on top of it.

21.
http://www.ifp.uiuc.edu/yrui_ifp_home/
The Image Formation and Processing (IFP) group at University of Illinois at Urbana-Champaign

22.
http://www.evisual.org/
Welcome to the Center for Image Processing in Education

23.
http://www.cs.washington.edu/research/metip/metip.html
Mathematics Experiences Through Image Processing (METIP)

24.
http://www.cdsp.neu.edu/
The Center for Digital Signal Processing at Northeastern University.

25.
http://foulard.ee.cornell.edu/
The Visual Communications Lab at Cornell University: visual communication, digital imaging, image processing, image communication, and measurable transmission of graphics.

26.
http://sipi.usc.edu/
The Signal and Image Processing Institute at the University of Southern California.

27.
http://noodle.med.yale.edu/
Yale's image processing and analysis research group.

28.
http://www-dsp.rice.edu/
The Digital Signal Processing group at Rice University: progress on wavelets, filter design, and algorithm development, with related research reports and software.

29.
http://ilab.usc.edu/research/
The main fundamental research focus of the lab is in using computational modeling to gain insight into biological brain function.

Wednesday, September 21, 2005

 

Some notes on Matlab Image Processing - 1

1.
Reading 16-, 14-, and 12-bit grayscale BMP images with MATLAB

By darnshong
From http://www.52blog.net/user1/4566/archives/2004/72461.shtml

16-, 14-, and 12-bit grayscale BMP images (*.bmp) are not a standard format: the BMP format defines 16-bit color images, but there is no such thing as a 16-bit grayscale BMP. Such images are therefore generally written out as 16-bit color images, which is why they look like color images on screen, even though the data written into them is grayscale.

MATLAB can read a 16-bit BMP, but what it returns is the color data as MATLAB interprets it: the R, G, and B components are each uint8, already mapped into the range 0~255. That is not the raw 16-bit data stored in the BMP; some processing is needed to recover it. Once the raw 16-bit values are recovered, you have the gray levels of those non-standard grayscale BMPs (16-, 14-, or 12-bit).

If you want to store 16-, 14-, or 12-bit grayscale images without the computer interpreting them as color, save them as *.png or *.tiff; both formats support 16-bit grayscale.

One more remark: current displays are at best 24-bit true color (8 bits per RGB channel), and displaying gray requires R = G = B, so in effect only 8 bits are available for gray levels; grayscale images deeper than 8 bits cannot be displayed correctly. No wonder a 16-bit grayscale image looks less clear after conversion from BMP to PNG: as a 16-bit BMP the computer displays it as a color image, while as a 16-bit grayscale PNG it is displayed in (8-bit) gray.

There are two 16-bit BMP layouts: R5G5B5, with the highest bit unused, and R5G6B5.

Here is MATLAB source code implementing the above:

function b=read16graybmp(filename)
%usage: read16graybmp(filename)
%designed by darnshong ioe,cas
if(nargin~=1)
error('Need 1 parameter!');
end;
i=0:2^5-1;index5=floor(i*255.99/(2^5-1));%for 5bits
%index5=[0 8 16 24 33 41 49 57 66 74 82 90 99 107 115 123 132 140 148 156 165 173 181 189 198 206 214 222 231 239 247 255];
value5(index5+1)=0:2^5-1;
j=0:2^6-1;index6=floor(j*255.99/(2^6-1));%for 6bits
value6(index6+1)=0:2^6-1;
minfo=imfinfo(filename);
%if(strcmp(minfo.CompressionType,'none'))%for 555 format R5G5B5
a=imread(filename);
b1=value5(a(:,:,1)+1);
b2=value5(a(:,:,2)+1);
b3=value5(a(:,:,3)+1);
b=bitshift(uint16(b1),10)+bitshift(uint16(b2),5)+uint16(b3);
% else % for 565 format R5G6B5, not supported by matlab!
%a=imread(filename);
%b1=value5(a(:,:,1)+1);
%b2=value6(a(:,:,2)+1);
% b3=value5(a(:,:,3)+1);
% b=bitshift(uint16(b3),11)+bitshift(uint16(b2),5)+uint16(b1);
%end;
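
A hypothetical round trip (my addition; the file name is made up): recover the raw gray levels from a 555-format file and re-save them as a true 16-bit grayscale PNG, which viewers then display as gray rather than color:

g = read16graybmp('gray16.bmp'); % uint16 with 15 significant bits (X1R5G5B5)
imwrite(g, 'gray16.png', 'BitDepth', 16); % PNG supports 16-bit grayscale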

2.
Converting ordinary files to and from image files

By darnshong
From http://www.52blog.net/user1/4566/archives/2004/51659.shtml

Many forums nowadays offer image hosting: you can upload images you like and share them with others. Sharing arbitrary files that are neither images nor text is harder. So I wrote a few MATLAB functions that convert an ordinary file (of any format) into an image file in certain formats (*.bmp, *.png, *.ras, *.tif, *.tiff). With such a converted image you can share any file: the other party just saves the image and uses the functions below to restore the original file exactly, so arbitrary files can be shared through image files.

Here are the functions I wrote (file2imgdata, file2img, imgdata2file, img2file):

function dat=file2imgdata(filename,imgwidth)
%%dat=file2imgdata(filename,imgwidth)
%convert the common file to image data
%%--------------------the image map-------------------
% |-----------------the image width-------------------|
% |'csz'(3)+extension(20)+patch with '0'+file size(?)-|
% |the uint8 format data,acquired from the source file|
% |...................................................|
% |....................no data,patch with uint8(0)----|
% designed by darnshong,ioe,cas
if(~exist(filename,'file'))
error('The file does not exist!');
end;
[fpath,fname,fileext]=fileparts(filename);% get the file'extension
fileext=fileext(2:numel(fileext));
extnum=uint8(fileext);
if(numel(extnum)<20)
temp=zeros(1,20);
temp(1:numel(extnum))=extnum;
temp((numel(extnum)+1):20)=uint8('.');
extnum=temp;
else
error('the length of the file extension is larger than 20');
end;
%%%%%%% get the source file size

s=dir(filename);
fsize=s.bytes;%get the file size.
fsize=num2str(uint32(fsize),'%d');
fsize=uint8(fsize);
extralen=imgwidth-3-20;
if(numel(fsize)<=extralen)
temp=zeros(1,extralen);
temp(1:extralen-numel(fsize))=uint8('0');
temp((extralen+1-numel(fsize)):extralen)=fsize;
fsize=temp;
else
error('the length of the file size is too large!');

end;
dat=[];
dat(1,:)=[uint8('czs'),extnum,fsize];
fsrc=fopen(filename,'r');
totalline=uint32(ceil(s.bytes/imgwidth));
for i=1:totalline-1
dat(end+1,:)=fread(fsrc,imgwidth);
end;
temp=fread(fsrc,imgwidth); %not enough data,so patch with 0;
temp1=zeros(1,imgwidth);
temp1(1:numel(temp))=temp;
dat(end+1,:)=temp1;
fclose(fsrc);
dat=uint8(dat);

%%%%%%%%%%%%%%%%%%%%%%%%%%%

function yy=file2img(commonfile,imgfile,imgwidth)
%yy=file2img(commonfile,imgfile,imgwidth)
%convert the common file like .exe,.doc,to the image file.
%thus you can post your common file as a image file.
% designed by darnshong,ioe,cas
if(nargin~=3)
error('Input parameters error! Requires 3 parameters!');
end;
yy=file2imgdata(commonfile,imgwidth);
finddot=find(imgfile=='.');
imgext=imgfile(finddot(1)+1:end);
if~(strcmp(imgext,'bmp') || strcmp(imgext,'png') || strcmp(imgext,'ras') ||...
strcmp(imgext,'tif') || strcmp(imgext,'tiff'))
error('Not support image format!The image file format should be *.bmp,*.png,*.ras,*.tif,*.tiff');
end;
imwrite(yy,imgfile);

%%%%%%%%%%%%%%%%%%%%%%%%%%

function imgdata2file(dat,filenoext)
%%%% imgdata2file(dat,filenoext)
% ----dat,the image data
% convert the imagedata to the common file
%%dat=file2imgdata(filename,imgwidth)
%%--------------------the image map-------------------
% |-----------------the image width-------------------|
% |'czs'(3)+extension(20)+patch with '0'+file size(?)-|
% |the uint8 format data,acquired from the source file|
% |...................................................|
% |....................no data,patch with uint8(0)----|
% designed by darnshong,ioe,cas

testczs=dat(1,1:3);%test if the image data is valid.
testczs=char(testczs);
if(~strcmp(testczs,'czs'))
error('wrong image data!');
end
imgwidth=size(dat,2);
finddot=find(dat(1,:)==uint8('.'));
ext=dat(1,4:finddot(1)-1);%get the common file's extension
ext=char(ext);
fsize=dat(1,(finddot(end)+1):imgwidth);%get the common file's size

fsize=str2double(char(fsize));
dstfile=strcat(filenoext,'.',ext);
fdst=fopen(dstfile,'w');
totalline=uint32(ceil(fsize/imgwidth));
for i=2:totalline,
fwrite(fdst,dat(i,:),'uint8');
end;
lastline=dat(totalline+1,1:(fsize-imgwidth*(totalline-1)));
fwrite(fdst,lastline,'uint8');
fclose(fdst);

%%%%%%%%%%%%%%%%%%%%%%%%%%

function yy=img2file(imgfile,commonfilenoext)
%%% yy=img2file(imgfile,commonfilenoext)
% convert the image file to the common file
% thus you can restore the common file from the image file.
% designed by darnshong,ioe,cas
if(nargin~=2)
error('Input parameters error! Requires 2 parameters!');
end;
finddot=find(imgfile=='.');
imgext=imgfile(finddot(1)+1:end);
if~(strcmp(imgext,'bmp') || strcmp(imgext,'png') || strcmp(imgext,'ras') ||...
strcmp(imgext,'tif') || strcmp(imgext,'tiff'))%only support some image format
error('Not support image format!The image file format should be *.bmp,*.png,*.ras,*.tif,*.tiff');
end; %%%% because the *.jpg image format is lossy compression,it can't be used.
yy=imread(imgfile);
imgdata2file(yy,commonfilenoext);

%%%%%%%%%%%%%%%%%%%%%%%%%%
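
A hypothetical round trip (my addition; the file names are made up): pack an archive into a 256-pixel-wide PNG, then restore it:

file2img('report.zip', 'report.png', 256); % any file -> image file
img2file('report.png', 'restored'); % restores the data as restored.zip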

3.
发现matlab不支持16位bmp图像的几种特殊格式

By darnshong
From http://www.52blog.net/user1/4566/archives/2005/153011.shtml

今天用photoshop做了几幅不同格式的16位bmp图像,包括X1R5G5B5格式、R5G6B5格式和X4R4G4B4格式。然后我用matlab的imread函数来读取,只有X1R5G5B5可以正确读取,其他两种格式无法读取,一读取就出错。我的matlab是7.01版本的,看来还是不支持R5G6B5和X4R4G4B4这两种格式的16位bmp图像!

4.
Converting a series of images into an AVI movie

By darnshong
From http://www.52blog.net/user1/4566/archives/2005/153011.shtml

My advisor has a number of image sequences that he wanted turned into movies. A quick look at the MATLAB help made it easy to convert them into an AVI movie. Here is the code:

function produceavifrompic(pfrom,pto,pext,navi)
aviobj = avifile(navi);
aviobj.Quality = 100;
aviobj.compression='None';
cola=0:1/255:1;
cola=[cola;cola;cola];%% grayscale colormap
cola=cola';
aviobj.colormap=cola;
for i=pfrom:pto
fname=strcat(num2str(i),pext)
adata=imread(fname);
aviobj = addframe(aviobj,uint8(adata));
end
aviobj=close(aviobj);

The above converts a series of 8-bit grayscale images into an AVI movie; for color images, use the following:

function produceavifrompic(pfrom,pto,pext,navi)
aviobj = avifile(navi);
aviobj.Quality = 100;
aviobj.compression='None';
for i=pfrom:pto
fname=strcat(num2str(i),pext)
adata=imread(fname);
aviobj = addframe(aviobj,uint8(adata));
end
aviobj=close(aviobj);

Pretty simple! Go try it!
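
For example, with frames named 1.bmp ... 30.bmp in the current directory (a hypothetical call, my addition):

produceavifrompic(1, 30, '.bmp', 'out.avi');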

5.
Extracting images from an AVI file

By darnshong
From http://www.52blog.net/user1/4566/archives/2005/190925.shtml

function avi2pic(avifile,pickind)
%function avi2pic(avifile,pickind)
% avifile-- the avi filename,like 'darnshong.avi','ioe.avi',etc;
% pickind-- the kind of image format,like 'jpg','bmp',etc
% supported export image
% format:'jpg','jpeg','bmp','tiff','tif','gif','png',etc
mov=aviread(avifile);
temp=size(mov);
fnum=temp(2);
for i=1:fnum,
strtemp=strcat(int2str(i),'.',pickind);
imwrite(mov(i).cdata(:,:,:),mov(i).colormap,strtemp);
end,
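
A hypothetical call (my addition): dump every frame of out.avi as numbered images 1.png, 2.png, ...:

avi2pic('out.avi', 'png');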

6.
A MATLAB implementation of the ZIGZAG scan
From http://blog.csdn.net/hunnish/archive/2004/10/29/158885.aspx

Reposted from assuredigit: an implementation of the MPEG ZIG-ZAG scan in MATLAB. I found it worth studying, and the implementation is quite clever.

Here is a version that follows the method provided by the MPEG reference code:

===
function b=zigzag(a)
% Based on the MPEG source code provided by the University of California.
% Copyright (c) 1995 The Regents of the University of California.

[n,m]=size(a);
if(n~=8 || m~=8)
error('Input array is NOT 8-by-8');
end

% Set up array for fast conversion from row/column coordinates to
% zig zag order. Indices start at zero because the table was copied from the MPEG C code.
zigzag = [ 0, 1, 8, 16, 9, 2, 3, 10, ...
17, 24, 32, 25, 18, 11, 4, 5, ...
12, 19, 26, 33, 40, 48, 41, 34, ...
27, 20, 13, 6, 7, 14, 21, 28, ...
35, 42, 49, 56, 57, 50, 43, 36, ...
29, 22, 15, 23, 30, 37, 44, 51, ...
58, 59, 52, 45, 38, 31, 39, 46, ...
53, 60, 61, 54, 47, 55, 62, 63];

zigzag = zigzag + 1; % shift to MATLAB's 1-based indexing
aa = reshape(a,1,64); % flatten the input block into a 1x64 vector
b = aa(zigzag); % look up elements of aa through the table to get the zig-zag scan

===


Sample run:

>> a=magic(8)
a =
64 2 3 61 60 6 7 57
9 55 54 12 13 51 50 16
17 47 46 20 21 43 42 24
40 26 27 37 36 30 31 33
32 34 35 29 28 38 39 25
41 23 22 44 45 19 18 48
49 15 14 52 53 11 10 56
8 58 59 5 4 62 63 1

>> b=zigzag(a)
b =
Columns 1 through 12
64 9 2 3 55 17 40 47 54 61 60 12
Columns 13 through 24
46 26 32 41 34 27 20 13 6 7 51 21
Columns 25 through 36
37 35 23 49 8 15 22 29 36 43 50 57
Columns 37 through 48
16 42 30 28 44 14 58 59 52 45 38 31
Columns 49 through 60
24 33 39 19 53 5 4 11 18 25 48 10
Columns 61 through 64
62 63 56 1
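
For completeness, a sketch of the inverse mapping (my addition, not in the original post): it simply scatters the scanned values back through the same table.

function a=izigzag(b)
% Rebuild the 8x8 block from the 1x64 zig-zag vector (inverse of zigzag above).
if(numel(b)~=64)
error('Input vector is NOT 1-by-64');
end
zigzag = [ 0, 1, 8, 16, 9, 2, 3, 10, ...
17, 24, 32, 25, 18, 11, 4, 5, ...
12, 19, 26, 33, 40, 48, 41, 34, ...
27, 20, 13, 6, 7, 14, 21, 28, ...
35, 42, 49, 56, 57, 50, 43, 36, ...
29, 22, 15, 23, 30, 37, 44, 51, ...
58, 59, 52, 45, 38, 31, 39, 46, ...
53, 60, 61, 54, 47, 55, 62, 63] + 1;
aa=zeros(1,64);
aa(zigzag)=b; % scatter the scanned values back to their original positions
a=reshape(aa,8,8);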

Wednesday, September 14, 2005

 

Some notes on OpenCV - 4

1.
Computing an adaptive binarization threshold with the OTSU method
From http://blog.csdn.net/hunnish/archive/2004/09/02/92086.aspx

/*
The OTSU algorithm is arguably the simplest efficient way to compute a single adaptive threshold (for converting a grayscale image to a binary one). The code below was originally provided by Ryan Dibble and subsequently modified and corrected by several people, including Joerg Schulenburg and R.Z. Liu.

Reposted from: http://forum.assuredigit.com/display_topic_threads.asp?ForumID=8&TopicID=3480

The algorithm analyzes the histogram of the input grayscale image and splits it into two parts such that the distance between the two parts is maximal. The split point is the resulting threshold.

parameter: *image --- buffer for image
rows, cols --- size of image
x0, y0, dx, dy --- region of vector used for computing threshold
vvv --- debug option; if 0, no debug information is output
*/
/*======================================================================*/
/* OTSU global thresholding routine */
/* takes a 2D unsigned char array pointer, number of rows, and */
/* number of cols in the array. returns the value of the threshold */
/*======================================================================*/
int otsu (unsigned char *image, int rows, int cols, int x0, int y0, int dx, int dy, int vvv)
{

unsigned char *np; // image pointer
int thresholdValue=1; // threshold value
int ihist[256]; // image histogram, 256 bins

int i, j, k; // various counters
int n, n1, n2, gmin, gmax;
double m1, m2, sum, csum, fmax, sb;

// zero out the histogram
memset(ihist, 0, sizeof(ihist));

gmin=255; gmax=0;
// build the histogram
for (i = y0 + 1; i < y0 + dy - 1; i++) {
np = &image[i*cols+x0+1];
for (j = x0 + 1; j < x0 + dx - 1; j++) {
ihist[*np]++;
if(*np > gmax) gmax=*np;
if(*np < gmin) gmin=*np;
np++; /* next pixel */
}
}

// set up everything
sum = csum = 0.0;
n = 0;

for (k = 0; k <= 255; k++) {
sum += (double) k * (double) ihist[k]; /* x*f(x): cumulative moment */
n += ihist[k]; /* f(x): total pixel count */
}

if (!n) {
// if n has no value, there is problems...
fprintf (stderr, "NOT NORMAL thresholdValue = 160\n");
return (160);
}

// do the otsu global thresholding method
fmax = -1.0;
n1 = 0;
for (k = 0; k < 255; k++) {
n1 += ihist[k];
if (!n1) { continue; }
n2 = n - n1;
if (n2 == 0) { break; }
csum += (double) k *ihist[k];
m1 = csum / n1;
m2 = (sum - csum) / n2;
sb = (double) n1 *(double) n2 *(m1 - m2) * (m1 - m2);
/* bbg: note: can be optimized. */
if (sb > fmax) {
fmax = sb;
thresholdValue = k;
}
}

// at this point we have our thresholding value

// debug code to display thresholding values
if ( vvv & 1 )
fprintf(stderr,"# OTSU: thresholdValue = %d gmin=%d gmax=%d\n",
thresholdValue, gmin, gmax);

return(thresholdValue);
}
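
A minimal usage sketch (my addition, not part of the original post): load an image as 8-bit grayscale and binarize it at the computed threshold. Note that otsu() above indexes rows as i*cols, i.e. it assumes a buffer without row padding, so widthStep must equal width:

IplImage *gray = cvLoadImage( "input.bmp", 0 ); /* force 8-bit grayscale */
if( gray && gray->widthStep == gray->width )
{
    int t = otsu( (unsigned char*)gray->imageData, gray->height, gray->width,
                  0, 0, gray->width, gray->height, 1 );
    cvThreshold( gray, gray, t, 255, CV_THRESH_BINARY );
    cvSaveImage( "binary.bmp", gray );
}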

2.
The FLOOD FILL method for color image segmentation (source code)
From http://blog.csdn.net/hunnish/archive/2004/10/11/131783.aspx

Below is the FLOOD FILL sample that ships with OpenCV b4.0; it can perform simple color image segmentation.

#ifdef _CH_
#pragma package
#endif

#ifndef _EiC
#include "cv.h"
#include "highgui.h"
#include <stdio.h>
#include <stdlib.h>
#endif

IplImage* color_img0;
IplImage* mask;
IplImage* color_img;
IplImage* gray_img0 = NULL;
IplImage* gray_img = NULL;
int ffill_case = 1;
int lo_diff = 20, up_diff = 20;
int connectivity = 4;
int is_color = 1;
int is_mask = 0;
int new_mask_val = 255;

void on_mouse( int event, int x, int y, int flags )
{
if( !color_img )
return;

switch( event )
{
case CV_EVENT_LBUTTONDOWN:
{
CvPoint seed = cvPoint(x,y);
int lo = ffill_case == 0 ? 0 : lo_diff;
int up = ffill_case == 0 ? 0 : up_diff;
int flags = connectivity + (new_mask_val << 8) +
(ffill_case == 1 ? CV_FLOODFILL_FIXED_RANGE : 0);
int b = rand() & 255, g = rand() & 255, r = rand() & 255;
CvConnectedComp comp;

if( is_mask )
cvThreshold( mask, mask, 1, 128, CV_THRESH_BINARY );

if( is_color )
{
CvScalar color = CV_RGB( r, g, b );
cvFloodFill( color_img, seed, color, CV_RGB( lo, lo, lo ),
CV_RGB( up, up, up ), &comp, flags, is_mask ? mask : NULL );
cvShowImage( "image", color_img );
}
else
{
CvScalar brightness = cvRealScalar((r*2 + g*7 + b + 5)/10);
cvFloodFill( gray_img, seed, brightness, cvRealScalar(lo),
cvRealScalar(up), &comp, flags, is_mask ? mask : NULL );
cvShowImage( "image", gray_img );
}

printf("%g pixels were repainted\n", comp.area );

if( is_mask )
cvShowImage( "mask", mask );
}
break;
}
}

int main( int argc, char** argv )
{
char* filename = argc >= 2 ? argv[1] : (char*)"fruits.jpg";

if( (color_img0 = cvLoadImage(filename,1)) == 0 )
return 0;

printf( "Hot keys: \n"
"\tESC - quit the program\n"
"\tc - switch color/grayscale mode\n"
"\tm - switch mask mode\n"
"\tr - restore the original image\n"
"\ts - use null-range floodfill\n"
"\tf - use gradient floodfill with fixed(absolute) range\n"
"\tg - use gradient floodfill with floating(relative) range\n"
"\t4 - use 4-connectivity mode\n"
"\t8 - use 8-connectivity mode\n" );

color_img = cvCloneImage( color_img0 );
gray_img0 = cvCreateImage( cvSize(color_img->width, color_img->height), 8, 1 );
cvCvtColor( color_img, gray_img0, CV_BGR2GRAY );
gray_img = cvCloneImage( gray_img0 );
mask = cvCreateImage( cvSize(color_img->width + 2, color_img->height + 2), 8, 1 );

cvNamedWindow( "image", 0 );
cvCreateTrackbar( "lo_diff", "image", &lo_diff, 255, NULL );
cvCreateTrackbar( "up_diff", "image", &up_diff, 255, NULL );

cvSetMouseCallback( "image", on_mouse );

for(;;)
{
int c;

if( is_color )
cvShowImage( "image", color_img );
else
cvShowImage( "image", gray_img );

c = cvWaitKey(0);
switch( c )
{
case '\x1b':
printf("Exiting ...\n");
goto exit_main;
case 'c':
if( is_color )
{
printf("Grayscale mode is set\n");
cvCvtColor( color_img, gray_img, CV_BGR2GRAY );
is_color = 0;
}
else
{
printf("Color mode is set\n");
cvCopy( color_img0, color_img, NULL );
cvZero( mask );
is_color = 1;
}
break;
case 'm':
if( is_mask )
{
cvDestroyWindow( "mask" );
is_mask = 0;
}
else
{
cvNamedWindow( "mask", 0 );
cvZero( mask );
cvShowImage( "mask", mask );
is_mask = 1;
}
break;
case 'r':
printf("Original image is restored\n");
cvCopy( color_img0, color_img, NULL );
cvCopy( gray_img0, gray_img, NULL );
cvZero( mask );
break;
case 's':
printf("Simple floodfill mode is set\n");
ffill_case = 0;
break;
case 'f':
printf("Fixed Range floodfill mode is set\n");
ffill_case = 1;
break;
case 'g':
printf("Gradient (floating range) floodfill mode is set\n");
ffill_case = 2;
break;
case '4':
printf("4-connectivity mode is set\n");
connectivity = 4;
break;
case '8':
printf("8-connectivity mode is set\n");
connectivity = 8;
break;
}
}

exit_main:

cvDestroyWindow( "test" );
cvReleaseImage( &gray_img );
cvReleaseImage( &gray_img0 );
cvReleaseImage( &color_img );
cvReleaseImage( &color_img0 );
cvReleaseImage( &mask );

return 1;
}

#ifdef _EiC
main(1,"ffilldemo.c");
#endif

3.
Histogram of a single-channel image (C/C++ source code)
From http://blog.csdn.net/hunnish/archive/2004/10/13/134501.aspx

Compute and draw the histogram of a single-channel image. Drawing a histogram in MATLAB is trivial, but in a C environment it suddenly becomes a problem: there are many possible implementations, and you have to code one yourself. Fortunately there is OpenCV. The code below requires OpenCV b4.0 and compiles under VC6.

Reposted from assuredigit

//
// compute and draw the histogram of a single-channel image
//

#include "cv.h"
#include "highgui.h"
#include <stdio.h>
#include <stdlib.h>

int main( int argc, char** argv )
{
IplImage *src = 0;
IplImage *histimg = 0;
CvHistogram *hist = 0;

int hdims = 50; // number of histogram bins; more bins give finer resolution
float hranges_arr[] = {0,255};
float* hranges = hranges_arr;
int bin_w;
float max_val;
int i;

if( argc != 2 || (src=cvLoadImage(argv[1], 0)) == NULL) // force to gray image
return -1;

cvNamedWindow( "Histogram", 1 );
hist = cvCreateHist( 1, &hdims, CV_HIST_ARRAY, &hranges, 1 ); // create the histogram
histimg = cvCreateImage( cvSize(320,200), 8, 3 );

cvZero( histimg );

cvCalcHist( &src, hist, 0, 0 ); // compute the histogram
cvGetMinMaxHistValue( hist, 0, &max_val, 0, 0 ); // find the maximum bin value only
cvConvertScale( hist->bins, hist->bins, max_val ? 255. / max_val : 0., 0 ); // scale the bins into [0,255]

cvZero( histimg );
bin_w = histimg->width / hdims; // hdims bars, each bin_w pixels wide

// draw the histogram
for( i = 0; i < hdims; i++ )
{
double val = ( cvGetReal1D(hist->bins,i)*histimg->height/255 );
CvScalar color = CV_RGB(255,255,0); //(hsv2rgb(i*180.f/hdims);
cvRectangle( histimg, cvPoint(i*bin_w,histimg->height),
cvPoint((i+1)*bin_w,(int)(histimg->height - val)),
color, 1, 8, 0 );
}

cvShowImage( "Histogram", histimg );
cvWaitKey(0);

cvDestroyWindow("Histogram");
cvReleaseImage( &src );
cvReleaseImage( &histimg );
cvReleaseHist ( &hist );

return 0;
}

4.
Histogram equalization of digital images (C/C++ source code)
From http://blog.csdn.net/hunnish/archive/2004/10/14/136003.aspx

Histogram equalization is a common image enhancement method: it runs automatically, without manual intervention, and usually gives satisfactory results. The program below implements it with OpenCV functions. It requires OpenCV b4.0 and compiles under VC6.

//
// perform histogram equalization for a single-channel image
// AssureDigit Sample code
//


#include "cv.h"
#include "highgui.h"

#define HDIM 256 // number of histogram bins, default = 256

int main( int argc, char** argv )
{
IplImage *src = 0, *dst = 0;
CvHistogram *hist = 0;

int n = HDIM;
double nn[HDIM];
uchar T[HDIM];
CvMat *T_mat;

int x;
int sum = 0; // total number of pixels in the source image
double val = 0;

if( argc != 2 || (src=cvLoadImage(argv[1], 0)) == NULL) // force to gray image
return -1;

cvNamedWindow( "source", 1 );
cvNamedWindow( "result", 1 );

// calculate the histogram
hist = cvCreateHist( 1, &n, CV_HIST_ARRAY, 0, 1 );
cvCalcHist( &src, hist, 0, 0 );

// build the cumulative distribution function (CDF) of the histogram
val = 0;
for ( x = 0; x < n; x++)
{
val = val + cvGetReal1D (hist->bins, x);
nn[x] = val;
}

// compute the discrete form of the intensity transformation
sum = src->height * src->width;
for( x = 0; x < n; x++ )
{
T[x] = (uchar) (255 * nn[x] / sum); // range is [0,255]
}

// Do intensity transform for source image
dst = cvCloneImage( src );
T_mat = cvCreateMatHeader( 1, 256, CV_8UC1 );
cvSetData( T_mat, T, 0 );
// apply the mapping directly with the built-in look-up-table function
cvLUT( src, dst, T_mat );

cvShowImage( "source", src );
cvShowImage( "result", dst );
cvWaitKey(0);

cvDestroyWindow("source");
cvDestroyWindow("result");
cvReleaseImage( &src );
cvReleaseImage( &dst );
cvReleaseHist ( &hist );

return 0;
}

5.
Adjusting image intensity values (C/C++ source code)
From http://blog.csdn.net/hunnish/archive/2004/09/23/114398.aspx

Pixel-value transforms for images, covering brightness, contrast, and gamma correction; the environment is OpenCV b4.0 with VC6.0. The algorithm follows the MATLAB function imadjust.

//
// perform intensity adjustment for a single-channel image
//

#include "cv.h"
#include "highgui.h"

/*
Reference for correspondent MATLAB function: imadjust
IMADJUST Adjust image intensity values or colormap.
J = IMADJUST(I,[LOW_IN HIGH_IN],[LOW_OUT HIGH_OUT],GAMMA) maps the
values in intensity image I to new values in J such that values between
LOW_IN and HIGH_IN map to values between LOW_OUT and HIGH_OUT. Values
below LOW_IN and above HIGH_IN are clipped; that is, values below LOW_IN
map to LOW_OUT, and those above HIGH_IN map to HIGH_OUT. You can use an
empty matrix ([]) for [LOW_IN HIGH_IN] or for [LOW_OUT HIGH_OUT] to
specify the default of [0 1]. GAMMA specifies the shape of the curve
describing the relationship between the values in I and J. If GAMMA is
less than 1, the mapping is weighted toward higher (brighter) output
values. If GAMMA is greater than 1, the mapping is weighted toward lower
(darker) output values. If you omit the argument, GAMMA defaults to 1
(linear mapping).

Note that if HIGH_OUT < LOW_OUT, the output image is reversed, as in a
photographic negative.
====
src and dst are grayscale, 8-bit images;
Default input value:
[low, high] = [0,1];
[bottom, top] = [0,1];
gamma = 1;
if adjust successfully, return 0, otherwise, return non-zero.
Author: R.Z.Liu, 18/09/04
====
*/
int ImageAdjust(IplImage* src, IplImage* dst,
double low, double high, // low and high are the intensities of src
double bottom, double top, // mapped to bottom and top of dst
double gamma )
{
double low2 = low*255;
double high2 = high*255;
double bottom2 = bottom*255;
double top2 = top*255;
double err_in = high2 - low2;
double err_out = top2 - bottom2;

int x,y;
double val;

if( low<0 || low>1 || high<0 || high>1 || bottom<0 || bottom>1 || top<0 || top>1 )
return 1;

// intensity transform
for( y = 0; y < src->height; y++)
{
for (x = 0; x < src->width; x++)
{
val = ((uchar*)(src->imageData + src->widthStep*y))[x];
val = pow((val - low2)/err_in, gamma) * err_out + bottom2;
if(val>255) val=255; if(val<0) val=0; // clip the result to [0,255]
((uchar*)(dst->imageData + dst->widthStep*y))[x] = (uchar) val;
}
}
return 0;
}

This is far too slow; it only demonstrates the idea of the algorithm.
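
A faster variant is to build the mapping once as a 256-entry lookup table and apply it with cvLUT, the same way the histogram-equalization sample above does. A sketch (my addition, not from the original post; same [low,high] -> [bottom,top] gamma mapping, with the input clipped first):

void ImageAdjustLUT( IplImage* src, IplImage* dst,
                     double low, double high,
                     double bottom, double top, double gamma )
{
    uchar lut[256];
    CvMat* lut_mat = cvCreateMatHeader( 1, 256, CV_8UC1 );
    int i;
    for( i = 0; i < 256; i++ )
    {
        double v = (i/255.0 - low) / (high - low); /* normalize [low,high] to [0,1] */
        v = v < 0 ? 0 : (v > 1 ? 1 : v);           /* clip the input range */
        v = pow( v, gamma )*(top - bottom) + bottom; /* gamma, then output range */
        lut[i] = (uchar) cvRound( v*255 );
    }
    cvSetData( lut_mat, lut, 0 );
    cvLUT( src, dst, lut_mat ); /* one table lookup per pixel */
    cvReleaseMat( &lut_mat );
}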

Tuesday, September 13, 2005

 

Some notes on OpenCV - 3

1.
Image processing part of the OpenCV user manual (1): gradients, edges and corners (translation)
From http://blog.csdn.net/hunnish/archive/2004/09/03/93171.aspx

Below is the gradients/edges/corners section from the image-processing part of the OpenCV user manual; corrections are welcome. The original text is at:

http://www.assuredigit.com/incoming/sourcecode/opencv/chinese_docs/ref/opencvref_cv.htm

Note:
This chapter describes functions for image processing and analysis. Most of them work on 2D arrays, so we use "array" to refer to an "image"; an image does not have to be an IplImage, it may also be a CvMat or CvMatND.


--------------------------------------------------------------------------------

Gradients, Edges and Corners
Translation: HUNNISH, assuredigit



--------------------------------------------------------------------------------


Sobel
Calculates the first, second, third or mixed image derivatives using an extended Sobel operator

void cvSobel( const CvArr* src, CvArr* dst, int xorder, int yorder, int aperture_size=3 );


src
Source image.
dst
Destination image.
xorder
Order of the derivative in x.
yorder
Order of the derivative in y.
aperture_size
Size of the extended Sobel kernel; must be 1, 3, 5 or 7. Except for size 1, an aperture_size × aperture_size separable kernel is used to calculate the derivative. For aperture_size=1 a 3x1 or 1x3 kernel is used (no Gaussian smoothing). There is a special value CV_SCHARR (=-1) corresponding to the 3x3 Scharr filter, which may give more accurate results than a 3x3 Sobel. The Scharr filter coefficients are:
| -3 0 3|
|-10 0 10|
| -3 0 3|

for the x-direction, and the transposed matrix for the y-direction.
The function cvSobel calculates the image derivative by convolving the image with the appropriate kernel:

dst(x,y) = ∂^(xorder+yorder) src / (∂x^xorder ∂y^yorder) |(x,y)

The Sobel operator combines Gaussian smoothing with differentiation, which makes the result more robust to noise. Most often the function is called with (xorder=1, yorder=0, aperture_size=3) or (xorder=0, yorder=1, aperture_size=3) to calculate the first x- or y- image derivative. The first case corresponds to the kernel:

|-1 0 1|
|-2 0 2|
|-1 0 1|

and the second case corresponds to the kernel

|-1 -2 -1|
| 0 0 0|
| 1 2 1|
or
| 1 2 1|
| 0 0 0|
|-1 -2 -1|

depending on the image origin (the origin field of the IplImage structure). No scaling is done, so the destination image usually contains larger values; to prevent overflow, the function requires a 16-bit destination image when the input is 8-bit. The result can be converted back to 8-bit with cvConvertScale or cvConvertScaleAbs. Besides 8-bit images, the function can also process 32-bit floating-point images. Both source and destination must be single-channel images of equal size or equal ROI size.


--------------------------------------------------------------------------------

Laplace
Calculates the Laplacian of an image

void cvLaplace( const CvArr* src, CvArr* dst, int aperture_size=3 );


src
Source image.
dst
Destination image.
aperture_size
Aperture size (same meaning as in cvSobel).
The function cvLaplace calculates the Laplacian of the source image by summing the second x and y derivatives calculated with the Sobel operator:

dst(x,y) = ∂²src/∂x² + ∂²src/∂y²

Specifying aperture_size=1 gives the fastest variant, equivalent to convolving the image with the following kernel:

|0 1 0|
|1 -4 1|
|0 1 0|

Similar to cvSobel, no scaling is done, and the same combinations of input and output image types are supported.


--------------------------------------------------------------------------------

Canny
Finds edges in an image using the Canny algorithm

void cvCanny( const CvArr* image, CvArr* edges, double threshold1,
double threshold2, int aperture_size=3 );


image
Input image.
edges
Output image containing the detected edges.
threshold1
The first threshold.
threshold2
The second threshold.
aperture_size
Aperture size of the Sobel operator (see cvSobel).
The function cvCanny finds the edges in the input image using the Canny algorithm and marks them in the output image edges. The smaller threshold threshold1 is used for edge linking; the larger one controls the initial segmentation of strong edges.
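
A minimal usage sketch (my addition, not from the manual; the file name is hypothetical):

IplImage* img = cvLoadImage( "lena.jpg", 0 ); /* load as 8-bit grayscale */
IplImage* edges = cvCreateImage( cvGetSize(img), 8, 1 );
cvCanny( img, edges, 50, 150, 3 ); /* threshold1=50 for linking, threshold2=150 for strong edges */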


--------------------------------------------------------------------------------

PreCornerDetect
Calculates a feature map for corner detection

void cvPreCornerDetect( const CvArr* image, CvArr* corners, int aperture_size=3 );


image
Input image.
corners
Image to store the corner candidates.
aperture_size
Aperture size of the Sobel operator (see cvSobel).
The function cvPreCornerDetect calculates the function Dx²·Dyy + Dy²·Dxx − 2·Dx·Dy·Dxy, where D? denotes a first image derivative and D?? a second image derivative. Corners are considered to be the local maxima of this function:

// assuming that the image is floating-point
IplImage* corners = cvCloneImage(image);
IplImage* dilated_corners = cvCloneImage(image);
IplImage* corner_mask = cvCreateImage( cvGetSize(image), 8, 1 );
cvPreCornerDetect( image, corners, 3 );
cvDilate( corners, dilated_corners, 0, 1 );
cvSubS( corners, dilated_corners, corners );
cvCmpS( corners, 0, corner_mask, CV_CMP_GE );
cvReleaseImage( &corners );
cvReleaseImage( &dilated_corners );


--------------------------------------------------------------------------------

CornerEigenValsAndVecs
Calculates the eigenvalues and eigenvectors of image blocks for corner detection

void cvCornerEigenValsAndVecs( const CvArr* image, CvArr* eigenvv,
int block_size, int aperture_size=3 );


image
Input image.
eigenvv
Image to store the results; must be 6 times wider than the input image.
block_size
Neighborhood size (see the discussion).
aperture_size
Aperture size of the Sobel operator (see cvSobel).
For every pixel, the function cvCornerEigenValsAndVecs considers the block_size × block_size neighborhood S(p) and calculates the covariance matrix of derivatives over that neighborhood:

    | sum over S(p) of (dI/dx)²       sum over S(p) of (dI/dx·dI/dy) |
M = |                                                                |
    | sum over S(p) of (dI/dx·dI/dy)  sum over S(p) of (dI/dy)²      |

It then calculates the eigenvalues and eigenvectors of the matrix and stores them into the output image in the form (λ1, λ2, x1, y1, x2, y2), where

λ1, λ2 - the eigenvalues of M (not sorted)
(x1, y1) - the eigenvector corresponding to λ1
(x2, y2) - the eigenvector corresponding to λ2



--------------------------------------------------------------------------------

CornerMinEigenVal
Calculates the minimal eigenvalue of gradient matrices for corner detection

void cvCornerMinEigenVal( const CvArr* image, CvArr* eigenval, int block_size, int aperture_size=3 );


image
Input image.
eigenval
Image to store the minimal eigenvalues; same size as the input image.
block_size
Neighborhood size (see the discussion of cvCornerEigenValsAndVecs).
aperture_size
Aperture size of the Sobel operator (see cvSobel). When the input image is floating-point, this parameter specifies the number of the fixed float filter used for differencing.
The function cvCornerMinEigenVal is similar to cvCornerEigenValsAndVecs, but it calculates and stores only the minimal eigenvalue of the derivative covariance matrix for every pixel, i.e. min(λ1, λ2) in terms of the previous function.


--------------------------------------------------------------------------------

FindCornerSubPix
Refines corner locations

void cvFindCornerSubPix( const CvArr* image, CvPoint2D32f* corners,
int count, CvSize win, CvSize zero_zone,
CvTermCriteria criteria );


image
Input image.
corners
Initial coordinates of the input corners; the refined coordinates are stored back here on output.
count
Number of corners.
win
Half size of the search window. For example, if win=(5,5), a 5*2+1 × 5*2+1 = 11 × 11 search window is used.
zero_zone
Half size of the dead zone in the middle of the search window, over which the summation is not performed. It is used to avoid possible singularities of the autocorrelation matrix. The value (-1,-1) indicates there is no such zone.
criteria
Criteria for terminating the iterative corner refinement: the iteration stops after a maximum number of iterations or once a required accuracy is reached; criteria may specify either or both.
The function cvFindCornerSubPix iterates to find the sub-pixel accurate locations of corners, or radial saddle points, as shown in the figure.



Sub-pixel accurate corner location is based on the observation that every vector from the center q to a point p located within a neighborhood of q is orthogonal to the image gradient at p, subject to image and measurement noise. Consider the expression:

ε_i = DI_{p_i}ᵀ · (q − p_i)

where DI_{p_i} is the image gradient at one of the points p_i in a neighborhood of q. The value of q is to be found such that ε_i is minimized. A system of equations may be set up with each ε_i set to zero:

sum_i( DI_{p_i} · DI_{p_i}ᵀ ) · q − sum_i( DI_{p_i} · DI_{p_i}ᵀ · p_i ) = 0

where the gradients are summed within a neighborhood ("search window") of q. Calling the first gradient term G and the second gradient term b gives:

q = G⁻¹ · b

The algorithm sets the center of the neighborhood window at this new center q and then iterates until the center stays within a set threshold.


--------------------------------------------------------------------------------

GoodFeaturesToTrack
Determines strong corners in an image

void cvGoodFeaturesToTrack( const CvArr* image, CvArr* eig_image, CvArr* temp_image,
CvPoint2D32f* corners, int* corner_count,
double quality_level, double min_distance,
const CvArr* mask=NULL );


image
The source image; 8-bit or 32-bit floating-point, single-channel.
eig_image
Temporary 32-bit floating-point image; the same size as image.
temp_image
Another temporary image; the same size and format as eig_image.
corners
Output parameter: the detected corners.
corner_count
Output parameter: the number of detected corners.
quality_level
Multiplier for the max/min eigenvalue; specifies the minimal accepted quality of image corners.
min_distance
Limit specifying the minimum possible distance between the returned corners; the Euclidean distance is used.
mask
Region of interest: the function selects points either in the specified region or in the whole image if mask is NULL.
The function cvGoodFeaturesToTrack finds corners with large eigenvalues in the image. It first calculates the minimal eigenvalue for every pixel of the source image with cvCornerMinEigenVal and stores the results in eig_image. Then it performs non-maxima suppression (only the local maxima in a 3x3 neighborhood remain). The next step rejects the corners whose minimal eigenvalue is less than quality_level·max(eig_image(x,y)). Finally, the function ensures that all the found corners are sufficiently distant from one another: the strongest corner is kept first, and each new corner is accepted only if its distance to the corners already kept is greater than min_distance.

2.
Image processing part of the OpenCV user manual (2): sampling, interpolation and geometric transforms (translation)
From http://blog.csdn.net/hunnish/archive/2004/09/03/93174.aspx

Sampling, Interpolation and Geometric Transforms
Translation: HUNNISH, assuredigit



--------------------------------------------------------------------------------


InitLineIterator
Initializes a line iterator

int cvInitLineIterator( const CvArr* image, CvPoint pt1, CvPoint pt2,
CvLineIterator* line_iterator, int connectivity=8 );


image
The image containing the line.
pt1
Starting point of the line segment.
pt2
Ending point of the line segment.
line_iterator
Pointer to the line iterator state structure.
connectivity
Connectivity of the scanned line: 4 or 8.
The function cvInitLineIterator initializes the line iterator and returns the number of pixels between the two end points. Both points must be inside the image. After the iterator has been initialized, all the points on the raster line connecting the two end points may be retrieved by successive calls of CV_NEXT_LINE_POINT. The points on the line are calculated one by one using the 4-connected or 8-connected Bresenham algorithm.

Example: using the line iterator to calculate the sum of pixel values along a colored line
CvScalar sum_line_pixels( IplImage* image, CvPoint pt1, CvPoint pt2 )
{
CvLineIterator iterator;
int blue_sum = 0, green_sum = 0, red_sum = 0;
int count = cvInitLineIterator( image, pt1, pt2, &iterator, 8 );

for( int i = 0; i < count; i++ ){
blue_sum += iterator.ptr[0];
green_sum += iterator.ptr[1];
red_sum += iterator.ptr[2];
CV_NEXT_LINE_POINT(iterator);

/* print the pixel coordinates: demonstrates how to calculate the coordinates */
{
int offset, x, y;
/* assume that ROI is not set, otherwise need to take it into account. */
offset = iterator.ptr - (uchar*)(image->imageData);
y = offset/image->widthStep;
x = (offset - y*image->widthStep)/(3*sizeof(uchar) /* size of pixel */);
printf("(%d,%d)\n", x, y );
}
}
return cvScalar( blue_sum, green_sum, red_sum );
}


--------------------------------------------------------------------------------

SampleLine
Reads a raster line into a buffer

int cvSampleLine( const CvArr* image, CvPoint pt1, CvPoint pt2,
void* buffer, int connectivity=8 );


image
The image containing the line.
pt1
Starting point.
pt2
Ending point.
buffer
Buffer to store the line points; it must be large enough to store max(|pt2.x−pt1.x|+1, |pt2.y−pt1.y|+1) points in the 8-connected case, and |pt2.x−pt1.x|+|pt2.y−pt1.y|+1 points in the 4-connected case.
connectivity
The line connectivity, 4 or 8.
The function cvSampleLine implements a particular application of line iterators: it reads all of the image points lying on the line between pt1 and pt2, including the end points, and stores them into the buffer.


--------------------------------------------------------------------------------

GetRectSubPix
Retrieves a pixel rectangle from an image with sub-pixel accuracy

void cvGetRectSubPix( const CvArr* src, CvArr* dst, CvPoint2D32f center );


src
Source image.
dst
Extracted rectangle.
center
Floating-point coordinates of the rectangle center within the source image; the center must be inside the image.
The function cvGetRectSubPix extracts a rectangle from the image src:

dst(x, y) = src(x + center.x - (width(dst)-1)*0.5, y + center.y - (height(dst)-1)*0.5)

where the values of pixels at non-integer coordinates are retrieved using bilinear interpolation. Every channel of a multi-channel image is processed independently. The center of the rectangle must be inside the image, but the rectangle itself may be partially outside; in that case, the replication border mode is used to obtain pixel values beyond the image boundaries. (Hunnish: the original wording here is puzzling.)


--------------------------------------------------------------------------------

GetQuadrangleSubPix
Retrieves a pixel quadrangle from an image with sub-pixel accuracy

void cvGetQuadrangleSubPix( const CvArr* src, CvArr* dst, const CvMat* map_matrix,
int fill_outliers=0, CvScalar fill_value=cvScalarAll(0) );


src
Source image.
dst
Extracted quadrangle.
map_matrix
The 3 × 2 transformation matrix [A|b] (see the discussion).
fill_outliers
Flag specifying how to handle pixels outside the source image: interpolate them using the replication border mode (fill_outliers=0) or set them all to a fixed value (fill_outliers=1).
fill_value
The fixed value for pixels outside the source image, used when fill_outliers=1.
The function cvGetQuadrangleSubPix extracts a quadrangle from the image src with sub-pixel accuracy and stores it in dst, computed as:

dst(x+width(dst)/2, y+height(dst)/2)= src( A11x+A12y+b1, A21x+A22y+b2),

where A and b are taken from map_matrix
| A11 A12 b1 |
map_matrix = | |
| A21 A22 b2 |

where the values of pixels at non-integer coordinates A·(x,y)ᵀ + b are retrieved using bilinear interpolation. Every channel of a multi-channel image is processed independently.

Example: using cvGetQuadrangleSubPix for image rotation
#include "cv.h"
#include "highgui.h"
#include "math.h"

int main( int argc, char** argv )
{
IplImage* src;
/* the first command line parameter must be image file name */
if( argc==2 && (src = cvLoadImage(argv[1], -1))!=0)
{
IplImage* dst = cvCloneImage( src );
int delta = 1;
int angle = 0;

cvNamedWindow( "src", 1 );
cvShowImage( "src", src );

for(;;)
{
float m[6];
double factor = (cos(angle*CV_PI/180.) + 1.1)*3;
CvMat M = cvMat( 2, 3, CV_32F, m );
int w = src->width;
int h = src->height;

m[0] = (float)(factor*cos(-angle*2*CV_PI/180.));
m[1] = (float)(factor*sin(-angle*2*CV_PI/180.));
m[2] = w*0.5f;
m[3] = -m[1];
m[4] = m[0];
m[5] = h*0.5f;

cvGetQuadrangleSubPix( src, dst, &M, 1, cvScalarAll(0));

cvNamedWindow( "dst", 1 );
cvShowImage( "dst", dst );

if( cvWaitKey(5) == 27 )
break;

angle = (angle + delta) % 360;
}
}
return 0;
}


--------------------------------------------------------------------------------

Resize
Resizes an image

void cvResize( const CvArr* src, CvArr* dst, int interpolation=CV_INTER_LINEAR );


src
Source image.
dst
Destination image.
interpolation
Interpolation method:
CV_INTER_NN - nearest-neighbor interpolation,
CV_INTER_LINEAR - bilinear interpolation (used by default),
CV_INTER_AREA - resampling using pixel area relation; it avoids moiré artifacts when the image is shrunk, and is similar to CV_INTER_NN when the image is enlarged,
CV_INTER_CUBIC - bicubic interpolation.
The function cvResize resizes the image src so that it fits exactly into dst. If ROI is set, the function treats the ROI as usual.


--------------------------------------------------------------------------------

WarpAffine
Applies an affine transformation to an image

void cvWarpAffine( const CvArr* src, CvArr* dst, const CvMat* map_matrix,
int flags=CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS,
CvScalar fillval=cvScalarAll(0) );


src
Source image.
dst
Destination image.
map_matrix
The 2×3 transformation matrix.
flags
A combination of the interpolation method and the following optional flags:
CV_WARP_FILL_OUTLIERS - fills all of the destination image pixels; if some of them correspond to outliers in the source image, they are set to fillval.
CV_WARP_INVERSE_MAP - indicates that map_matrix is the inverse transform from the destination image to the source image, so it can be used directly for pixel interpolation; otherwise, the function finds the inverse transform from map_matrix.
fillval
The value used to fill pixels outside the source image.
The function cvWarpAffine transforms the source image using the specified matrix:

dst(x',y') <- src(x,y)
where (x',y')ᵀ = map_matrix·(x,y,1)ᵀ if CV_WARP_INVERSE_MAP is not set,
and (x,y)ᵀ = map_matrix·(x',y',1)ᵀ otherwise.

The function is similar to cvGetQuadrangleSubPix, but not identical. cvWarpAffine requires the input and output images to have the same data type, has larger overhead (so it is not quite suitable for small images), and can leave part of the destination image unchanged. cvGetQuadrangleSubPix can extract quadrangles from 8-bit images into a floating-point buffer with sub-pixel accuracy, has smaller overhead, and always changes the whole content of the destination image.

To transform a sparse set of points, use the cvTransform function from cxcore instead.


--------------------------------------------------------------------------------

2DRotationMatrix
Calculates the affine matrix of a 2D rotation

CvMat* cv2DRotationMatrix( CvPoint2D32f center, double angle,
double scale, CvMat* map_matrix );


center
Center of the rotation in the source image.
angle
Rotation angle in degrees; positive values mean counter-clockwise rotation (the coordinate origin is assumed to be in the top-left corner).
scale
Isotropic scale factor.
map_matrix
Pointer to the destination 2×3 matrix.
The function cv2DRotationMatrix calculates the matrix:

[ α β | (1-α)*center.x - β*center.y ]
[ -β α | β*center.x + (1-α)*center.y ]

where α=scale*cos(angle), β=scale*sin(angle)

The transformation maps the rotation center to itself; if that is not the intent, the shift should be adjusted. (Hunnish: this sentence in the original is puzzling: "The transformation maps the rotation center to itself. If this is not the purpose, the shift should be adjusted.")
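
A sketch of the typical pairing with cvWarpAffine (my addition; src and dst are assumed to be already-allocated images of the same size): rotate 30° counter-clockwise about the image center without scaling.

float m[6];
CvMat map_matrix = cvMat( 2, 3, CV_32F, m );
CvPoint2D32f center = cvPoint2D32f( src->width*0.5f, src->height*0.5f );
cv2DRotationMatrix( center, 30.0, 1.0, &map_matrix ); /* fills m */
cvWarpAffine( src, dst, &map_matrix,
              CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS, cvScalarAll(0) );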


--------------------------------------------------------------------------------

WarpPerspective
Applies a perspective transformation to an image

void cvWarpPerspective( const CvArr* src, CvArr* dst, const CvMat* map_matrix,
int flags=CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS,
CvScalar fillval=cvScalarAll(0) );


src
Source image.
dst
Destination image.
map_matrix
The 3×3 transformation matrix.
flags
A combination of the interpolation method and the following optional flags:
CV_WARP_FILL_OUTLIERS - fills all of the destination image pixels; if some of them correspond to outliers in the source image, they are set to fillval.
CV_WARP_INVERSE_MAP - indicates that map_matrix is the inverse transform from the destination image to the source image, so it can be used directly for pixel interpolation; otherwise, the function finds the inverse transform from map_matrix.
fillval
The value used to fill pixels outside the source image.
The function cvWarpPerspective transforms the source image using the specified matrix:

dst(x',y') <- src(x,y)
where (t·x', t·y', t)ᵀ = map_matrix·(x,y,1)ᵀ if CV_WARP_INVERSE_MAP is not set,
and (t·x, t·y, t)ᵀ = map_matrix·(x',y',1)ᵀ otherwise.

To transform a sparse set of points, use the cvTransform function from cxcore instead.


--------------------------------------------------------------------------------

WarpPerspectiveQMatrix
Calculates the perspective transform matrix from 4 corresponding points

CvMat* cvWarpPerspectiveQMatrix( const CvPoint2D32f* src,
const CvPoint2D32f* dst,
CvMat* map_matrix );


src
Coordinates of the 4 quadrangle vertices in the source image.
dst
Coordinates of the 4 corresponding quadrangle vertices in the destination image.
map_matrix
Pointer to the destination 3×3 matrix.
The function cvWarpPerspectiveQMatrix calculates the matrix of the perspective transform such that:

(t_i·x'_i, t_i·y'_i, t_i)ᵀ = matrix·(x_i, y_i, 1)ᵀ

where dst(i) = (x'_i, y'_i), src(i) = (x_i, y_i), i = 0..3.

3.
Image processing part of the OpenCV user manual (3): morphological operations (translation)
From http://blog.csdn.net/hunnish/archive/2004/09/06/95532.aspx

Morphological Operations
HUNNISH's note:

This translation was made directly from the user manual of OpenCV Beta 4.0; the original file is /doc/ref/opencvref_cv.htm, which can be downloaded with the OpenCV project from SourceForge, or directly from assuredigit: http://www.assuredigit.com/incoming/sourcecode/opencv/chinese_docs/ref/opencvref_cv.htm.

There are surely mistakes in the translation, and some terms may be rendered inaccurately where the original meaning was not fully grasped; corrections are welcome. The purpose of translating these English reference manuals is to work with fellow OpenCV enthusiasts in China to raise the level of practical OpenCV use in computer vision, pattern recognition and image processing.


--------------------------------------------------------------------------------

CreateStructuringElementEx
Creates a structuring element

IplConvKernel* cvCreateStructuringElementEx( int cols, int rows, int anchor_x, int anchor_y,
int shape, int* values=NULL );


cols
Number of columns in the structuring element.
rows
Number of rows in the structuring element.
anchor_x
Relative horizontal offset of the anchor point.
anchor_y
Relative vertical offset of the anchor point.
shape
Shape of the structuring element; may be one of the following values:
CV_SHAPE_RECT, a rectangular element;
CV_SHAPE_CROSS, a cross-shaped element;
CV_SHAPE_ELLIPSE, an elliptic element;
CV_SHAPE_CUSTOM, a user-defined element. In this case the parameter values specifies the mask, that is, which neighborhood pixels must be considered.
values
Pointer to the structuring element data, a plane array representing a row-by-row scan of the element matrix. Non-zero values indicate points that belong to the element. If the pointer is NULL, all values are considered non-zero, i.e. the element is a rectangle. The parameter is considered only if the shape is CV_SHAPE_CUSTOM.
The function cvCreateStructuringElementEx allocates and fills the structure IplConvKernel, which can be used as a structuring element in morphological operations.


--------------------------------------------------------------------------------

ReleaseStructuringElement
Deletes a structuring element

void cvReleaseStructuringElement( IplConvKernel** element );


element
Pointer to the structuring element to be deleted.
The function cvReleaseStructuringElement releases the IplConvKernel structure. If *element is NULL, the function has no effect.


--------------------------------------------------------------------------------

Erode
Erodes an image using a structuring element

void cvErode( const CvArr* src, CvArr* dst, IplConvKernel* element=NULL, int iterations=1 );


src
Source image.
dst
Destination image.
element
Structuring element used for erosion; if NULL, a 3×3 rectangular element is used.
iterations
Number of times the erosion is applied.
The function cvErode erodes the source image using the specified structuring element, which determines the shape of the pixel neighborhood over which the minimum is taken:

dst=erode(src,element): dst(x,y)=min((x',y') in element))src(x+x',y+y')

The function supports in-place operation. Erosion can be applied several (iterations) times. For color images, each channel is processed independently.


--------------------------------------------------------------------------------

Dilate
Dilates an image using a structuring element

void cvDilate( const CvArr* src, CvArr* dst, IplConvKernel* element=NULL, int iterations=1 );


src
Source image.
dst
Destination image.
element
Structuring element used for dilation; if NULL, a 3×3 rectangular element is used.
iterations
Number of times the dilation is applied.
The function cvDilate dilates the source image using the specified structuring element, which determines the shape of the pixel neighborhood over which the maximum is taken:

dst=dilate(src,element): dst(x,y)=max((x',y') in element))src(x+x',y+y')

The function supports in-place operation. Dilation can be applied several (iterations) times. For color images, each channel is processed independently.


--------------------------------------------------------------------------------

MorphologyEx
Performs advanced morphological transformations

void cvMorphologyEx( const CvArr* src, CvArr* dst, CvArr* temp,
IplConvKernel* element, int operation, int iterations=1 );


src
Source image.
dst
Destination image.
temp
Temporary image, required in some cases.
element
Structuring element.
operation
Type of morphological operation:
CV_MOP_OPEN - opening
CV_MOP_CLOSE - closing
CV_MOP_GRADIENT - morphological gradient
CV_MOP_TOPHAT - "top hat"
CV_MOP_BLACKHAT - "black hat"

iterations
Number of times erosion and dilation are applied.
The function cvMorphologyEx performs advanced morphological transformations built on the basic erosion and dilation operations:

Opening:
dst=open(src,element)=dilate(erode(src,element),element)

Closing:
dst=close(src,element)=erode(dilate(src,element),element)

Morphological gradient:
dst=morph_grad(src,element)=dilate(src,element)-erode(src,element)

"Top hat":
dst=tophat(src,element)=src-open(src,element)

"Black hat":
dst=blackhat(src,element)=close(src,element)-src

The temporary image temp is required for the morphological gradient, and for the in-place variants of the "top hat" and "black hat" operations.
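
A short sketch of a typical call (my addition, values are arbitrary; src and dst are assumed to be allocated images of the same size and type): a morphological opening with a 5×5 elliptical element, which removes small bright specks.

IplConvKernel* se = cvCreateStructuringElementEx( 5, 5, 2, 2, CV_SHAPE_ELLIPSE, NULL );
IplImage* temp = cvCloneImage( src ); /* scratch image for cvMorphologyEx */
cvMorphologyEx( src, dst, temp, se, CV_MOP_OPEN, 1 );
cvReleaseStructuringElement( &se );
cvReleaseImage( &temp );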

4.
Image processing part of the OpenCV user manual (4): filters and color conversion (translation)
From http://blog.csdn.net/hunnish/archive/2004/09/06/95534.aspx

Filters and Color Conversion
HUNNISH's note:

As with part 3 above, this translation is based directly on the OpenCV Beta 4.0 user manual (/doc/ref/opencvref_cv.htm, available from the OpenCV project on SourceForge or directly from assuredigit: http://www.assuredigit.com/incoming/sourcecode/opencv/chinese_docs/ref/opencvref_cv.htm); corrections are welcome.


--------------------------------------------------------------------------------

Smooth
Smooths an image using one of several methods

void cvSmooth( const CvArr* src, CvArr* dst,
int smoothtype=CV_GAUSSIAN,
int param1=3, int param2=0, double param3=0 );


src
Source image.
dst
Destination image.
smoothtype
Smoothing method:
CV_BLUR_NO_SCALE (simple blur without scaling) - sums over a param1×param2 pixel neighborhood. If the neighborhood size varies, the integral image can be precomputed with cvIntegral.
CV_BLUR (simple blur) - sums over a param1×param2 pixel neighborhood and scales the result by 1/(param1·param2).
CV_GAUSSIAN (Gaussian blur) - convolves the image with a param1×param2 Gaussian kernel.
CV_MEDIAN (median blur) - finds the median over a param1×param1 neighborhood (i.e. the neighborhood is square).
CV_BILATERAL (bilateral filtering) - applies a bilateral 3x3 filter with color sigma=param1 and space sigma=param2. For bilateral filtering, see http://www.dai.ed.ac.uk/CVonline/LOCAL_COPIES/MANDUCHI1/Bilateral_Filtering.html
param1
First parameter of the smoothing operation.
param2
Second parameter of the smoothing operation. param2=0 corresponds to the simple scaled blur and the Gaussian blur.
param3
Gaussian sigma (standard deviation). If zero, it is calculated from the kernel size:
sigma = (n/2 - 1)*0.3 + 0.8, where n=param1 for the horizontal kernel,
n=param2 for the vertical kernel.

Using the standard sigma for small kernels (3×3 to 7×7) gives better speed. If param3 is not zero while param1 and param2 are zero, the kernel size is calculated from sigma (to provide an accurate enough operation).
The function cvSmooth smooths an image using any of the methods above. Each method has its own strengths and limitations.

Blur without scaling supports single-channel images only, in 8-bit, 16-bit, 32-bit and 32-bit floating-point formats.

Simple blur and Gaussian blur support 1- and 3-channel, 8-bit and 32-bit floating-point images, and both can process images in place.

Median and bilateral filtering work on 1- and 3-channel, 8-bit images, and cannot process images in place.


--------------------------------------------------------------------------------

Filter2D
Convolves an image with a kernel

void cvFilter2D( const CvArr* src, CvArr* dst,
const CvMat* kernel,
CvPoint anchor=cvPoint(-1,-1));
#define cvConvolve2D cvFilter2D


src
Source image.
dst
Destination image.
kernel
The convolution kernel, a single-channel floating-point matrix. To apply different kernels to different channels, split the image into separate color planes with cvSplit and process them individually.
anchor
The anchor of the kernel: the relative position, inside the kernel, of the point being filtered. The anchor should be inside the kernel; the default value (-1,-1) places it at the kernel center.
The function cvFilter2D applies a linear filter to the image and supports in-place operation. When the aperture is partially outside the image, the function interpolates the outlier pixel values from the nearest pixels inside the image.


--------------------------------------------------------------------------------

Integral
Calculates the integral image

void cvIntegral( const CvArr* image, CvArr* sum, CvArr* sqsum=NULL, CvArr* tilted_sum=NULL );


image
Source image, W×H, single-channel, 8-bit or floating-point (32f or 64f).
sum
The integral image, (W+1)×(H+1), single-channel, 32-bit integer or double-precision floating-point (64f).
sqsum
The integral image of squared pixel values, (W+1)×(H+1), single-channel, 32-bit integer or double-precision floating-point (64f).
tilted_sum
The integral image of the source rotated by 45 degrees, single-channel, 32-bit integer or double-precision floating-point (64f).
The function cvIntegral calculates one or more integral images:

sum(X,Y) = Σ_{x<X, y<Y} image(x,y)
sqsum(X,Y) = Σ_{x<X, y<Y} image(x,y)²
tilted_sum(X,Y) = Σ_{y<Y, |x−X|<y} image(x,y)

Using these integral images, the sum, mean or standard deviation over an arbitrary upright or 45-degree-rotated rectangular region can be calculated in constant time, for example:

Σ_{x1<=x<x2, y1<=y<y2} image(x,y) = sum(x2,y2) − sum(x1,y2) − sum(x2,y1) + sum(x1,y1)

which makes fast blurring or fast block correlation with a variable window size possible.


--------------------------------------------------------------------------------

CvtColor
Converts an image from one color space to another

void cvCvtColor( const CvArr* src, CvArr* dst, int code );


src
The source 8-bit or floating-point image.
dst
The destination 8-bit or floating-point image.
code
The color space conversion, specified by a CV_<src_color_space>2<dst_color_space> constant (see below).
The function cvCvtColor converts the input image from one color space to another. The function ignores the colorModel and channelSeq fields of the IplImage header, so the color space of the source image should be specified correctly, including the channel order: for RGB spaces, BGR means a 24-bit format laid out as B0 G0 R0 B1 G1 R1 ..., while RGB means a 24-bit format laid out as R0 G0 B0 R1 G1 B1 .... The function performs the following transforms:

Transforms within the RGB space, such as adding or removing the alpha channel, reversing the channel order, conversion to and from 16-bit RGB color (R5:G6:B5 or R5:G5:B5), and conversion to and from grayscale, using:
RGB[A]->Gray: Y=0.212671*R + 0.715160*G + 0.072169*B + 0*A
Gray->RGB[A]: R=Y G=Y B=Y A=0

All the possible conversions between other color spaces are listed below:


RGB<=>XYZ (CV_BGR2XYZ, CV_RGB2XYZ, CV_XYZ2BGR, CV_XYZ2RGB):
|X| |0.412411 0.357585 0.180454| |R|
|Y| = |0.212649 0.715169 0.072182|*|G|
|Z| |0.019332 0.119195 0.950390| |B|

|R| | 3.240479 -1.53715 -0.498535| |X|
|G| = |-0.969256 1.875991 0.041556|*|Y|
|B| | 0.055648 -0.204043 1.057311| |Z|


RGB<=>YCrCb (CV_BGR2YCrCb, CV_RGB2YCrCb, CV_YCrCb2BGR, CV_YCrCb2RGB)
Y=0.299*R + 0.587*G + 0.114*B
Cr=(R-Y)*0.713 + 128
Cb=(B-Y)*0.564 + 128

R=Y + 1.403*(Cr - 128)
G=Y - 0.344*(Cr - 128) - 0.714*(Cb - 128)
B=Y + 1.773*(Cb - 128)


RGB=>HSV (CV_BGR2HSV,CV_RGB2HSV)
V=max(R,G,B)
S=(V-min(R,G,B))*255/V if V!=0, 0 otherwise

(G - B)*60/S, if V=R
H= 180+(B - R)*60/S, if V=G
240+(R - G)*60/S, if V=B

if H<0 then H=H+360

The hue values calculated with the formula above vary from 0° to 360°; they are divided by 2 so that they fit into 8 bits.


RGB=>Lab (CV_BGR2Lab, CV_RGB2Lab)
|X| |0.433910 0.376220 0.189860| |R/255|
|Y| = |0.212649 0.715169 0.072182|*|G/255|
|Z| |0.017756 0.109478 0.872915| |B/255|

L = 116*Y^(1/3) for Y>0.008856
L = 903.3*Y for Y<=0.008856

a = 500*(f(X)-f(Y))
b = 200*(f(Y)-f(Z))
where f(t)=t^(1/3) for t>0.008856
f(t)=7.787*t+16/116 for t<=0.008856

The formulas above can be found at http://www.cica.indiana.edu/cica/faq/color_spaces/color.spaces.html

Bayer=>RGB (CV_BayerBG2BGR, CV_BayerGB2BGR, CV_BayerRG2BGR, CV_BayerGR2BGR,
CV_BayerBG2RGB, CV_BayerRG2BGR, CV_BayerGB2RGB, CV_BayerGR2BGR,
CV_BayerRG2RGB, CV_BayerBG2BGR, CV_BayerGR2RGB, CV_BayerGB2BGR)
The Bayer pattern is widely used in CCD and CMOS cameras. It allows a color image to be obtained from a single image plane, in which the R, G and B pixels are arranged as follows:

R G R G R
G B G B G
R G R G R
G B G B G
R G R G R
G B G B G



The output RGB components of a pixel are interpolated from 1, 2 or 4 neighbors of the pixel having the same color. There are several modifications of the above pattern that can be achieved by shifting the pattern one pixel left and/or one pixel up. The two letters C1 and C2 in the conversion constants CV_BayerC1C22{BGR|RGB} indicate the particular pattern type - these are components from the second row, second and third columns, respectively. For example, the above pattern has very popular "BG" type.


--------------------------------------------------------------------------------

Threshold
Applies a fixed-level threshold to array elements

void cvThreshold( const CvArr* src, CvArr* dst, double threshold,
double max_value, int threshold_type );


src
Source array (single-channel, 8-bit or 32-bit floating-point).
dst
Destination array; must be of the same type as src, or 8-bit.
threshold
The threshold value.
max_value
The maximum value, used with CV_THRESH_BINARY and CV_THRESH_BINARY_INV.
threshold_type
The thresholding type (see the discussion).
The function cvThreshold applies a fixed-level threshold to a single-channel array. It is typically used to obtain a binary image from a grayscale one (cvCmpS can also serve this purpose), or to remove noise, e.g. to filter out pixels with too small or too large values. The supported thresholding types are determined by threshold_type:

threshold_type=CV_THRESH_BINARY:
dst(x,y) = max_value, if src(x,y)>threshold
0, otherwise

threshold_type=CV_THRESH_BINARY_INV:
dst(x,y) = 0, if src(x,y)>threshold
max_value, otherwise

threshold_type=CV_THRESH_TRUNC:
dst(x,y) = threshold, if src(x,y)>threshold
src(x,y), otherwise

threshold_type=CV_THRESH_TOZERO:
dst(x,y) = src(x,y), if src(x,y)>threshold
0, otherwise

threshold_type=CV_THRESH_TOZERO_INV:
dst(x,y) = 0, if src(x,y)>threshold
src(x,y), otherwise

[Figure: graphical illustration of the thresholding types; not reproduced here.]

--------------------------------------------------------------------------------

AdaptiveThreshold
Applies an adaptive threshold to an array

void cvAdaptiveThreshold( const CvArr* src, CvArr* dst, double max_value,
int adaptive_method=CV_ADAPTIVE_THRESH_MEAN_C,
int threshold_type=CV_THRESH_BINARY,
int block_size=3, double param1=5 );


src
Source image.
dst
Destination image.
max_value
The maximum value, used with CV_THRESH_BINARY and CV_THRESH_BINARY_INV.
adaptive_method
The adaptive thresholding algorithm: CV_ADAPTIVE_THRESH_MEAN_C or CV_ADAPTIVE_THRESH_GAUSSIAN_C (see the discussion).
threshold_type
The thresholding type; must be one of
CV_THRESH_BINARY,
CV_THRESH_BINARY_INV
block_size
The size of the pixel neighborhood used to calculate the threshold: 3, 5, 7, ...
param1
A method-dependent parameter. For the methods CV_ADAPTIVE_THRESH_MEAN_C and CV_ADAPTIVE_THRESH_GAUSSIAN_C it is a constant subtracted from the mean or weighted mean (see the discussion); it may be negative.
The function cvAdaptiveThreshold transforms a grayscale image into a binary image according to the formulas:

threshold_type=CV_THRESH_BINARY:
dst(x,y) = max_value, if src(x,y)>T(x,y)
0, otherwise

threshold_type=CV_THRESH_BINARY_INV:
dst(x,y) = 0, if src(x,y)>T(x,y)
max_value, otherwise

where T(x,y) is a threshold calculated individually for each pixel:

For the method CV_ADAPTIVE_THRESH_MEAN_C, it is the mean of the block_size × block_size pixel neighborhood, minus param1.

For the method CV_ADAPTIVE_THRESH_GAUSSIAN_C, it is the Gaussian-weighted sum of the block_size × block_size pixel neighborhood, minus param1.
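
A minimal usage sketch (my addition; the file name is hypothetical): binarize a scanned page with the mean method, an 11×11 neighborhood, and an offset of 5.

IplImage* gray = cvLoadImage( "page.png", 0 ); /* force 8-bit grayscale */
IplImage* bin = cvCreateImage( cvGetSize(gray), 8, 1 );
cvAdaptiveThreshold( gray, bin, 255,
                     CV_ADAPTIVE_THRESH_MEAN_C,
                     CV_THRESH_BINARY, 11, 5 );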

5.
Conversion formulas between RGB and other color spaces
From http://blog.csdn.net/hunnish/archive/2004/09/01/91325.aspx



Conversion between color and grayscale images, RGB <-> GRAY:
RGB[A]->Gray: Y=0.212671*R + 0.715160*G + 0.072169*B
Gray->RGB[A]: R=Y G=Y B=Y A=0

All the other possible conversions between color spaces are listed below:


RGB<=>XYZ :
|X| |0.412411 0.357585 0.180454| |R|
|Y| = |0.212649 0.715169 0.072182|*|G|
|Z| |0.019332 0.119195 0.950390| |B|

|R| | 3.240479 -1.53715 -0.498535| |X|
|G| = |-0.969256 1.875991 0.041556|*|Y|
|B| | 0.055648 -0.204043 1.057311| |Z|
RGB<=>YCrCb
Y=0.299*R + 0.587*G + 0.114*B
Cr=(R-Y)*0.713 + 128
Cb=(B-Y)*0.564 + 128

R=Y + 1.403*(Cr - 128)
G=Y - 0.344*(Cr - 128) - 0.714*(Cb - 128)
B=Y + 1.773*(Cb - 128)


RGB=>HSV
V=max(R,G,B)
S=(V-min(R,G,B))*255/V if V!=0, 0 otherwise

(G - B)*60/S, if V=R
H= 180+(B - R)*60/S, if V=G
240+(R - G)*60/S, if V=B

if H<0 then H=H+360

The hue values calculated with the formula above vary from 0° to 360°; they are divided by 2 so that they fit into 8 bits.


RGB=>Lab
|X| |0.433910 0.376220 0.189860| |R/255|
|Y| = |0.212649 0.715169 0.072182|*|G/255|
|Z| |0.017756 0.109478 0.872915| |B/255|

L = 116*Y^(1/3) for Y>0.008856
L = 903.3*Y for Y<=0.008856

a = 500*(f(X)-f(Y))
b = 200*(f(Y)-f(Z))
where f(t)=t^(1/3) for t>0.008856
f(t)=7.787*t+16/116 for t<=0.008856
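
The RGB->Gray line translates directly into code; a small sketch (my addition, not from the original post):

/* Direct implementation of the RGB[A]->Gray formula above for 8-bit values;
 * the +0.5 rounds to the nearest integer. */
static unsigned char rgb2gray( unsigned char r, unsigned char g, unsigned char b )
{
    return (unsigned char)( 0.212671*r + 0.715160*g + 0.072169*b + 0.5 );
}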

Monday, September 12, 2005

 

Some notes on OpenCV - 2

1.
Optical flow demo
From http://ai.stanford.edu/~dstavens/cs223b/optical_flow_demo.cpp

/* --Sparse Optical Flow Demo Program--
* Written by David Stavens (dstavens@robotics.stanford.edu)
*/
#include <stdio.h>
#include <cv.h>
#include <highgui.h>
#include <math.h>

static const double pi = 3.14159265358979323846;

inline static double square(int a)
{
return a * a;
}

/* This is just an inline that allocates images. I did this to reduce clutter in the
* actual computer vision algorithmic code. Basically it allocates the requested image
* unless that image is already non-NULL. It always leaves a non-NULL image as-is even
* if that image's size, depth, and/or channels are different than the request.
*/
inline static void allocateOnDemand( IplImage **img, CvSize size, int depth, int channels )
{
if ( *img != NULL ) return;

*img = cvCreateImage( size, depth, channels );
if ( *img == NULL )
{
fprintf(stderr, "Error: Couldn't allocate image. Out of memory?\n");
exit(-1);
}
}

int main(void)
{
/* Create an object that decodes the input video stream. */
CvCapture *input_video = cvCaptureFromFile(
"C:\\Documents and Settings\\David Stavens\\Desktop\\223B-Demo\\optical_flow_input.avi"
);
if (input_video == NULL)
{
/* Either the video didn't exist OR it uses a codec OpenCV
* doesn't support.
*/
fprintf(stderr, "Error: Can't open video.\n");
return -1;
}

/* This is a hack. If we don't call this first then getting capture
* properties (below) won't work right. This is an OpenCV bug. We
* ignore the return value here. But it's actually a video frame.
*/
cvQueryFrame( input_video );

/* Read the video's frame size out of the AVI. */
CvSize frame_size;
frame_size.height =
(int) cvGetCaptureProperty( input_video, CV_CAP_PROP_FRAME_HEIGHT );
frame_size.width =
(int) cvGetCaptureProperty( input_video, CV_CAP_PROP_FRAME_WIDTH );

/* Determine the number of frames in the AVI. */
long number_of_frames;
/* Go to the end of the AVI (ie: the fraction is "1") */
cvSetCaptureProperty( input_video, CV_CAP_PROP_POS_AVI_RATIO, 1. );
/* Now that we're at the end, read the AVI position in frames */
number_of_frames = (int) cvGetCaptureProperty( input_video, CV_CAP_PROP_POS_FRAMES );
/* Return to the beginning */
cvSetCaptureProperty( input_video, CV_CAP_PROP_POS_FRAMES, 0. );

/* Create three windows called "Frame N", "Frame N+1", and "Optical Flow"
* for visualizing the output. Have those windows automatically change their
* size to match the output.
*/
cvNamedWindow("Optical Flow", CV_WINDOW_AUTOSIZE);

long current_frame = 0;
while(true)
{
static IplImage *frame = NULL, *frame1 = NULL, *frame1_1C = NULL, *frame2_1C = NULL, *eig_image = NULL, *temp_image = NULL, *pyramid1 = NULL, *pyramid2 = NULL;

/* Go to the frame we want. Important if multiple frames are queried in
* the loop which they of course are for optical flow. Note that the very
* first call to this is actually not needed. (Because the correct position
* is set outside the for() loop.)
*/
cvSetCaptureProperty( input_video, CV_CAP_PROP_POS_FRAMES, current_frame );

/* Get the next frame of the video.
* IMPORTANT! cvQueryFrame() always returns a pointer to the _same_
* memory location. So successive calls:
* frame1 = cvQueryFrame();
* frame2 = cvQueryFrame();
* frame3 = cvQueryFrame();
* will result in (frame1 == frame2 && frame2 == frame3) being true.
* The solution is to make a copy of the cvQueryFrame() output.
*/
frame = cvQueryFrame( input_video );
if (frame == NULL)
{
/* Why did we get a NULL frame? We shouldn't be at the end. */
fprintf(stderr, "Error: Hmm. The end came sooner than we thought.\n");
return -1;
}
/* Allocate another image if not already allocated.
* Image has ONE channel of color (ie: monochrome) with 8-bit "color" depth.
* This is the image format OpenCV algorithms actually operate on (mostly).
*/
allocateOnDemand( &frame1_1C, frame_size, IPL_DEPTH_8U, 1 );
/* Convert whatever the AVI image format is into OpenCV's preferred format.
* AND flip the image vertically. Flip is a shameless hack. OpenCV reads
* in AVIs upside-down by default. (No comment :-))
*/
cvConvertImage(frame, frame1_1C, CV_CVTIMG_FLIP);

/* We'll make a full color backup of this frame so that we can draw on it.
* (It's not the best idea to draw on the static memory space of cvQueryFrame().)
*/
allocateOnDemand( &frame1, frame_size, IPL_DEPTH_8U, 3 );
cvConvertImage(frame, frame1, CV_CVTIMG_FLIP);

/* Get the second frame of video. Same principles as the first. */
frame = cvQueryFrame( input_video );
if (frame == NULL)
{
fprintf(stderr, "Error: Hmm. The end came sooner than we thought.\n");
return -1;
}
allocateOnDemand( &frame2_1C, frame_size, IPL_DEPTH_8U, 1 );
cvConvertImage(frame, frame2_1C, CV_CVTIMG_FLIP);

/* Shi and Tomasi Feature Tracking! */

/* Preparation: Allocate the necessary storage. */
allocateOnDemand( &eig_image, frame_size, IPL_DEPTH_32F, 1 );
allocateOnDemand( &temp_image, frame_size, IPL_DEPTH_32F, 1 );

/* Preparation: This array will contain the features found in frame 1. */
CvPoint2D32f frame1_features[400];

/* Preparation: BEFORE the function call this variable is the array size
* (or the maximum number of features to find). AFTER the function call
* this variable is the number of features actually found.
*/
int number_of_features;

/* I'm hardcoding this at 400. But you should make this a #define so that you can
* change the number of features you use for an accuracy/speed tradeoff analysis.
*/
number_of_features = 400;

/* Actually run the Shi and Tomasi algorithm!!
* "frame1_1C" is the input image.
* "eig_image" and "temp_image" are just workspace for the algorithm.
* The first ".01" specifies the minimum quality of the features (based on the eigenvalues).
* The second ".01" specifies the minimum Euclidean distance between features.
* "NULL" means use the entire input image. You could point to a part of the image.
* WHEN THE ALGORITHM RETURNS:
* "frame1_features" will contain the feature points.
* "number_of_features" will be set to a value <= 400 indicating the number of feature points found.
*/
cvGoodFeaturesToTrack(frame1_1C, eig_image, temp_image, frame1_features, &number_of_features, .01, .01, NULL);

/* Pyramidal Lucas Kanade Optical Flow! */

/* This array will contain the locations of the points from frame 1 in frame 2. */
CvPoint2D32f frame2_features[400];

/* The i-th element of this array will be non-zero if and only if the i-th feature of
* frame 1 was found in frame 2.
*/
char optical_flow_found_feature[400];

/* The i-th element of this array is the error in the optical flow for the i-th feature
* of frame1 as found in frame 2. If the i-th feature was not found (see the array above)
* I think the i-th entry in this array is undefined.
*/
float optical_flow_feature_error[400];

/* This is the window size to use to avoid the aperture problem (see slide "Optical Flow: Overview"). */
CvSize optical_flow_window = cvSize(3,3);

/* This termination criteria tells the algorithm to stop when it has either done 20 iterations or when
* epsilon is better than .3. You can play with these parameters for speed vs. accuracy but these values
* work pretty well in many situations.
*/
CvTermCriteria optical_flow_termination_criteria
= cvTermCriteria( CV_TERMCRIT_ITER | CV_TERMCRIT_EPS, 20, .3 );

/* This is some workspace for the algorithm.
* (The algorithm actually carves the image into pyramids of different resolutions.)
*/
allocateOnDemand( &pyramid1, frame_size, IPL_DEPTH_8U, 1 );
allocateOnDemand( &pyramid2, frame_size, IPL_DEPTH_8U, 1 );

/* Actually run Pyramidal Lucas Kanade Optical Flow!!
* "frame1_1C" is the first frame with the known features.
* "frame2_1C" is the second frame where we want to find the first frame's features.
* "pyramid1" and "pyramid2" are workspace for the algorithm.
* "frame1_features" are the features from the first frame.
* "frame2_features" is the (outputted) locations of those features in the second frame.
* "number_of_features" is the number of features in the frame1_features array.
* "optical_flow_window" is the size of the window to use to avoid the aperture problem.
* "5" is the maximum number of pyramids to use. 0 would be just one level.
* "optical_flow_found_feature" is as described above (non-zero iff feature found by the flow).
* "optical_flow_feature_error" is as described above (error in the flow for this feature).
* "optical_flow_termination_criteria" is as described above (how long the algorithm should look).
* "0" means disable enhancements. (For example, the second aray isn't pre-initialized with guesses.)
*/
cvCalcOpticalFlowPyrLK(frame1_1C, frame2_1C, pyramid1, pyramid2, frame1_features, frame2_features, number_of_features, optical_flow_window, 5, optical_flow_found_feature, optical_flow_feature_error, optical_flow_termination_criteria, 0 );

/* For fun (and debugging :)), let's draw the flow field. */
for(int i = 0; i < number_of_features; i++)
{
/* If Pyramidal Lucas Kanade didn't really find the feature, skip it. */
if ( optical_flow_found_feature[i] == 0 ) continue;

int line_thickness; line_thickness = 1;
/* CV_RGB(red, green, blue) is the red, green, and blue components
* of the color you want, each out of 255.
*/
CvScalar line_color; line_color = CV_RGB(255,0,0);

/* Let's make the flow field look nice with arrows. */

/* The arrows will be a bit too short for a nice visualization because of the high framerate
* (ie: there's not much motion between the frames). So let's lengthen them by a factor of 3.
*/
CvPoint p,q;
p.x = (int) frame1_features[i].x;
p.y = (int) frame1_features[i].y;
q.x = (int) frame2_features[i].x;
q.y = (int) frame2_features[i].y;

double angle; angle = atan2( (double) p.y - q.y, (double) p.x - q.x );
double hypotenuse; hypotenuse = sqrt( square(p.y - q.y) + square(p.x - q.x) );

/* Here we lengthen the arrow by a factor of three. */
q.x = (int) (p.x - 3 * hypotenuse * cos(angle));
q.y = (int) (p.y - 3 * hypotenuse * sin(angle));

/* Now we draw the main line of the arrow. */
/* "frame1" is the frame to draw on.
* "p" is the point where the line begins.
* "q" is the point where the line stops.
* "CV_AA" means antialiased drawing.
* "0" means no fractional bits in the center cooridinate or radius.
*/
cvLine( frame1, p, q, line_color, line_thickness, CV_AA, 0 );
/* Now draw the tips of the arrow. I do some scaling so that the
* tips look proportional to the main line of the arrow.
*/
p.x = (int) (q.x + 9 * cos(angle + pi / 4));
p.y = (int) (q.y + 9 * sin(angle + pi / 4));
cvLine( frame1, p, q, line_color, line_thickness, CV_AA, 0 );
p.x = (int) (q.x + 9 * cos(angle - pi / 4));
p.y = (int) (q.y + 9 * sin(angle - pi / 4));
cvLine( frame1, p, q, line_color, line_thickness, CV_AA, 0 );
}
/* Now display the image we drew on. Recall that "Optical Flow" is the name of
* the window we created above.
*/
cvShowImage("Optical Flow", frame1);
/* And wait for the user to press a key (so the user has time to look at the image).
* If the argument is 0 then it waits forever otherwise it waits that number of milliseconds.
* The return value is the key the user pressed.
*/
int key_pressed;
key_pressed = cvWaitKey(0);

/* If the user presses "b" or "B", go back one frame.
* Otherwise go forward one frame.
*/
if (key_pressed == 'b' || key_pressed == 'B') current_frame--;
else current_frame++;
/* Don't run past the front/end of the AVI. */
if (current_frame < 0) current_frame = 0;
if (current_frame >= number_of_frames - 1) current_frame = number_of_frames - 2;
}
}

2.
Source code for corner detection
From http://blog.csdn.net/hunnish/archive/2004/08/31/90032.aspx

This is based on source code provided by Ruadhan, with some modifications, and detects the corners in an image. Environment: OpenCV Beta 4, compiled and run under VC6.

Executable download:

http://www.assuredigit.com/program/corner.exe

==========
#include <stdio.h>
#include "cv.h"
#include "highgui.h"
#define max_corners 100

int main( int argc, char** argv )
{
int cornerCount=max_corners;
CvPoint2D32f corners[max_corners];
double qualityLevel = 0.05;
double minDistance = 5;
IplImage *srcImage = 0, *grayImage = 0, *corners1 = 0, *corners2 = 0;
int i;
CvScalar color = CV_RGB(255,0,0);
char* filename = argc == 2 ? argv[1] : (char*)"..//..//c//pic3.png";

cvNamedWindow( "image", 1 ); // create HighGUI window with name "image"

//Load the image to be processed
srcImage = cvLoadImage(filename,1);
if( srcImage == NULL )
{
fprintf(stderr, "Error: could not load image %s\n", filename);
return -1;
}

grayImage = cvCreateImage(cvGetSize(srcImage), IPL_DEPTH_8U, 1);

//convert the source image to grayscale
cvCvtColor(srcImage, grayImage, CV_BGR2GRAY);

//create empty 32-bit float images of the same size as workspace for the algorithm
corners1= cvCreateImage(cvGetSize(srcImage), IPL_DEPTH_32F, 1);
corners2= cvCreateImage(cvGetSize(srcImage),IPL_DEPTH_32F, 1);

cvGoodFeaturesToTrack (grayImage, corners1, corners2, corners,
&cornerCount, qualityLevel, minDistance, 0);

printf("num corners found: %d\n", cornerCount);

// draw a circle at each corner location found in the source image
if(cornerCount>0)
{
for (i=0; i<cornerCount; i++)
{
cvCircle(srcImage, cvPoint((int)(corners[i].x), (int)(corners[i].y)), 6,
color, 2, CV_AA, 0);
}
}

cvShowImage( "image", srcImage );

cvReleaseImage(&srcImage);
cvReleaseImage(&grayImage);
cvReleaseImage(&corners1);
cvReleaseImage(&corners2);

cvWaitKey(0); // wait for a key press
return 0;
}

3.
Edge Detection

Source code for edge detection (requires the OpenCV library)
Below is C/C++ source code for image edge detection using the CANNY operator; it compiles under OpenCV Beta 4.0 and VC6.0. For how to use the OpenCV library and related questions, see:

http://forum.assuredigit.com/display_topic_threads.asp?ForumID=11&TopicID=3471

=========

Program begins

=========

#ifdef _CH_
#pragma package <opencv>
#endif

#ifndef _EiC
#include "cv.h"
#include "highgui.h"
#endif

char wndname[] = "Edge";
char tbarname[] = "Threshold";
int edge_thresh = 1;

IplImage *image = 0, *cedge = 0, *gray = 0, *edge = 0;

// Trackbar callback function
void on_trackbar(int h)
{
cvSmooth( gray, edge, CV_BLUR, 3, 3, 0 );
cvNot( gray, edge );

// Run edge detection on the grayscale image
cvCanny(gray, edge, (float)edge_thresh, (float)edge_thresh*3, 3);
cvZero( cedge );
// copy edge points
cvCopy( image, cedge, edge );
// Display the image
cvShowImage(wndname, cedge);
}

int main( int argc, char** argv )
{
char* filename = argc == 2 ? argv[1] : (char*)"fruits.jpg";

if( (image = cvLoadImage( filename, 1)) == 0 )
return -1;

// Create the output image
cedge = cvCreateImage(cvSize(image->width,image->height), IPL_DEPTH_8U, 3);

// Convert the color image to grayscale
gray = cvCreateImage(cvSize(image->width,image->height), IPL_DEPTH_8U, 1);
edge = cvCreateImage(cvSize(image->width,image->height), IPL_DEPTH_8U, 1);
cvCvtColor(image, gray, CV_BGR2GRAY);

// Create a window
cvNamedWindow(wndname, 1);

// create a toolbar
cvCreateTrackbar(tbarname, wndname, &edge_thresh, 100, on_trackbar);

// Show the image
on_trackbar(0);

// Wait for a key stroke; the same function arranges events processing
cvWaitKey(0);
cvReleaseImage(&image);
cvReleaseImage(&gray);
cvReleaseImage(&edge);
cvDestroyWindow(wndname);

return 0;
}

#ifdef _EiC
main(1,"edge.c");
#endif

4.
The CamShift algorithm, an OpenCV implementation
From http://blog.csdn.net/houdy/archive/2004/11/10/175739.aspx
http://blog.csdn.net/houdy/archive/2004/11/10/175844.aspx
http://blog.csdn.net/houdy/archive/2004/11/23/191828.aspx

1--Back Projection

The CamShift algorithm, short for "Continuously Adaptive Mean-Shift", is a motion tracking algorithm. It tracks a moving object in a video mainly through the object's color information. I break the algorithm into three parts to make it easier to understand:
1) computing the Back Projection
2) the Mean Shift algorithm
3) the CamShift algorithm
This part focuses on Back Projection; the other two are discussed in the follow-up articles.

Back Projection
The Back Projection is computed in two steps:
1. Compute the color histogram of the tracked target. Among the common color spaces, only the H component of the HSI space (or of HSI-like spaces) carries the color information, so in practice we first convert from the source color space to HSI and then compute a 1D histogram of the H component.
2. Using that histogram, convert the original image into a color-probability-distribution image; this step is what is called "Back Projection".
Among OpenCV's histogram functions there is one for Back Projection, with the prototype:
void cvCalcBackProject(IplImage** img, CvArr* backproject, const CvHistogram* hist);
It takes three parameters:
1. IplImage** img: the original image; input.
2. CvArr* backproject: the Back Projection result; output.
3. const CvHistogram* hist: the histogram; input.

Below is the OpenCV code for computing the Back Projection.
1. Prepare an image that contains only the tracked target, convert it to HSV space, and extract the H component:
IplImage* target=cvLoadImage("target.bmp",-1); //load the image
IplImage* target_hsv=cvCreateImage( cvGetSize(target), IPL_DEPTH_8U, 3 );
IplImage* target_hue=cvCreateImage( cvGetSize(target), IPL_DEPTH_8U, 1 ); //one channel, for H only
cvCvtColor(target,target_hsv,CV_BGR2HSV); //convert to HSV space
cvSplit( target_hsv, target_hue, NULL, NULL, NULL ); //extract the H component
2. Compute the 1D histogram of the H component:
int hist_size[]={255}; //quantize the H component to [0,255]
float range[]={0,360}; //the H component nominally ranges over [0,360)
float* ranges[]={ range };
CvHistogram* hist=cvCreateHist(1, hist_size, CV_HIST_ARRAY, ranges, 1);
cvCalcHist(&target_hue, hist, 0, NULL);
Here we need to consider the range of the H component: it nominally lies in [0,360), which cannot be represented in a single byte, so H has to be quantized appropriately; here we quantize it to [0,255]. (Note that for 8-bit images, OpenCV's CV_BGR2HSV conversion already stores H/2, i.e. values in [0,180), which is why the camshift demo in item 5 below uses the range {0,180}.)
3. Compute the Back Projection:
IplImage* rawImage;
//----------------------------------------------
//get from video frame, unsigned byte, one channel
//----------------------------------------------
IplImage* result=cvCreateImage(cvGetSize(rawImage),IPL_DEPTH_8U,1);
cvCalcBackProject(&rawImage,result,hist);
4. The result: "result" is the Back Projection image we need.

2--Mean Shift Algorithm
This is the second part of the CamShift-in-OpenCV series; this time the focus is the Mean Shift algorithm.
Before discussing Mean Shift itself, consider how to compute the mass center of a region in a 2D probability-distribution image. The mass center can be computed from the following moments:
1. The zeroth-order moment of the region:
for(int i=0;i<height;i++)
for(int j=0;j<width;j++)
M00+=I(i,j);
2. The first-order moments of the region:
for(int i=0;i<height;i++)
for(int j=0;j<width;j++)
{
M10+=i*I(i,j);
M01+=j*I(i,j);
}
3. The mass center is then:
Xc=M10/M00; Yc=M01/M00
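
To make this concrete, here is a minimal C sketch of the same computation for a rectangular region of an 8-bit, single-channel probability image (the function name computeMassCenter and its interface are mine, for illustration only; this is not an OpenCV function):

/* Mass center of region "win" in an 8-bit, single-channel probability image.
 * i runs over rows and j over columns, matching the formulas above, so
 * Xc = M10/M00 is the row coordinate and Yc = M01/M00 the column coordinate.
 */
static CvPoint2D32f computeMassCenter( const IplImage* prob, CvRect win )
{
    double M00 = 0., M10 = 0., M01 = 0.;
    int i, j;
    for( i = win.y; i < win.y + win.height; i++ )
    {
        const uchar* row = (const uchar*)(prob->imageData + i*prob->widthStep);
        for( j = win.x; j < win.x + win.width; j++ )
        {
            double I = row[j];
            M00 += I;          /* zeroth-order moment  */
            M10 += i*I;        /* first-order moments  */
            M01 += j*I;
        }
    }
    if( M00 == 0. ) /* empty (all-zero) window: fall back to the geometric center */
        return cvPoint2D32f( win.x + win.width*0.5, win.y + win.height*0.5 );
    return cvPoint2D32f( M10/M00, M01/M00 );
}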
Next come the concrete steps of the Mean Shift algorithm, which can be divided into four steps:
1. Choose the size and initial position of the window.
2. Compute the mass center of the current window.
3. Move the center of the window to the mass center.
4. Repeat steps 2 and 3 until the window center "converges", i.e. until each move of the window is smaller than a given threshold.

OpenCV provides a function for the Mean Shift algorithm, with the prototype:
int cvMeanShift(IplImage* imgprob, CvRect windowIn,
CvTermCriteria criteria, CvConnectedComp* out);

The required parameters are:
1. IplImage* imgprob: the 2D probability-distribution image; input.
2. CvRect windowIn: the initial window; input.
3. CvTermCriteria criteria: the criterion for stopping the iteration; input.
4. CvConnectedComp* out: the search result; output.
(Note: constructing a CvTermCriteria takes three arguments: the type, the maximum number of iterations, and a threshold. For example: criteria=cvTermCriteria(CV_TERMCRIT_ITER|CV_TERMCRIT_EPS,10,0.1).)

The return value:
1. int: the number of iterations performed.
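
A minimal usage sketch (the variable names are mine, and backproject is assumed to be the single-channel probability image produced by the Back Projection step from part 1):

CvRect search_window = cvRect( 100, 100, 60, 60 );   /* example initial window */
CvConnectedComp comp;
CvTermCriteria criteria = cvTermCriteria( CV_TERMCRIT_ITER|CV_TERMCRIT_EPS, 10, 0.1 );
int iterations = cvMeanShift( backproject, search_window, criteria, &comp );
printf( "Mean Shift stopped after %d iterations; window: (%d,%d,%d,%d)\n",
        iterations, comp.rect.x, comp.rect.y, comp.rect.width, comp.rect.height );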

3--CamShift Algorithm

1. Principle
Once Mean Shift is understood, we extend it to a continuous image sequence (usually a video sequence), which yields the CamShift algorithm. CamShift is short for "Continuously Adaptive Mean-SHIFT". Its basic idea is to run Mean Shift on every frame of the video and to use the result from the previous frame (i.e. the center and size of the Search Window) as the initial Search Window for Mean Shift on the next frame; iterating this procedure tracks the target. The algorithm has five steps:
Step 1: Set the search region to the whole image.
Step 2: Initialize the size and position of the Search Window.
Step 3: Compute the color probability distribution inside a region slightly larger than the Search Window.
Step 4: Run Mean Shift to obtain the new position and size of the Search Window.
Step 5: In the next video frame, initialize the position and size of the Search Window with the values obtained in Step 4. Jump back to Step 3 and continue.

2. Implementation
OpenCV provides a function implementing the CamShift algorithm, with the prototype:
cvCamShift(IplImage* imgprob, CvRect windowIn,
CvTermCriteria criteria,
CvConnectedComp* out, CvBox2D* box=0);
where:
imgprob: the color probability distribution image.
windowIn: the initial Search Window.
criteria: the criterion used to decide when the search stops.
out: holds the result, including the position and area of the new Search Window.
box: the minimal rectangle enclosing the tracked object.

Notes:
1. The OpenCV 4.0 beta directory contains a CamShift example. Unfortunately, tracking in that example is semi-automatic, i.e. the target has to be selected by hand. I am trying to make the target tracking fully automatic and hope to exchange ideas on this with everyone.
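
To make the frame-to-frame iteration concrete, here is a minimal tracking-loop sketch; the helper get_next_backprojection() is hypothetical and stands in for grabbing the next frame and computing its color probability image as in part 1 (the complete, runnable demo follows in item 5):

CvRect track_window = initial_selection;   /* Step 2: initial Search Window (assumed given) */
CvConnectedComp track_comp;
CvBox2D track_box;
for(;;)
{
    /* Step 3: probability image of the current frame (hypothetical helper) */
    IplImage* backproject = get_next_backprojection();
    if( !backproject )
        break;
    /* Step 4: run the Mean Shift iteration on this frame */
    cvCamShift( backproject, track_window,
                cvTermCriteria( CV_TERMCRIT_EPS | CV_TERMCRIT_ITER, 10, 1 ),
                &track_comp, &track_box );
    /* Step 5: the resulting window seeds the search in the next frame */
    track_window = track_comp.rect;
}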

5.
Source code for moving-target tracking and detection (the CAMSHIFT algorithm)
From http://blog.csdn.net/hunnish/archive/2004/09/07/97049.aspx

C/C++ source code for fast tracking and detection of a moving target using the CAMSHIFT algorithm; OpenCV Beta 4.0 ships this example among its samples. A brief description of the algorithm (in English):

This application demonstrates a fast, simple color tracking algorithm that can be used to track faces and hands. The CAMSHIFT algorithm is a modification of the Meanshift algorithm, which is a robust statistical method of finding the mode (peak) of a probability distribution. Both the CAMSHIFT and Meanshift algorithms exist in the library. While it is a very fast and simple method of tracking, because CAMSHIFT tracks the center and size of the probability distribution of an object, it is only as good as the probability distribution that you produce for the object. Typically the probability distribution is derived from color via a histogram, although it could be produced from correlation or recognition scores, or bolstered by frame differencing, motion detection schemes, or joint probabilities of different colors/motions, etc.

In this application, we use only the most simplistic approach: A 1-D Hue histogram is sampled from the object in an HSV color space version of the image. To produce the probability image to track, histogram "back projection" (we replace image pixels by their histogram hue value) is used.

For details of the algorithm, see the paper:

http://www.assuredigit.com/incoming/camshift.pdf

For how to use the OpenCV Beta 4.0 library and related questions, see:

http://forum.assuredigit.com/display_topic_threads.asp?ForumID=11&TopicID=3471

Executable download:

http://www.assuredigit.com/product_tech/Demo_Download_files/camshiftdemo.exe

The executable was compiled under VC6.0 and is a stand-alone program that does not require the OpenCV DLLs. Before running it, connect a USB camera; you can then select the target to track with the mouse.

=====

#ifdef _CH_
#pragma package <opencv>
#endif

#ifndef _EiC
#include "cv.h"
#include "highgui.h"
#include <stdio.h>
#include <ctype.h>
#endif

IplImage *image = 0, *hsv = 0, *hue = 0, *mask = 0, *backproject = 0, *histimg = 0;
CvHistogram *hist = 0;

int backproject_mode = 0;
int select_object = 0;
int track_object = 0;
int show_hist = 1;
CvPoint origin;
CvRect selection;
CvRect track_window;
CvBox2D track_box; // region box returned by the tracker, with rotation angle
CvConnectedComp track_comp;
int hdims = 48; // number of histogram bins; more bins means finer resolution
float hranges_arr[] = {0,180};
float* hranges = hranges_arr;
int vmin = 10, vmax = 256, smin = 30;

void on_mouse( int event, int x, int y, int flags )
{
if( !image )
return;

if( image->origin )
y = image->height - y;

if( select_object )
{
selection.x = MIN(x,origin.x);
selection.y = MIN(y,origin.y);
selection.width = selection.x + CV_IABS(x - origin.x);
selection.height = selection.y + CV_IABS(y - origin.y);

selection.x = MAX( selection.x, 0 );
selection.y = MAX( selection.y, 0 );
selection.width = MIN( selection.width, image->width );
selection.height = MIN( selection.height, image->height );
selection.width -= selection.x;
selection.height -= selection.y;

}

switch( event )
{
case CV_EVENT_LBUTTONDOWN:
origin = cvPoint(x,y);
selection = cvRect(x,y,0,0);
select_object = 1;
break;
case CV_EVENT_LBUTTONUP:
select_object = 0;
if( selection.width > 0 && selection.height > 0 )
track_object = -1;
#ifdef _DEBUG
printf("\n # 鼠标的选择区域:");
printf("\n X = %d, Y = %d, Width = %d, Height = %d",
selection.x, selection.y, selection.width, selection.height);
#endif
break;
}
}


CvScalar hsv2rgb( float hue )
{
int rgb[3], p, sector;
static const int sector_data[][3]=
{{0,2,1}, {1,2,0}, {1,0,2}, {2,0,1}, {2,1,0}, {0,1,2}};
hue *= 0.033333333333333333333333333333333f;
sector = cvFloor(hue);
p = cvRound(255*(hue - sector));
p ^= sector & 1 ? 255 : 0;

rgb[sector_data[sector][0]] = 255;
rgb[sector_data[sector][1]] = 0;
rgb[sector_data[sector][2]] = p;

#ifdef _DEBUG
printf("\n # Convert HSV to RGB:");
printf("\n HUE = %f", hue);
printf("\n R = %d, G = %d, B = %d", rgb[0],rgb[1],rgb[2]);
#endif

return cvScalar(rgb[2], rgb[1], rgb[0],0);
}

int main( int argc, char** argv )
{
CvCapture* capture = 0;
IplImage* frame = 0;

if( argc == 1 || (argc == 2 && strlen(argv[1]) == 1 && isdigit(argv[1][0])))
capture = cvCaptureFromCAM( argc == 2 ? argv[1][0] - '0' : 0 );
else if( argc == 2 )
capture = cvCaptureFromAVI( argv[1] );

if( !capture )
{
fprintf(stderr,"Could not initialize capturing...\n");
return -1;
}

printf( "Hot keys: \n"
"\tESC - quit the program\n"
"\tc - stop the tracking\n"
"\tb - switch to/from backprojection view\n"
"\th - show/hide object histogram\n"
"To initialize tracking, select the object with mouse\n" );

cvNamedWindow( "Histogram", 1 );
cvNamedWindow( "CamShiftDemo", 1 );
cvSetMouseCallback( "CamShiftDemo", on_mouse ); // register the custom mouse callback
cvCreateTrackbar( "Vmin", "CamShiftDemo", &vmin, 256, 0 );
cvCreateTrackbar( "Vmax", "CamShiftDemo", &vmax, 256, 0 );
cvCreateTrackbar( "Smin", "CamShiftDemo", &smin, 256, 0 );

for(;;)
{
int i, bin_w, c;

frame = cvQueryFrame( capture );
if( !frame )
break;

if( !image )
{
/* allocate all the buffers */
image = cvCreateImage( cvGetSize(frame), 8, 3 );
image->origin = frame->origin;
hsv = cvCreateImage( cvGetSize(frame), 8, 3 );
hue = cvCreateImage( cvGetSize(frame), 8, 1 );
mask = cvCreateImage( cvGetSize(frame), 8, 1 );
backproject = cvCreateImage( cvGetSize(frame), 8, 1 );
hist = cvCreateHist( 1, &hdims, CV_HIST_ARRAY, &hranges, 1 ); // allocate the histogram
histimg = cvCreateImage( cvSize(320,200), 8, 3 );
cvZero( histimg );
}

cvCopy( frame, image, 0 );
cvCvtColor( image, hsv, CV_BGR2HSV ); // color-space conversion, BGR to HSV

if( track_object )
{
int _vmin = vmin, _vmax = vmax;

cvInRangeS( hsv, cvScalar(0,smin,MIN(_vmin,_vmax),0),
cvScalar(180,256,MAX(_vmin,_vmax),0), mask ); // build a binary mask
cvSplit( hsv, hue, 0, 0, 0 ); // extract only the HUE plane

if( track_object < 0 )
{
float max_val = 0.f;
cvSetImageROI( hue, selection ); // restrict hue to the selected region
cvSetImageROI( mask, selection ); // restrict mask to the selected region
cvCalcHist( &hue, hist, 0, mask ); // compute the histogram
cvGetMinMaxHistValue( hist, 0, &max_val, 0, 0 ); // find only the max bin value
cvConvertScale( hist->bins, hist->bins, max_val ? 255. / max_val : 0., 0 ); // scale bins into [0,255]
cvResetImageROI( hue ); // remove ROI
cvResetImageROI( mask );
track_window = selection;
track_object = 1;

cvZero( histimg );
bin_w = histimg->width / hdims; // hdims: number of bins, so bin_w is the width of each bar

// draw the histogram
for( i = 0; i < hdims; i++ )
{
int val = cvRound( cvGetReal1D(hist->bins,i)*histimg->height/255 );
CvScalar color = hsv2rgb(i*180.f/hdims);
cvRectangle( histimg, cvPoint(i*bin_w,histimg->height),
cvPoint((i+1)*bin_w,histimg->height - val),
color, -1, 8, 0 );
}
}

cvCalcBackProject( &hue, backproject, hist ); // compute the back projection
cvAnd( backproject, mask, backproject, 0 );

// run the CAMSHIFT algorithm
cvCamShift( backproject, track_window,
cvTermCriteria( CV_TERMCRIT_EPS | CV_TERMCRIT_ITER, 10, 1 ),
&track_comp, &track_box );
track_window = track_comp.rect;

if( backproject_mode )
cvCvtColor( backproject, image, CV_GRAY2BGR ); // display the grayscale backprojection instead
if( image->origin )
track_box.angle = -track_box.angle;
cvEllipseBox( image, track_box, CV_RGB(255,0,0), 3, CV_AA, 0 );
}

if( select_object && selection.width > 0 && selection.height > 0 )
{
cvSetImageROI( image, selection );
cvXorS( image, cvScalarAll(255), image, 0 );
cvResetImageROI( image );
}

cvShowImage( "CamShiftDemo", image );
cvShowImage( "Histogram", histimg );

c = cvWaitKey(10);
if( c == 27 )
break; // exit from for-loop
switch( c )
{
case 'b':
backproject_mode ^= 1;
break;
case 'c':
track_object = 0;
cvZero( histimg );
break;
case 'h':
show_hist ^= 1;
if( !show_hist )
cvDestroyWindow( "Histogram" );
else
cvNamedWindow( "Histogram", 1 );
break;
default:
;
}
}

cvReleaseCapture( &capture );
cvDestroyWindow("CamShiftDemo");

return 0;
}

#ifdef _EiC
main(1,"camshiftdemo.c");
#endif
