Cross-media retrieval method based on visual features and semantic features

Abstract

Addressing the intricate relationships among the massive heterogeneous data on the Internet, the invention discloses a cross-media retrieval method based on visual features and semantic features. The method mainly comprises the following steps: 1, crawling data from the target data sources with a custom-extended distributed web crawler; 2, writing a separate template for each data source, performing template-based information extraction on the web pages, parsing and denoising the data, and storing it in a database; 3, extracting feature values from the images and building an index, and building a semantic association graph; 4, classifying the content using an SVM (Support Vector Machine) and a previously trained model; 5, based on the extracted visual features and semantic features, calculating the similarity distance between different types of data and analyzing the relevance between them. With this method, the relevance among different types of data can be mined fairly effectively.
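Step 5 can be illustrated with a minimal sketch. The abstract does not state which distance measure is used or how the visual and semantic features are combined, so everything below is an assumption made for illustration: each item is assumed to already carry a visual feature vector and a semantic feature vector, cosine distance is used for both, and the two distances are blended with a hypothetical weight `alpha`. The names `cosine_distance`, `combined_distance` and `rank_by_relevance` are likewise hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: a zero vector carries no information, so treat it
        # as maximally dissimilar instead of dividing by zero.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def combined_distance(item_a, item_b, alpha=0.5):
    """Blend visual and semantic distances.

    alpha is a hypothetical weight (not taken from the patent):
    alpha=1.0 uses only visual features, alpha=0.0 only semantic ones.
    """
    visual = cosine_distance(item_a["visual"], item_b["visual"])
    semantic = cosine_distance(item_a["semantic"], item_b["semantic"])
    return alpha * visual + (1.0 - alpha) * semantic


def rank_by_relevance(query, candidates, alpha=0.5):
    """Return candidates sorted from most to least related to the query."""
    return sorted(candidates, key=lambda c: combined_distance(query, c, alpha))
```

Under these assumptions, a query item (for example an image) is compared against candidates of any media type by calling `rank_by_relevance(query, candidates)`; a smaller combined distance is read as a stronger association between the two items.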
