Indexing a database with Lucene

hexq · posted 2007-3-8 17:21

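Heh, I've been meaning to write this up for a long time, so let's get started!
1. An introduction to Lucene. Please just search the web for this one; there is plenty of material out there.
2. Setting up the development environment.
   I use MyEclipse + Maven2 + JDK 1.5 + Tomcat 5.05. Maven2 (M2) may be unfamiliar to some of you: it is a build tool similar to Ant, but considerably more powerful. You can search the web for material on it as well.
   Create the project with M2, then import it into MyEclipse as a workspace project.
3. Indexing.
   Because I am indexing data that lives in a database, and with performance in mind, I do not use Hibernate or a similar tool to fetch the data; I use plain JDBC.
   Here is the indexing code itself, with fairly detailed comments. The database is named video.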
// Imports this method needs (Lucene 2.x-era API, plus the JDK)
import java.io.File;
import java.io.IOException;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Date;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.cjk.CJKAnalyzer;
import org.apache.lucene.index.IndexWriter;

public static void index() {
    // URL for connecting to the video database
    String url = loadEnv.getProperty("videoUrl");
    String sql = "select * from Video";
    // Jdbc wraps a few JDBC operations
    Jdbc jdbc = new Jdbc();
    DataHandlerInter dh = new DataHandler();
    // Directory where the index files are stored
    String indexDirPath = loadEnv.getProperty("videoIndexDir");
    File indexDir = new File(indexDirPath);
    // CJKAnalyzer gives better tokenization and indexing for Chinese text
    Analyzer analyzer = new CJKAnalyzer();
    try {
        // Delete the old index files before the writer opens the directory
        IndexUtils i = new IndexUtils();
        i.deleteIndex(indexDirPath);
        // true: create a new index, overwriting any existing one
        IndexWriter indexWriter = new IndexWriter(indexDir, analyzer, true);
        indexWriter.setMergeFactor(100);
        indexWriter.setMaxBufferedDocs(100);
        // Index only the first 5000 terms of each field (the default is 10000)
        // indexWriter.setMaxFieldLength(5000);
        // Fetch all records from the database
        ResultSet resset = dh.getAll(url, sql, jdbc);
        System.out.println(resset.getMetaData().getColumnCount());
        Date start = new Date();
        // Run the indexing
        Document(resset, indexWriter);
        // Optimize the index files
        indexWriter.optimize();
        // Close the index writer
        indexWriter.close();
        Date end = new Date();
        System.out.println(end.getTime() - start.getTime() + "ms");
    } catch (IOException e) {
        e.printStackTrace();
    } catch (SQLException e) {
        e.printStackTrace();
    } finally {
        try {
            jdbc.close();
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}
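The index() method above picks CJKAnalyzer for Chinese text. As far as I know, that analyzer handles CJK text by emitting overlapping two-character tokens (bigrams), so 数据库 ("database") becomes 数据 and 据库. Below is a rough JDK-only sketch of that idea. It is not Lucene's implementation: the class name `BigramSketch` and the method `bigrams` are made up for illustration, and the sample string is written with Unicode escapes so the file compiles under any source encoding.

```java
import java.util.ArrayList;
import java.util.List;

final class BigramSketch {
    // Hypothetical helper, not part of Lucene: splits a string into
    // overlapping two-character tokens. A one-character input is
    // returned as a single token; an empty input yields no tokens.
    static List<String> bigrams(String text) {
        List<String> tokens = new ArrayList<>();
        if (text.length() == 1) {
            tokens.add(text);
            return tokens;
        }
        for (int i = 0; i + 1 < text.length(); i++) {
            tokens.add(text.substring(i, i + 2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The three characters of the Chinese word for "database"
        System.out.println(bigrams("\u6570\u636e\u5e93"));
    }
}
```

Because every adjacent pair of characters becomes a token, a query for any two-character word inside a longer title can still match, which is presumably why this analyzer is a common choice when no dictionary-based Chinese tokenizer is available.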
   Whenever possible, delete the old index files before indexing.
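The post does not show what IndexUtils.deleteIndex does. One plausible shape, assuming it simply empties the index directory, is sketched here with only the JDK. The names `IndexDirCleaner` and `deleteContents` are hypothetical stand-ins, not the original code.

```java
import java.io.File;

final class IndexDirCleaner {
    // Hypothetical stand-in for IndexUtils.deleteIndex (its real code is not shown).
    // Deletes every file and sub-directory under dir but keeps dir itself.
    // Returns the number of entries that were actually deleted.
    static int deleteContents(File dir) {
        File[] children = dir.listFiles(); // null if dir is missing or not a directory
        if (children == null) {
            return 0;
        }
        int deleted = 0;
        for (File child : children) {
            if (child.isDirectory()) {
                deleted += deleteContents(child); // empty the sub-directory first
            }
            if (child.delete()) {
                deleted++;
            }
        }
        return deleted;
    }
}
```

A call such as `IndexDirCleaner.deleteContents(new File(indexDirPath))` would go before the IndexWriter is opened, since removing files from a directory that a writer is already using is asking for trouble.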
   So how is the indexing itself carried out?
public static void Document(ResultSet rest, IndexWriter indexWriter)
        throws java.io.IOException {
    try {
        while (rest.next()) {
            // Read the data of the corresponding column
            String video_title = rest.getString("video_title");
            // Field rejects a null value, so fall back to an empty string for SQL NULL
            if (video_title == null) {
                video_title = "";
            }
            // An index is structured much like a database: a Document corresponds
            // to one row, and a Field corresponds to one column
            Document doc = new Document();
            Field field_video_title = new Field("video_title", video_title,
                    Field.Store.YES, Field.Index.TOKENIZED,
                    Field.TermVector.YES);
            doc.add(field_video_title);
            // Continue adding the other fields
            // ................
            indexWriter.addDocument(doc);
        }
    } catch (SQLException e) {
        e.printStackTrace();
    }
}
That is roughly all there is to simple indexing. The basic idea is: fetch the data, then index it.
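To make the "fetch the data, then index it" idea concrete without needing Lucene or a database, here is a toy in-memory version using only the JDK. It is a sketch under loud assumptions: each row is a plain Map standing in for a ResultSet row, tokenization is a naive whitespace split, and the class `RowIndexSketch` with its methods is invented for illustration. Lucene's real index is far more elaborate.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class RowIndexSketch {
    // term -> ids of the rows whose indexed column contains that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Indexes one column of every row; the list position serves as the row id.
    // Rows that lack the column are skipped.
    void indexRows(List<Map<String, String>> rows, String column) {
        for (int id = 0; id < rows.size(); id++) {
            String value = rows.get(id).get(column);
            if (value == null) {
                continue;
            }
            for (String term : value.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
                }
            }
        }
    }

    // Returns the ids of the rows containing the term (case-insensitive).
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT),
                Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = List.of(
                Map.of("video_title", "Lucene in Action"),
                Map.of("video_title", "Action movie trailer"));
        RowIndexSketch index = new RowIndexSketch();
        index.indexRows(rows, "video_title");
        System.out.println(index.search("action")); // prints [0, 1]
    }
}
```

The mapping is the same one the post describes: each row plays the role of a Document and each column the role of a Field, with the term-to-row table doing the job of the index files.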

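iptton · posted 2007-3-8 18:45

I have flipped through the AJAX + Lucene book.... The library has a copy you can borrow.

Looks like I jumped into the middle of the thread... the OP probably has more to come.

[ Last edited by iptton on 2007-3-8 18:46 ]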

ycoe · posted 2007-3-9 14:16

Notice: the author has been banned or deleted; the content is hidden automatically

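用程序诠释生命 · posted 2007-3-18 22:34

It looks like the OP has been working with Lucene ever since the graduation project, right up to now.
That kind of perseverance is something I should learn from!!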

wool王 · posted 2007-3-20 16:21

It has been a long time since I read anything technical. Time to pick it back up and recharge.

brilon · posted 2007-4-2 22:36

I'll take a look at this later....
