Author: 原野上的蚂蚁 | Source: Internet | 2023-02-03 05:01
I am trying to build a custom NER model with Apache OpenNLP 1.7. Based on the available documentation, I wrote the following code:
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.NameSample;
import opennlp.tools.namefind.NameSampleDataStream;
import opennlp.tools.namefind.TokenNameFinderFactory;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import opennlp.tools.util.TrainingParameters;
public class PersonClassifierTrainer {
    static String modelFile = "/opt/NLP/data/en-ner-customperson.bin";

    public static void main(String[] args) throws IOException {
        Charset charset = Charset.forName("UTF-8");
        // Highlighted line: this is where the error is reported.
        ObjectStream lineStream = new PlainTextByLineStream(new FileInputStream("/opt/NLP/data/person.train"), charset);
        ObjectStream sampleStream = new NameSampleDataStream(lineStream);
        TokenNameFinderModel model;
        TokenNameFinderFactory nameFinderFactory = null;
        try {
            model = NameFinderME.train("en", "person", sampleStream, TrainingParameters.defaultParams(),
                    nameFinderFactory);
        } finally {
            sampleStream.close();
        }
        BufferedOutputStream modelOut = null;
        try {
            modelOut = new BufferedOutputStream(new FileOutputStream(modelFile));
            model.serialize(modelOut);
        } finally {
            if (modelOut != null)
                modelOut.close();
        }
    }
}
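For the highlighted line above, the IDE suggests: "Cast argument 'file' to 'InputStreamFactory'".
I was forced to add that cast because the line would not compile otherwise.
Now, when I run my code, I get the following error: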
java.io.FileInputStream cannot be cast to opennlp.tools.util.InputStreamFactory
Am I missing something here?
Edit 1: The person.train file contains this data:
Hardik is a software Professional. Hardik works at company and is part of development team. Hardik lives in New York
Hardik loves R statistical software
Hardik is a student at ISB
Hardik loves nature
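Edit 2: I am now getting a NullPointerException. Any help?
Solution:
PlainTextByLineStream needs an InputStreamFactory instance from which it can obtain the InputStream, so a FileInputStream cannot be passed (or cast) in its place. In addition, the TokenNameFinderFactory must not be null.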
// Same imports as in the question, plus:
import java.io.InputStream;
import opennlp.tools.util.InputStreamFactory;

public class PersonClassifierTrainer {
    static String modelFile = "/opt/NLP/data/en-ner-customperson.bin";

    public static void main(String[] args) throws IOException {
        // PlainTextByLineStream expects a factory that opens the stream, not the stream itself.
        InputStreamFactory isf = new InputStreamFactory() {
            @Override
            public InputStream createInputStream() throws IOException {
                return new FileInputStream("/opt/NLP/data/person.train");
            }
        };
        Charset charset = Charset.forName("UTF-8");
        ObjectStream<String> lineStream = new PlainTextByLineStream(isf, charset);
        ObjectStream<NameSample> sampleStream = new NameSampleDataStream(lineStream);
        TokenNameFinderModel model;
        // Must not be null; a null factory leads to the NullPointerException from Edit 2.
        TokenNameFinderFactory nameFinderFactory = new TokenNameFinderFactory();
        try {
            model = NameFinderME.train("en", "person", sampleStream, TrainingParameters.defaultParams(),
                    nameFinderFactory);
        } finally {
            sampleStream.close();
        }
        BufferedOutputStream modelOut = null;
        try {
            // Write the trained model to disk.
            modelOut = new BufferedOutputStream(new FileOutputStream(modelFile));
            model.serialize(modelOut);
        } finally {
            if (modelOut != null)
                modelOut.close();
        }
    }
}
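To illustrate why a factory is asked for rather than a stream, here is a minimal sketch that uses only the JDK. The `StreamFactory` interface is a hypothetical stand-in that mirrors the single-method shape of `opennlp.tools.util.InputStreamFactory`; `StreamFactoryDemo` and `countLines` are names made up for this illustration and are not part of OpenNLP. The point is that a factory can hand out a fresh stream on every call, which a consumer can use to read the data more than once, while an already-opened `FileInputStream` cannot be reinterpreted as a factory by casting.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StreamFactoryDemo {

    // Hypothetical stand-in with the same shape as OpenNLP's InputStreamFactory.
    interface StreamFactory {
        InputStream createInputStream() throws IOException;
    }

    // Opens a fresh stream from the factory and counts its lines.
    static int countLines(StreamFactory factory) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(factory.createInputStream(), StandardCharsets.UTF_8))) {
            int count = 0;
            while (reader.readLine() != null) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "Hardik loves nature\nHardik is a student at ISB\n"
                .getBytes(StandardCharsets.UTF_8);

        // A single-method interface can be implemented with a lambda.
        StreamFactory factory = () -> new ByteArrayInputStream(data);

        // Each call opens a new stream, so the data can be read twice.
        System.out.println(countLines(factory));
        System.out.println(countLines(factory));
    }
}
```

Since `InputStreamFactory` also declares a single method, the anonymous class in the answer could presumably be written as a lambda on Java 8 or later. OpenNLP also ships `MarkableFileInputStreamFactory`, which as far as I know takes a `java.io.File` and may save writing the factory by hand.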
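One more thing worth checking: the person.train data shown in Edit 1 contains no name annotations. OpenNLP's name finder training format expects one tokenized sentence per line, with each entity wrapped as `<START:person> ... <END>`, so a file of plain sentences is unlikely to train a useful model. The sketch below uses only the JDK and shows one way to produce such lines for a single known name. `TrainingDataAnnotator` and `annotate` are hypothetical helpers invented for this illustration, and matching on an exact whitespace-separated token is a simplifying assumption.

```java
import java.util.Arrays;
import java.util.List;

public class TrainingDataAnnotator {

    // Wraps every whitespace-separated token equal to `name` in
    // <START:type> ... <END> markers and leaves all other tokens unchanged.
    static String annotate(String sentence, String name, String type) {
        StringBuilder out = new StringBuilder();
        for (String token : sentence.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            if (token.equals(name)) {
                out.append("<START:").append(type).append("> ")
                   .append(token).append(" <END>");
            } else {
                out.append(token);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Sentences taken from the person.train sample in the question.
        List<String> sentences = Arrays.asList(
                "Hardik loves R statistical software",
                "Hardik is a student at ISB",
                "Hardik loves nature");
        for (String sentence : sentences) {
            System.out.println(annotate(sentence, "Hardik", "person"));
        }
        // First printed line: <START:person> Hardik <END> loves R statistical software
    }
}
```

The first line of the sample holds three sentences and has punctuation attached to words ("Professional."), so it would also need to be split into one sentence per line and tokenized before annotation. A handful of sentences is far too little for real training; the OpenNLP manual recommends on the order of 15,000 sentences for a model that performs well.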