MITIE: library and tools for information extraction

MITIE: MIT Information Extraction

This project provides free (even for commercial use) state-of-the-art information extraction tools. The current release includes tools for performing named entity extraction and binary relation detection as well as tools for training custom extractors and relation detectors.

MITIE is built on top of dlib, a high-performance machine-learning library[1]. It makes use of several state-of-the-art techniques, including distributional word embeddings[2] and Structural Support Vector Machines[3]. MITIE offers several pre-trained models providing varying levels of support for English, Spanish, and German, trained using a variety of linguistic resources (e.g., CoNLL 2003, ACE, Wikipedia, Freebase, and Gigaword). The core MITIE software is written in C++, but bindings for several other languages, including Python, R, Java, C, and MATLAB, allow users to quickly integrate MITIE into their own applications.

Outside projects have created API bindings for OCaml, .NET, .NET Core, and Ruby. There is also an interactive tool for labeling data and training MITIE.

Using MITIE

MITIE's primary API is a C API which is documented in the mitie.h header file. Beyond this, there are many example programs showing how to use MITIE from C, C++, Java, R, or Python 2.7.

Initial Setup

Before you can run the provided examples you will need to download the trained model files which you can do by running:

make MITIE-models

or by simply downloading the MITIE-models-v0.2.tar.bz2 file and extracting it in your MITIE folder. Note that the Spanish and German models are supplied in separate downloads. So if you want to use the Spanish NER model then download MITIE-models-v0.2-Spanish.zip and extract it into your MITIE folder. Similarly for the German model: MITIE-models-v0.2-German.tar.bz2
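If you prefer to script the manual extraction step, a minimal Python sketch could look like the following. It assumes the archive has already been downloaded; `extract_models` is a hypothetical helper name, not something shipped with MITIE. It handles both the .tar.bz2 archives and the .zip archive mentioned above.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_models(archive_path, mitie_dir="."):
    """Extract a downloaded model archive into the MITIE folder.

    Hypothetical helper: accepts .zip and tar-based archives (e.g. .tar.bz2)
    and returns the list of member names that were extracted.
    """
    archive = str(archive_path)
    dest = Path(mitie_dir)
    dest.mkdir(parents=True, exist_ok=True)

    if zipfile.is_zipfile(archive):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(str(dest))
            return zf.namelist()
    if tarfile.is_tarfile(archive):
        # "r:*" lets tarfile detect the compression (bz2 here) on its own.
        with tarfile.open(archive, "r:*") as tf:
            tf.extractall(str(dest))
            return tf.getnames()
    raise ValueError("unsupported archive format: " + archive)
```

For example, `extract_models("MITIE-models-v0.2.tar.bz2", ".")` run from the MITIE folder would unpack the English models in place.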

Using MITIE from the command line

MITIE comes with a basic streaming NER tool. So you can tell MITIE to process each line of a text file independently and output marked up text with the command:

cat sample_text.txt | ./ner_stream MITIE-models/english/ner_model.dat  

The ner_stream executable can be compiled in any of three ways: by running make in the top level MITIE folder, by navigating to the tools/ner_stream folder and running make, or by using CMake with the following commands:

cd tools/ner_stream
mkdir build
cd build
cmake ..
cmake --build . --config Release
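The streaming command shown earlier in this section can also be driven from a script. The sketch below pipes lines of text into ner_stream through standard input, mirroring the `cat ... | ./ner_stream ...` pipeline. The function names are hypothetical, and the default binary location `./ner_stream` is an assumption about where your build placed the executable.

```python
import subprocess


def build_ner_stream_command(model_path, binary="./ner_stream"):
    """Return the argument list for the streaming NER tool."""
    return [binary, model_path]


def run_ner_stream(lines, model_path, binary="./ner_stream"):
    """Pipe text lines through ner_stream and return its output lines.

    Raises subprocess.CalledProcessError if the tool exits with an error.
    """
    text = "\n".join(lines) + "\n"
    result = subprocess.run(
        build_ner_stream_command(model_path, binary),
        input=text,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # exchange str rather than bytes
        check=True,
    )
    return result.stdout.splitlines()
```

A call such as `run_ner_stream(open("sample_text.txt").read().splitlines(), "MITIE-models/english/ner_model.dat")` would then return the marked up text as a list of strings.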

Compiling MITIE as a shared library

On a UNIX-like system, this can be accomplished by running make in the top level MITIE folder or by running:

cd mitielib
make

This produces shared and static library files in the mitielib folder. Or you can use CMake to compile a shared library by typing:

cd mitielib
mkdir build
cd build
cmake ..
cmake --build . --config Release --target install

Either of these methods will create a MITIE shared library in the mitielib folder.

Compiling MITIE using OpenBLAS

If you compile MITIE using cmake then it will automatically find and use any optimized BLAS libraries on your machine. However, if you compile using regular make then you have to manually locate your BLAS libraries or dlib will default to its built-in, but slower, BLAS implementation. Therefore, to use OpenBLAS when compiling without cmake, locate libopenblas.a and libgfortran.a, then run make as follows:

cd mitielib
make BLAS_PATH=/path/to/libopenblas.a LIBGFORTRAN_PATH=/path/to/libgfortran.a

Note that if your BLAS libraries are not in standard locations cmake will fail to find them. However, you can tell it what folder to look in by replacing cmake .. with a statement such as:

cmake -DCMAKE_LIBRARY_PATH=/home/me/place/i/put/blas/lib ..
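Locating the two static libraries for the make invocation above can be automated. The following is a minimal sketch, assuming the libraries are named libopenblas.a and libgfortran.a as in the text; the helper names and the search directories you pass in are hypothetical and depend on your system.

```python
from pathlib import Path


def find_static_library(filename, search_dirs):
    """Return the first file called `filename` under any search dir, or None."""
    for directory in search_dirs:
        root = Path(directory)
        if not root.is_dir():
            continue
        for candidate in sorted(root.rglob(filename)):
            if candidate.is_file():
                return candidate
    return None


def openblas_make_command(search_dirs):
    """Build the make argument list, or return None if a library is missing."""
    blas = find_static_library("libopenblas.a", search_dirs)
    gfortran = find_static_library("libgfortran.a", search_dirs)
    if blas is None or gfortran is None:
        return None
    return [
        "make",
        "BLAS_PATH={}".format(blas),
        "LIBGFORTRAN_PATH={}".format(gfortran),
    ]
```

The returned list can be handed to `subprocess.run(..., cwd="mitielib")`. When it returns None, make would fall back to dlib's built-in BLAS, as described above.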

Using MITIE from a Python 2.7 program

Once you have built the MITIE shared library, you can go to the examples/python folder and simply run any of the Python scripts. Each script is a tutorial explaining some aspect of MITIE: named entity recognition and relation extraction, training a custom NER tool, or training a custom relation extractor.

You can also install mitie directly from GitHub with this command: pip install git+https://github.com/mit-nlp/MITIE.git
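As a rough illustration of what the named entity recognition tutorial script does, here is a minimal sketch. The names `named_entity_extractor`, `tokenize`, and `extract_entities` follow the pattern used in the example scripts, but treat them, the model path, and the `(token_range, tag, score)` tuple layout as assumptions to check against your MITIE version. `describe_entities` is a hypothetical helper that runs without MITIE installed.

```python
def describe_entities(tokens, entities):
    """Turn (token_range, tag, ...) tuples into (tag, text) pairs.

    Tokens may be bytes or str, so both are handled here.
    """
    described = []
    for entity in entities:
        token_range, tag = entity[0], entity[1]
        words = []
        for i in token_range:
            tok = tokens[i]
            words.append(tok.decode("utf-8") if isinstance(tok, bytes) else tok)
        described.append((tag, " ".join(words)))
    return described


def extract_named_entities(text, model_path="MITIE-models/english/ner_model.dat"):
    """Run NER on `text`; requires the MITIE Python module to be importable."""
    # Imported lazily so describe_entities can be used without MITIE.
    from mitie import named_entity_extractor, tokenize

    ner = named_entity_extractor(model_path)
    tokens = tokenize(text)
    return describe_entities(tokens, ner.extract_entities(tokens))
```

Keeping the formatting step separate from the MITIE calls makes it easy to unit-test the post-processing with hand-written tokens and entity tuples.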

Using MITIE from R

MITIE can be installed as an R package. See the README for more details.

Using MITIE from a C program

There are example C programs in the examples/C folder. To compile any of them you simply go into those folders and run make. Or use CMake like so:

cd examples/C/ner
mkdir build
cd build
cmake ..
cmake --build . --config Release

Using MITIE from a C++ program

There are example C++ programs in the examples/cpp folder. To compile any of them you simply go into those folders and run make. Or use CMake like so:

cd examples/cpp/ner
mkdir build
cd build
cmake ..
cmake --build . --config Release

Using MITIE from a Java program

There is an example Java program in the examples/java folder. Before you can run it you must compile MITIE's java interface which you can do like so:

cd mitielib/java
mkdir build
cd build
cmake ..
cmake --build . --config Release --target install

That will place a javamitie shared library and jar file into the mitielib folder. Once you have those two files you can run the example program in examples/java by running run_ner.bat if you are on Windows or run_ner.sh if you are on a POSIX system like Linux or OS X.

Also note that you must have Swig 1.3.40 or newer, CMake 2.8.4 or newer, and the Java JDK installed to compile the MITIE interface. Finally, note that if you are using 64bit Java on Windows then you will need to use a command like:

cmake -G "Visual Studio 10 Win64" ..

instead of cmake .. so that Visual Studio knows to make a 64bit library.

Running MITIE's unit tests

You can run a simple regression test to validate your build. Do this by running the following command from the top level MITIE folder:

make test

make test both builds the example programs and downloads the required example models. If you require a non-standard C++ compiler, change CC in examples/C/makefile and in tools/ner_stream/makefile.

Precompiled Python 2.7 binaries

We have built Python 2.7 binaries packaged with sample models for 64bit Linux and Windows (both 32 and 64 bit versions of Python). You can download the precompiled package here: Precompiled MITIE 0.2

Precompiled Java 64bit binaries

We have built Java binaries for the 64bit JVM which work on Linux and Windows. You can download the precompiled package here: Precompiled Java MITIE 0.3. In the file is an examples/java folder. You can run the example by executing the provided .bat or .sh file.

License

MITIE is licensed under the Boost Software License - Version 1.0 - August 17th, 2003.

Permission is hereby granted, free of charge, to any person or organization obtaining a copy of the software and accompanying documentation covered by this license (the "Software") to use, reproduce, display, distribute, execute, and transmit the Software, and to prepare derivative works of the Software, and to permit third-parties to whom the Software is furnished to do so, all subject to the following:

The copyright notices in the Software and this entire statement, including the above license grant, this restriction and the following disclaimer, must be included in all copies of the Software, in whole or in part, and all derivative works of the Software, unless such copies or derivative works are solely in the form of machine-executable object code generated by a source language processor.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

References

[1] Davis E. King. Dlib-ml: A Machine Learning Toolkit. Journal of Machine Learning Research 10, pp. 1755-1758, 2009.

[2] Paramveer Dhillon, Dean Foster and Lyle Ungar, Eigenwords: Spectral Word Embeddings, Journal of Machine Learning Research (JMLR), 16, 2015.

[3] T. Joachims, T. Finley, Chun-Nam Yu, Cutting-Plane Training of Structural SVMs, Machine Learning, 77(1):27-59, 2009.

Comments
  • MITIE install fails on Windows 10

    Hi,

    I'm trying to install MITIE backend on windows 10 using pip install git+https://github.com/mit-nlp/MITIE.git. However the install fails at the build step with the following error message.

    Collecting git+https://github.com/mit-nlp/MITIE.git
      Cloning https://github.com/mit-nlp/MITIE.git to c:\users\njones\appdata\local\temp\pip-97mlgx-build
    Installing collected packages: mitie
      Running setup.py install for mitie: started
      Running setup.py install for mitie: finished with status 'error'
      Complete output from command c:\python27\python.exe -u -c "import setuptools, tokenize;__file__='c:\users\njones\appdata\local\temp\pip-97mlgx-build\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record c:\users\njones\appdata\local\temp\pip-ydwrn7-record\install-record.txt --single-version-externally-managed --compile:
      running install
      running build
      error: [Error 2] The system cannot find the file specified

    Command "c:\python27\python.exe -u -c "import setuptools, tokenize;__file__='c:\users\njones\appdata\local\temp\pip-97mlgx-build\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record c:\users\njones\appdata\local\temp\pip-ydwrn7-record\install-record.txt --single-version-externally-managed --compile" failed with error code 1 in c:\users\njones\appdata\local\temp\pip-97mlgx-build

    Any suggestions on what I could be doing wrong would be really helpful!

    Thanks

  • Training NER on a new corpus

    Is there a memory leak? Training takes a lot of memory for very few training samples, and the process gets killed after printing this:

    num feats in chunker model: 4095
    train: precision, recall, f1-score: 0.984615 0.984615 0.984615
    now do training
    num training samples: 198

    I watched the memory usage and saw that it kept increasing gradually once training reached this point, as if garbage accumulated in memory on each iteration.

  • Add text classification support

    Hi Davis, as discussed, I simply use the average of the word feature vectors as the feature vector for the entire document. Overall, a few new files were added to support this feature, but no existing lines of code were deleted. Therefore, none of the existing features should be affected, and future improvements can build on these new files.

    Regarding performance in its current form, I tested it on a small internal mail classification dataset, and the result is surprisingly good (e.g., more than 90% F1-score).

    I would really appreciate it if you could spare some time to review the code and give me your comments. Thanks.

  • Errors Building MITIE for Java

    /Users/davidlaxer/MITIE/dlib/dlib/gui_widgets/nativefont.h:29:10: fatal error: 'X11/Xlocale.h' file not found
    #include <X11/Xlocale.h>
             ^
    1 error generated.

    OS X 10.10.4.

    David-Laxers-MacBook-Pro:MITIE davidlaxer$ xcode-select --install
    xcode-select: error: command line tools are already installed, use "Software Update" to install updates

    David-Laxers-MacBook-Pro:java davidlaxer$ pwd
    /Users/davidlaxer/MITIE/mitielib/java

    David-Laxers-MacBook-Pro:java davidlaxer$ java -version
    java version "1.8.0_05"
    Java(TM) SE Runtime Environment (build 1.8.0_05-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode)
    David-Laxers-MacBook-Pro:java davidlaxer$

    -- The C compiler identification is AppleClang 6.1.0.6020053
    -- The CXX compiler identification is AppleClang 6.1.0.6020053

    David-Laxers-MacBook-Pro:java davidlaxer$ mkdir build
    David-Laxers-MacBook-Pro:java davidlaxer$ cmake ..
    -- The C compiler identification is AppleClang 6.1.0.6020053
    -- The CXX compiler identification is AppleClang 6.1.0.6020053
    -- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc
    -- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -- works
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++
    -- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ -- works
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Looking for png_create_read_struct
    -- Looking for png_create_read_struct - found
    -- Looking for jpeg_read_header
    -- Looking for jpeg_read_header - found
    -- Searching for BLAS and LAPACK
    -- Looking for sys/types.h
    -- Looking for sys/types.h - found
    -- Looking for stdint.h
    -- Looking for stdint.h - found
    -- Looking for stddef.h
    -- Looking for stddef.h - found
    -- Check size of void*
    -- Check size of void* - done
    -- Found OpenBLAS library
    -- Looking for sgetrf_single
    -- Looking for sgetrf_single - not found
    -- Found LAPACK library
    -- Looking for cblas_ddot
    -- Looking for cblas_ddot - found
    -- Check for STD namespace
    -- Check for STD namespace - found
    -- Looking for C++ include iostream
    -- Looking for C++ include iostream - found
    -- Configuring done
    CMake Warning (dev):
      Policy CMP0042 is not set: MACOSX_RPATH is enabled by default. Run "cmake --help-policy CMP0042" for policy details. Use the cmake_policy command to set the policy and suppress this warning.

      MACOSX_RPATH is not specified for the following targets:

       mitie

    This warning is for project developers. Use -Wno-dev to suppress it.

    -- Generating done
    -- Build files have been written to: /Users/davidlaxer/MITIE/mitielib/java
    David-Laxers-MacBook-Pro:java davidlaxer$ cmake --build . --config Release --target install
    Scanning dependencies of target dlib
    [ 0%] Building CXX object dlib_build/CMakeFiles/dlib.dir/base64/base64_kernel_1.o
    [ 1%] Building CXX object dlib_build/CMakeFiles/dlib.dir/bigint/bigint_kernel_1.o
    [ 2%] Building CXX object dlib_build/CMakeFiles/dlib.dir/bigint/bigint_kernel_2.o
    [ 3%] Building CXX object dlib_build/CMakeFiles/dlib.dir/bit_stream/bit_stream_kernel_1.o
    [ 4%] Building CXX object dlib_build/CMakeFiles/dlib.dir/entropy_decoder/entropy_decoder_kernel_1.o
    [ 5%] Building CXX object dlib_build/CMakeFiles/dlib.dir/entropy_decoder/entropy_decoder_kernel_2.o
    [ 6%] Building CXX object dlib_build/CMakeFiles/dlib.dir/entropy_encoder/entropy_encoder_kernel_1.o
    [ 7%] Building CXX object dlib_build/CMakeFiles/dlib.dir/entropy_encoder/entropy_encoder_kernel_2.o
    [ 8%] Building CXX object dlib_build/CMakeFiles/dlib.dir/md5/md5_kernel_1.o
    [ 9%] Building CXX object dlib_build/CMakeFiles/dlib.dir/tokenizer/tokenizer_kernel_1.o
    [ 10%] Building CXX object dlib_build/CMakeFiles/dlib.dir/unicode/unicode.o
    [ 11%] Building CXX object dlib_build/CMakeFiles/dlib.dir/data_io/image_dataset_metadata.o
    [ 12%] Building CXX object dlib_build/CMakeFiles/dlib.dir/sockets/sockets_kernel_1.o
    [ 13%] Building CXX object dlib_build/CMakeFiles/dlib.dir/bsp/bsp.o
    [ 14%] Building CXX object dlib_build/CMakeFiles/dlib.dir/dir_nav/dir_nav_kernel_1.o
    [ 15%] Building CXX object dlib_build/CMakeFiles/dlib.dir/dir_nav/dir_nav_kernel_2.o
    [ 16%] Building CXX object dlib_build/CMakeFiles/dlib.dir/dir_nav/dir_nav_extensions.o
    [ 17%] Building CXX object dlib_build/CMakeFiles/dlib.dir/linker/linker_kernel_1.o
    [ 18%] Building CXX object dlib_build/CMakeFiles/dlib.dir/logger/extra_logger_headers.o
    [ 19%] Building CXX object dlib_build/CMakeFiles/dlib.dir/logger/logger_kernel_1.o
    [ 20%] Building CXX object dlib_build/CMakeFiles/dlib.dir/logger/logger_config_file.o
    [ 20%] Building CXX object dlib_build/CMakeFiles/dlib.dir/misc_api/misc_api_kernel_1.o
    [ 21%] Building CXX object dlib_build/CMakeFiles/dlib.dir/misc_api/misc_api_kernel_2.o
    [ 22%] Building CXX object dlib_build/CMakeFiles/dlib.dir/sockets/sockets_extensions.o
    [ 23%] Building CXX object dlib_build/CMakeFiles/dlib.dir/sockets/sockets_kernel_2.o
    [ 24%] Building CXX object dlib_build/CMakeFiles/dlib.dir/sockstreambuf/sockstreambuf.o
    [ 25%] Building CXX object dlib_build/CMakeFiles/dlib.dir/sockstreambuf/sockstreambuf_unbuffered.o
    [ 26%] Building CXX object dlib_build/CMakeFiles/dlib.dir/server/server_kernel.o
    [ 27%] Building CXX object dlib_build/CMakeFiles/dlib.dir/server/server_iostream.o
    [ 28%] Building CXX object dlib_build/CMakeFiles/dlib.dir/server/server_http.o
    [ 29%] Building CXX object dlib_build/CMakeFiles/dlib.dir/threads/multithreaded_object_extension.o
    [ 30%] Building CXX object dlib_build/CMakeFiles/dlib.dir/threads/threaded_object_extension.o
    [ 31%] Building CXX object dlib_build/CMakeFiles/dlib.dir/threads/threads_kernel_1.o
    [ 32%] Building CXX object dlib_build/CMakeFiles/dlib.dir/threads/threads_kernel_2.o
    [ 33%] Building CXX object dlib_build/CMakeFiles/dlib.dir/threads/threads_kernel_shared.o
    [ 34%] Building CXX object dlib_build/CMakeFiles/dlib.dir/threads/thread_pool_extension.o
    [ 35%] Building CXX object dlib_build/CMakeFiles/dlib.dir/timer/timer.o
    [ 36%] Building CXX object dlib_build/CMakeFiles/dlib.dir/stack_trace.o
    [ 37%] Building CXX object dlib_build/CMakeFiles/dlib.dir/gui_widgets/fonts.o
    In file included from /Users/davidlaxer/MITIE/dlib/dlib/gui_widgets/fonts.cpp:14:
    /Users/davidlaxer/MITIE/dlib/dlib/gui_widgets/nativefont.h:29:10: fatal error: 'X11/Xlocale.h' file not found
    #include <X11/Xlocale.h>
             ^
    1 error generated.
    dlib_build/CMakeFiles/dlib.dir/build.make:974: recipe for target 'dlib_build/CMakeFiles/dlib.dir/gui_widgets/fonts.o' failed
    gmake[2]: *** [dlib_build/CMakeFiles/dlib.dir/gui_widgets/fonts.o] Error 1
    CMakeFiles/Makefile2:122: recipe for target 'dlib_build/CMakeFiles/dlib.dir/all' failed
    gmake[1]: *** [dlib_build/CMakeFiles/dlib.dir/all] Error 2
    Makefile:127: recipe for target 'all' failed
    gmake: *** [all] Error 2
    David-Laxers-MacBook-Pro:java davidlaxer$

  • UTF-8 problems

    Hi,

    First of all, let me thank you for this great tool. We are using MITIE via Python 2.7. To the best of my knowledge, we have to convert our strings from unicode to plain bytes before passing them to MITIE. When using tokenize_with_offset, this can produce an offset that falls in the middle of a Unicode character spanning multiple bytes, which results in "UnicodeDecodeError: 'utf8' codec can't decode bytes in position 4-5: unexpected end of data" when we attempt to decode.

    Any ideas?

    Many thanks, Jakub
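The offset problem described in this comment can be reproduced and worked around in plain Python. The sketch below assumes the reported offsets are byte positions into the UTF-8 encoded input, as the comment implies. `byte_to_char_offset` is a hypothetical helper, not part of MITIE: it steps back to the start of the enclosing character before decoding, so the decode never sees a truncated multi-byte sequence.

```python
def byte_to_char_offset(data, byte_offset):
    """Map a byte offset in UTF-8 `data` to a character offset.

    If the offset lands inside a multi-byte character, it is moved back
    to the first byte of that character.
    """
    buf = bytearray(data)  # indexing yields ints on both Python 2 and 3
    # UTF-8 continuation bytes always have the bit pattern 10xxxxxx.
    while 0 < byte_offset < len(buf) and (buf[byte_offset] & 0xC0) == 0x80:
        byte_offset -= 1
    return len(buf[:byte_offset].decode("utf-8"))


text = u"na\u00efve caf\u00e9"      # "naïve café"
data = text.encode("utf-8")         # both accented letters take two bytes
mid_char = byte_to_char_offset(data, 3)   # byte 3 is the 2nd byte of the "ï"
after_char = byte_to_char_offset(data, 4)
```

Here `mid_char` is 2 and `after_char` is 3, so slicing the original unicode string with the converted offsets stays on character boundaries.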

  • Does MITIE support Python 3.6.3?

    Hello,

    I need to use Python 3.6.3 for another python library on Windows, so just wondering if MITIE supports Python 3 and if not, how difficult is it to support it? Thanks!

  • python API for text categorizer

    Ok so not done yet but wanted to start discussion. Have quite a few editor artefacts looking at the changes here.

    Currently if I try to call the add function on an instance of the text_categorizer_trainer I get an error thrown in the checked_cast method in mitie.cpp (see examples/python/train_text_categorizer.py )

  • How to make multiple models share the same extractor?

    Hi Davis,

    Thanks, as always, for your help.

    We always want to reduce memory usage. Since we normally cannot control the extractor, we at least hope that multiple models can share the same extractor.

    With the current C++ implementation, which does not use pointers, there seems to be no way to share the extractor among multiple models. I tried the following code in three cases.

    TotalWordFeatureExtractor totalWordFeatureExtractor = TotalWordFeatureExtractor.getEnglishExtractor();
    NamedEntityExtractor ner = new NamedEntityExtractor(file.getAbsolutePath(), totalWordFeatureExtractor);
    

    The above code consumes around 680 MB JVM memory.

    TotalWordFeatureExtractor totalWordFeatureExtractor = TotalWordFeatureExtractor.getEnglishExtractor();
    NamedEntityExtractor ner = new NamedEntityExtractor(file.getAbsolutePath(), totalWordFeatureExtractor);
    NamedEntityExtractor ner2 = new NamedEntityExtractor(file.getAbsolutePath(), totalWordFeatureExtractor);
    

    The above code consumes around 975 MB JVM memory.

    TotalWordFeatureExtractor totalWordFeatureExtractor = TotalWordFeatureExtractor.getEnglishExtractor();
    NamedEntityExtractor ner = new NamedEntityExtractor(file.getAbsolutePath(), totalWordFeatureExtractor);
    NamedEntityExtractor ner2 = new NamedEntityExtractor(file.getAbsolutePath(), totalWordFeatureExtractor);
    NamedEntityExtractor ner3 = new NamedEntityExtractor(file.getAbsolutePath(), totalWordFeatureExtractor);
    

    The above code consumes around 1.26 GB JVM memory.

    For the detailed code, please refer to the following link. https://github.com/wihoho/MITIE/blob/master/mitielib/java/maven/src/test/java/edu/mit/ll/mitie/NamedEntityExtractorTest.java#L41

    Obviously, this is not what we want. Ideally, memory usage would stay around 690 MB even with three different models. So I assume that using pointers in the C++ code is the only way to overcome this issue. We would like your opinion on how to resolve it, because we are not experienced with C++.
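A rough reading of the three measurements quoted in this comment, assuming memory grows linearly with the number of loaded models and taking 1.26 GB as 1260 MB (both are assumptions made for this sketch, not statements from the thread):

```python
# JVM memory readings from the comment, in MB, keyed by number of models.
measurements = {1: 680, 2: 975, 3: 1260}


def per_model_increments(readings):
    """Return the memory added by each additional model, in MB."""
    counts = sorted(readings)
    return [readings[b] - readings[a] for a, b in zip(counts, counts[1:])]


increments = per_model_increments(measurements)       # [295, 285]
avg_increment = sum(increments) / float(len(increments))
baseline = measurements[1] - avg_increment            # memory not tied to a model
```

Each additional NamedEntityExtractor adds roughly 290 MB on top of a baseline of about 390 MB. This estimate cannot tell how much of the per-model cost is the feature extractor and how much is the model itself; it only quantifies the growth that sharing would need to remove.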

    Thank you.

  • Exposing scores from the named entity extractor.

    Exposing scores in mitie::named_entity_extractor by adding a predict method, which follows the design patterns of the underlying dlib::multiclass_linear_decision_function.

  • Is it possible to reduce the size of the model

    Hi, I've come across this library, and found it is really amazing! The accuracy is even better than Stanford NER demo!

    Although I understand it contains a high dimensional space with over 500,000 dimensions, is it possible to reduce the model size?

  • Mitie installation failing on Windows 10

    I tried executing "cmake --build . --config Release --target install", but I get the following error:

    The system cannot find the file specified
    CMake Error: Generator: execution of make failed. Make command was: "nmake" "/NOLOGO" "install"

    Can you please help me?

  • How does mitie deal with the segmentation of OOV

    Expected Behavior

    Hi, I want to know how MITIE deals with the segmentation of OOV (out-of-vocabulary) words. Two of my training examples look like this:

    1. The daily active users of [League Of Legends](name) on November 10 (Chinese: [英雄联盟](name)11.10的日活)
    2. The daily active users of [Tomb Raider3](name) on November 10 (Chinese: [古墓丽影3](name)11.10的日活)

    My training samples are in Chinese and contain many entities that are game names. Some game names contain numbers and some do not, like "古墓丽影3" and "英雄联盟". In the examples above, I want MITIE to identify the entities as "古墓丽影3" and "英雄联盟". 11.10 is a short representation of the date, which should not be included.

    Current Behavior

    I labeled the entities correctly. However, the entity in the first sample is often identified as "英雄联盟11" rather than "英雄联盟". How can I deal with this problem? I tried adding more data, but it did not help. Should I add even more data?

    • Version: 0.7.0
    • Where did you get MITIE: pip install
    • Platform: windows64 and linux64
  • extract_entities returns score of 0

    Expected Behavior

    I'm using the named_entity_extractor. I trained it on some data and am now extracting entities from data the model has never seen before using extract_entities. I expect to get back the extracted entities with scores ranging from 0 to 1.

    Current Behavior

    The entities are extracted correctly, but the score is 0.

    Steps to Reproduce

    Train model, give it new data, get back score == 0

    • Version: downloaded on 19th of July 2019
    • Where did you get MITIE: Github
    • Platform: 64 bit
  • “std::bad_alloc”: am I using too much memory?

    Behaviour and Steps to Reproduce

    When running on a machine with 8GB RAM and 16GB swap, MITIE uses all of the RAM and 10GB of swap, leaving ~5GB of swap free. Why does it raise std::bad_alloc instead of using the rest of the swap?

    • Version: 0.7.0
    • Where did you get MITIE: Clone latest
    • Platform: Ubuntu 16.04
    • Compiler: GNU Make 4.1
  • MITIE Training fails to generate entity_extractor.dat for more examples

    We are using Windows 10 Pro 64-bit, with RASA .13.0 and Mitie .5.0. The machine has 16GB RAM and a 2.30GHz CPU; the page file reports 25677MB used, 938 available.

    I'm running MITIE with 180 examples and 24 threads. It ran for 4.5 hours and then threw an exception:

    mitie.py, line 271, in save_disk
    if(_f.mitie_save_named_entity_extractor_pure_model(filename, self._obj)!=0):
    OSError: exception: access violation reading 0x00000000000..0000030

    Also, my model_20180918-150254 folder contains only training_data.json; other files such as entity_extractor.dat, intent_classifier.dat, metadata.json, and regex_featurizer.json are not generated.

    But when I test the same setup with only 2-3 examples, everything works.

  • AttributeError: function 'mitie_extract_entities_with_extractor' not found

    Hi, when I try to run ner.py, I get the following error:

    File "C:\Users\xxxx\rasa_nlu\MITIE\examples\python\ner.py", line 15, in <module>
      from mitie import *
    File "C:\Users\xxxx\rasa_nlu\MITIE\examples\python\mitielib\mitie.py", line 61, in <module>
      _f.mitie_extract_entities_with_extractor.restype = ctypes.c_void_p
    File "C:\Program Files\Python37\lib\ctypes\__init__.py", line 369, in __getattr__
      func = self.__getitem__(name)
    File "C:\Program Files\Python37\lib\ctypes\__init__.py", line 374, in __getitem__
      func = self._FuncPtr((name_or_ordinal, self))
    AttributeError: function 'mitie_extract_entities_with_extractor' not found

    I see references to mitie_extract_entities_with_extractor in mitie.cpp and mitie.h which are in C:\Users\xxxx\rasa_nlu\MITIE\mitielib\src and C:\Users\xxxx\rasa_nlu\MITIE\mitielib\include

    Why is it not able to find the function?
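One way to narrow down an error like this is to check which functions the compiled shared library actually exports, independently of the source files. The sketch below uses only the standard ctypes module. It assumes, without confirmation from this thread, that the library being loaded may be an older build than the headers on disk; the helper names and any library path you pass in are hypothetical.

```python
import ctypes


def exports_symbol(library, name):
    """Return True if the loaded shared library exports `name`."""
    try:
        getattr(library, name)  # ctypes raises AttributeError when missing
    except AttributeError:
        return False
    return True


def missing_symbols(library_path, names):
    """Load the library at `library_path` and list names it does not export."""
    library = ctypes.CDLL(library_path)
    return [n for n in names if not exports_symbol(library, n)]
```

Calling `missing_symbols` with the path of the MITIE shared library that mitie.py loads and `["mitie_extract_entities_with_extractor"]` would show whether that particular binary contains the function. If it does not, rebuilding the shared library from the current sources is a reasonable next step.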
