A fast, scalable, high-performance library for gradient boosting on decision trees, used for ranking, classification, regression and other machine learning tasks, with APIs for Python, R, Java and C++. Supports computation on CPU and GPU.

Website | Documentation | Tutorials | Installation | Release Notes


CatBoost is a machine learning method based on gradient boosting over decision trees.

Main advantages of CatBoost:

Get Started and Documentation

All CatBoost documentation is available here.

Install CatBoost by following the installation guide for the package of your choice.

Next you may want to investigate:

If you cannot open the documentation in your browser, try adding yastatic.net and yastat.net to the list of allowed domains in Privacy Badger.

CatBoost models in production

If you want to evaluate a CatBoost model in your application, read the model API documentation.
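
A minimal Python sketch of applying a trained model from application code, assuming a model previously saved as model.cbm (the file name and the feature values here are placeholders):

    from catboost import CatBoostClassifier

    # Load a model previously saved with model.save_model('model.cbm')
    model = CatBoostClassifier()
    model.load_model('model.cbm')

    # Score one object; the feature order must match the training data
    print(model.predict_proba([[1.0, 2.0, 3.0]]))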

Questions and bug reports

Help to Make CatBoost Better

  • Check out open problems and help wanted issues to see what can be improved, or open an issue if you want something.
  • Add your stories and experience to Awesome CatBoost.
  • To contribute to CatBoost, first read the CLA text and state in your pull request that you agree to its terms. More information can be found in CONTRIBUTING.md
  • Instructions for contributors can be found here.

News

The latest news is published on Twitter.

Reference Paper

Anna Veronika Dorogush, Andrey Gulin, Gleb Gusev, Nikita Kazeev, Liudmila Ostroumova Prokhorenkova, Aleksandr Vorobev "Fighting biases with dynamic boosting". arXiv:1706.09516, 2017.

Anna Veronika Dorogush, Vasily Ershov, Andrey Gulin "CatBoost: gradient boosting with categorical features support". Workshop on ML Systems at NIPS 2017.

License

© YANDEX LLC, 2017-2019. Licensed under the Apache License, Version 2.0. See LICENSE file for more details.

Comments
  • UnicodeDecodeError: 'ascii' codec can't decode byte 0xcd in position 9: ordinal not in range(128)

    Problem: UnicodeDecodeError: 'ascii' codec can't decode byte 0xcd in position 9: ordinal not in range(128)
    catboost version: 0.25
    Operating System: Windows 10

    When I use setup.py to install CatBoost, this error occurs. Looking closely, it breaks down into two parts:

    1. Building _catboost.pyd with CUDA fails with: UnicodeDecodeError: 'ascii' codec can't decode byte 0xcd in position 9: ordinal not in range(128).
    2. Building _catboost.pyd without CUDA fails with: subprocess.CalledProcessError: Command '['D:\\anaconda3\\python.exe', 'D:\\learn\\catboost-master\\ya', 'make', 'D:\\learn\\catboost-master\\catboost\\python-package\\..\\..\\catboost\\python-package\\catboost', '--no-src-links', '--output', 'D:\\learn\\catboost-master\\catboost\\python-package\\build\\temp.win-amd64-3.8\\Release', '-DPYTHON_CONFIG=python3-config', '-DUSE_ARCADIA_PYTHON=no', '-DOS_SDK=local', '-r', '-DNO_DEBUGINFO', '-DHAVE_CUDA=no']' returned non-zero exit status 1.

    I also tried converting _catboost.pyx from GitHub to _catboost.pyd directly using 'python setup.py build_ext --inplace', but I got the same error as when installing CatBoost.

    C:\Users\王普聪>pip install -e D:\learn\catboost-master\catboost\python-package
    Obtaining file:///D:/learn/catboost-master/catboost/python-package
    Requirement already satisfied: graphviz in d:\anaconda3\lib\site-packages (from catboost==0.24.4) (0.16)
    Requirement already satisfied: plotly in d:\anaconda3\lib\site-packages (from catboost==0.24.4) (4.14.3)
    Requirement already satisfied: six in d:\anaconda3\lib\site-packages (from catboost==0.24.4) (1.15.0)
    Requirement already satisfied: matplotlib in d:\anaconda3\lib\site-packages (from catboost==0.24.4) (3.2.2)
    Requirement already satisfied: numpy>=1.16.0 in d:\anaconda3\lib\site-packages (from catboost==0.24.4) (1.18.5)
    Requirement already satisfied: pandas>=0.24 in d:\anaconda3\lib\site-packages (from catboost==0.24.4) (1.0.5)
    Requirement already satisfied: scipy in d:\anaconda3\lib\site-packages (from catboost==0.24.4) (1.5.0)
    Requirement already satisfied: retrying>=1.3.3 in d:\anaconda3\lib\site-packages (from plotly->catboost==0.24.4) (1.3.3)
    Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in d:\anaconda3\lib\site-packages (from matplotlib->catboost==0.24.4) (2.4.7)
    Requirement already satisfied: cycler>=0.10 in d:\anaconda3\lib\site-packages (from matplotlib->catboost==0.24.4) (0.10.0)
    Requirement already satisfied: kiwisolver>=1.0.1 in d:\anaconda3\lib\site-packages (from matplotlib->catboost==0.24.4) (1.2.0)
    Requirement already satisfied: python-dateutil>=2.1 in d:\anaconda3\lib\site-packages (from matplotlib->catboost==0.24.4) (2.8.1)
    Requirement already satisfied: pytz>=2017.2 in d:\anaconda3\lib\site-packages (from pandas>=0.24->catboost==0.24.4) (2020.1)
    Installing collected packages: catboost
      Running setup.py develop for catboost
        ERROR: Command errored out with exit status 1:
         command: 'D:\anaconda3\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'D:\\learn\\catboost-master\\catboost\\python-package\\setup.py'"'"'; __file__='"'"'D:\\learn\\catboost-master\\catboost\\python-package\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' develop --no-deps
             cwd: D:\learn\catboost-master\catboost\python-package\
        Complete output (159 lines):
        running develop
        15:30:22 I Targeting for CUDA support with C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1
        running egg_info
        writing catboost.egg-info\PKG-INFO
        writing dependency_links to catboost.egg-info\dependency_links.txt
        writing requirements to catboost.egg-info\requires.txt
        writing top-level names to catboost.egg-info\top_level.txt
        15:30:24 I Targeting for CUDA support with C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1
        reading manifest file 'catboost.egg-info\SOURCES.txt'
        writing manifest file 'catboost.egg-info\SOURCES.txt'
        running build_ext
        15:30:24 I Targeting for CUDA support with C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1
        15:30:24 I Buildling _catboost.pyd with ymake
        15:30:24 I EXECUTE: D:\anaconda3\python.exe D:\learn\catboost-master\ya make D:\learn\catboost-master\catboost\python-package\..\..\catboost\python-package\catboost --no-src-links --output D:\learn\catboost-master\catboost\python-package\build\temp.win-amd64-3.8\Release -DPYTHON_CONFIG=python3-config -DUSE_ARCADIA_PYTHON=no -DOS_SDK=local -r -DNO_DEBUGINFO -DHAVE_CUDA=yes "-DCUDA_ROOT=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1"
        Output root is subdirectory of Arcadia root, this may cause non-idempotent build
        Traceback (most recent call last):
          File "devtools/ya/app.py", line 422, in configure_exit_interceptor
            yield
          File "devtools/ya/app.py", line 65, in helper
            return action(args)
          File "devtools/ya/entry/entry.py", line 55, in do_main
            res = handler.handle(handler, args, prefix=['ya'])
          File "devtools/ya/core/handler.py", line 159, in handle
            return handler.handle(self, args[1:], prefix + [name])
          File "devtools/ya/core/dispatch.py", line 37, in handle
            return self.command().handle(root_handler, args, prefix)
          File "devtools/ya/core/handler.py", line 341, in handle
            return self._action(params)
          File "devtools/ya/app.py", line 92, in helper
            return action(ctx.params)
          File "devtools/ya/build/build_handler.py", line 85, in do_ya_make
            builder = ya_make.YaMake(params, app_ctx)
          File "devtools/ya/build/ya_make.py", line 895, in __init__
            self.ctx = Context(self.opts, app_ctx=app_ctx, graph=graph, tests=tests, stripped_tests=stripped_tests, configure_errors=configure_errors, make_files=make_files, lite_graph=lite_graph)
          File "devtools/ya/build/ya_make.py", line 574, in __init__
            self.graph, self.tests, self.stripped_tests, self.configure_errors, self.make_files = _build_graph_and_tests(self.opts, app_ctx)
          File "devtools/ya/build/ya_make.py", line 258, in _build_graph_and_tests
            graph, tests, stripped_tests, gh, make_files = lg.build_graph_and_tests(opts, check=True, ev_listener=ev_listener, display=display)
          File "devtools/ya/build/graph.py", line 1688, in build_graph_and_tests
            return _build_graph_and_tests(opts, check, ev_listener, exit_stack, display)
          File "devtools/ya/build/graph.py", line 1992, in _build_graph_and_tests
            real_ymake_bin = tools.tool('ymake')
          File "devtools/ya/yalibrary/tools/__init__.py", line 220, in tool
            return toolchain.find(name, with_params, for_platform, cache=cache)
          File "devtools/ya/yalibrary/tools/__init__.py", line 158, in find
            executable = cur_bottle[executable_name]  # if executable_name is None it's Ok
          File "devtools/ya/yalibrary/tools/__init__.py", line 64, in __getitem__
            path = self.resolve()
          File "devtools/ya/yalibrary/tools/__init__.py", line 46, in resolve
            return self.__fetcher.fetch_if_need(self.__formula["match"], tared, binname, cache=cache).where
          File "devtools/ya/yalibrary/fetcher/__init__.py", line 385, in fetch_if_need
            self.__c[key] = self._fetch_if_need(*args, **kwargs)
          File "devtools/ya/yalibrary/fetcher/__init__.py", line 452, in _fetch_if_need
            if self._fetch(name, tared, lambda x: name.lower() in x.lower(), binname):
          File "devtools/ya/yalibrary/fetcher/__init__.py", line 368, in _fetch
            _install(res_path, do_install)
          File "devtools/ya/yalibrary/fetcher/__init__.py", line 104, in _install
            fs_handler(install_guard)
          File "devtools/ya/yalibrary/fetcher/__init__.py", line 95, in fs_handler
            func(install_guard)
          File "devtools/ya/yalibrary/fetcher/__init__.py", line 350, in do_install
            deploy_params=(UNTAR, resource_info if resource_info else {"file_name": "FILE"}, ""))
          File "devtools/ya/yalibrary/fetcher/__init__.py", line 137, in _deploy_tool
            exts.archive.extract_from_tar(archive, extract_to)
          File "devtools/ya/exts/archive.py", line 16, in extract_from_tar
            archive.extract_tar(tar_file_path, output_dir)
          File "library/python/archive/__init__.py", line 62, in extract_tar
            output_dir = encode(output_dir, ENCODING)
          File "library/python/archive/__init__.py", line 58, in encode
            return value.encode(encoding)
        UnicodeDecodeError: 'ascii' codec can't decode byte 0xcd in position 9: ordinal not in range(128)
        15:30:37 E Cannot build _catboost.pyd with CUDA support, will build without CUDA
        15:30:37 I EXECUTE: D:\anaconda3\python.exe D:\learn\catboost-master\ya make D:\learn\catboost-master\catboost\python-package\..\..\catboost\python-package\catboost --no-src-links --output D:\learn\catboost-master\catboost\python-package\build\temp.win-amd64-3.8\Release -DPYTHON_CONFIG=python3-config -DUSE_ARCADIA_PYTHON=no -DOS_SDK=local -r -DNO_DEBUGINFO -DHAVE_CUDA=no
        Output root is subdirectory of Arcadia root, this may cause non-idempotent build
        Traceback (most recent call last):
          File "devtools/ya/app.py", line 422, in configure_exit_interceptor
            yield
          File "devtools/ya/app.py", line 65, in helper
            return action(args)
          File "devtools/ya/entry/entry.py", line 55, in do_main
            res = handler.handle(handler, args, prefix=['ya'])
          File "devtools/ya/core/handler.py", line 159, in handle
            return handler.handle(self, args[1:], prefix + [name])
          File "devtools/ya/core/dispatch.py", line 37, in handle
            return self.command().handle(root_handler, args, prefix)
          File "devtools/ya/core/handler.py", line 341, in handle
            return self._action(params)
          File "devtools/ya/app.py", line 92, in helper
            return action(ctx.params)
          File "devtools/ya/build/build_handler.py", line 85, in do_ya_make
            builder = ya_make.YaMake(params, app_ctx)
          File "devtools/ya/build/ya_make.py", line 895, in __init__
            self.ctx = Context(self.opts, app_ctx=app_ctx, graph=graph, tests=tests, stripped_tests=stripped_tests, configure_errors=configure_errors, make_files=make_files, lite_graph=lite_graph)
          File "devtools/ya/build/ya_make.py", line 574, in __init__
            self.graph, self.tests, self.stripped_tests, self.configure_errors, self.make_files = _build_graph_and_tests(self.opts, app_ctx)
          File "devtools/ya/build/ya_make.py", line 258, in _build_graph_and_tests
            graph, tests, stripped_tests, gh, make_files = lg.build_graph_and_tests(opts, check=True, ev_listener=ev_listener, display=display)
          File "devtools/ya/build/graph.py", line 1688, in build_graph_and_tests
            return _build_graph_and_tests(opts, check, ev_listener, exit_stack, display)
          File "devtools/ya/build/graph.py", line 1992, in _build_graph_and_tests
            real_ymake_bin = tools.tool('ymake')
          File "devtools/ya/yalibrary/tools/__init__.py", line 220, in tool
            return toolchain.find(name, with_params, for_platform, cache=cache)
          File "devtools/ya/yalibrary/tools/__init__.py", line 158, in find
            executable = cur_bottle[executable_name]  # if executable_name is None it's Ok
          File "devtools/ya/yalibrary/tools/__init__.py", line 64, in __getitem__
            path = self.resolve()
          File "devtools/ya/yalibrary/tools/__init__.py", line 46, in resolve
            return self.__fetcher.fetch_if_need(self.__formula["match"], tared, binname, cache=cache).where
          File "devtools/ya/yalibrary/fetcher/__init__.py", line 385, in fetch_if_need
            self.__c[key] = self._fetch_if_need(*args, **kwargs)
          File "devtools/ya/yalibrary/fetcher/__init__.py", line 452, in _fetch_if_need
            if self._fetch(name, tared, lambda x: name.lower() in x.lower(), binname):
          File "devtools/ya/yalibrary/fetcher/__init__.py", line 368, in _fetch
            _install(res_path, do_install)
          File "devtools/ya/yalibrary/fetcher/__init__.py", line 104, in _install
            fs_handler(install_guard)
          File "devtools/ya/yalibrary/fetcher/__init__.py", line 95, in fs_handler
            func(install_guard)
          File "devtools/ya/yalibrary/fetcher/__init__.py", line 350, in do_install
            deploy_params=(UNTAR, resource_info if resource_info else {"file_name": "FILE"}, ""))
          File "devtools/ya/yalibrary/fetcher/__init__.py", line 137, in _deploy_tool
            exts.archive.extract_from_tar(archive, extract_to)
          File "devtools/ya/exts/archive.py", line 16, in extract_from_tar
            archive.extract_tar(tar_file_path, output_dir)
          File "library/python/archive/__init__.py", line 62, in extract_tar
            output_dir = encode(output_dir, ENCODING)
          File "library/python/archive/__init__.py", line 58, in encode
            return value.encode(encoding)
        UnicodeDecodeError: 'ascii' codec can't decode byte 0xcd in position 9: ordinal not in range(128)
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
          File "D:\learn\catboost-master\catboost\python-package\setup.py", line 259, in <module>
            setup(
          File "D:\anaconda3\lib\site-packages\setuptools\__init__.py", line 153, in setup
            return distutils.core.setup(**attrs)
          File "D:\anaconda3\lib\distutils\core.py", line 148, in setup
            dist.run_commands()
          File "D:\anaconda3\lib\distutils\dist.py", line 966, in run_commands
            self.run_command(cmd)
          File "D:\anaconda3\lib\distutils\dist.py", line 985, in run_command
            cmd_obj.run()
          File "D:\anaconda3\lib\site-packages\setuptools\command\develop.py", line 34, in run
            self.install_for_development()
          File "D:\anaconda3\lib\site-packages\setuptools\command\develop.py", line 136, in install_for_development
            self.run_command('build_ext')
          File "D:\anaconda3\lib\distutils\cmd.py", line 313, in run_command
            self.distribution.run_command(command)
          File "D:\anaconda3\lib\distutils\dist.py", line 985, in run_command
            cmd_obj.run()
          File "D:\learn\catboost-master\catboost\python-package\setup.py", line 186, in run
            self.build_with_ymake(topsrc_dir, build_dir, catboost_ext, put_dir, verbose, dry_run)
          File "D:\learn\catboost-master\catboost\python-package\setup.py", line 219, in build_with_ymake
            logging_execute(ymake_cmd + ['-DHAVE_CUDA=no'], verbose, dry_run)
          File "D:\learn\catboost-master\catboost\python-package\setup.py", line 62, in logging_execute
            subprocess.check_call(cmd, universal_newlines=True)
          File "D:\anaconda3\lib\subprocess.py", line 364, in check_call
            raise CalledProcessError(retcode, cmd)
        subprocess.CalledProcessError: Command '['D:\\anaconda3\\python.exe', 'D:\\learn\\catboost-master\\ya', 'make', 'D:\\learn\\catboost-master\\catboost\\python-package\\..\\..\\catboost\\python-package\\catboost', '--no-src-links', '--output', 'D:\\learn\\catboost-master\\catboost\\python-package\\build\\temp.win-amd64-3.8\\Release', '-DPYTHON_CONFIG=python3-config', '-DUSE_ARCADIA_PYTHON=no', '-DOS_SDK=local', '-r', '-DNO_DEBUGINFO', '-DHAVE_CUDA=no']' returned non-zero exit status 1.
        ----------------------------------------
    ERROR: Command errored out with exit status 1: 'D:\anaconda3\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'D:\\learn\\catboost-master\\catboost\\python-package\\setup.py'"'"'; __file__='"'"'D:\\learn\\catboost-master\\catboost\\python-package\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' develop --no-deps Check the logs for full command output.
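
    The failing frame encodes the tool-extraction path with the 'ascii' codec, and the ya tool cache normally lives under the user profile (here C:\Users\王普聪). A minimal sketch of the assumed root cause, not a confirmed diagnosis: the GBK encoding of the user name puts byte 0xcd at exactly position 9 of the path.

    # Hypothetical reproduction of the failing step: 'C:\Users\' is 9 bytes,
    # so the first byte of 王 (0xCD in GBK) sits at position 9.
    path_bytes = "C:\\Users\\王普聪".encode("gbk")
    assert path_bytes[9] == 0xCD
    try:
        path_bytes.decode("ascii")
    except UnicodeDecodeError as e:
        print(e)  # 'ascii' codec can't decode byte 0xcd in position 9: ...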
    
  • User description is used by default. Move metric creation to corresponding class factories.

    Each metric now uses the user-specified parameters in its description by default.

    Design

    TMetric now stores a TMap<TString, TString> of user parameters, which are used to construct a metric description (e.g. MetricName:key1=value1;key2=value2). This implementation is defined in the base class and is now the default behaviour for building metric descriptions.

    Some specific GetDescription implementations are kept in order to stay consistent with the existing behaviour.

    Note

    UserQuerywiseMetric now uses the options in its representation as well.
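
    For illustration, a Python mirror of the description format described above (the actual implementation is the C++ base-class method on TMetric; the helper name here is hypothetical):

    def build_metric_description(name, user_params):
        # Mirrors the scheme MetricName:key1=value1;key2=value2 with
        # ';'-separated user-specified parameters after a ':'.
        if not user_params:
            return name
        pairs = ';'.join('{}={}'.format(k, v) for k, v in sorted(user_params.items()))
        return '{}:{}'.format(name, pairs)

    print(build_metric_description('Quantile', {'alpha': 0.7}))  # Quantile:alpha=0.7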

  • Sum of SHAP values does not equal the prediction

    Problem: the sum of SHAP values does not equal the prediction
    catboost version: 0.18.1
    Operating System: Ubuntu 19.10
    CPU: i7-8565U

    It only happens sometimes, but we find that the sum of SHAP values does not equal the prediction. Please let us know how we can provide further information.
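
    For reference, a minimal self-check of the expected invariant (the last column returned for ShapValues is the expected value, and each row should sum to the raw prediction); the data here is synthetic:

    import numpy as np
    from catboost import CatBoostRegressor, Pool

    rng = np.random.RandomState(0)
    X, y = rng.rand(200, 5), rng.rand(200)
    pool = Pool(X, y)
    model = CatBoostRegressor(iterations=50, verbose=False).fit(pool)

    shap = model.get_feature_importance(pool, type='ShapValues')
    raw = model.predict(pool, prediction_type='RawFormulaVal')
    # Per-feature contributions plus the expected value (last column)
    # should sum to the raw prediction for every object.
    print(np.allclose(shap.sum(axis=1), raw))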

  • How does CatBoost handle big data?

    Hi! I'm trying to use CatBoost in a Kaggle competition: https://www.kaggle.com/c/talkingdata-adtracking-fraud-detection. My train set is about 40M rows with 14 features. When I try to train the model, the kernel always dies without any errors...
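
    One common way to reduce peak memory in this situation (a sketch, not an official recommendation; the file names and the column-description contents are assumptions) is to let CatBoost read the training file directly instead of materializing it through pandas first:

    from catboost import CatBoostClassifier, Pool

    # 'train.tsv' and 'train.cd' are placeholder names; the .cd file maps
    # column indices to roles with lines like "0\tLabel" and "3\tCateg".
    train = Pool('train.tsv', column_description='train.cd', delimiter='\t')

    model = CatBoostClassifier(iterations=100, verbose=False)
    model.fit(train)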

  • Unknown class labels

    I'm a beginner with boosting models and I'm trying to implement CatBoost. My input data has 6 categorical features and 2 numerical features. My target variable is numerical. I'm running on GPU. I'm facing the problem below, please help me. I cannot share the data due to privacy issues.

    Traceback (most recent call last):
      File "/work/ilt/css8222/cat_boost/cat_boost.py", line 127, in <module>
        save_snapshot = True
      File "/fibus/fs2/15/css8222/.local/lib/python3.6/site-packages/catboost/core.py", line 4718, in fit
        silent, early_stopping_rounds, save_snapshot, snapshot_file, snapshot_interval, init_model, callbacks, log_cout, log_cerr)
      File "/fibus/fs2/15/css8222/.local/lib/python3.6/site-packages/catboost/core.py", line 2042, in _fit
        train_params["init_model"]
      File "/fibus/fs2/15/css8222/.local/lib/python3.6/site-packages/catboost/core.py", line 1464, in _train
        self._object._train(train_pool, test_pool, params, allow_clear_pool, init_model._object if init_model else None)
      File "_catboost.pyx", line 4393, in _catboost._CatBoost._train
      File "_catboost.pyx", line 4442, in _catboost._CatBoost._train
    _catboost.CatBoostError: catboost/private/libs/target/target_converter.cpp:226: Unknown class label: "14289"
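
    The last frame suggests the eval set contains a label ("14289") that never occurs in the training targets, which classification does not allow. A minimal sketch that likely reproduces the same class of error (the data is made up):

    from catboost import CatBoostClassifier

    X_train, y_train = [[0.0], [1.0], [2.0], [3.0]], [0, 1, 0, 1]
    X_val, y_val = [[4.0]], [2]  # class 2 never seen in training

    model = CatBoostClassifier(iterations=10, verbose=False)
    # Expected to raise CatBoostError: Unknown class label: "2",
    # because eval_set labels must be a subset of the training labels.
    model.fit(X_train, y_train, eval_set=(X_val, y_val))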

  • Faster SHAP values for small batches

    For small batches, use direct SHAP values calculation. The direct algorithm (without precalculation) is faster when DocumentsNumber < MeanLeafCount, because the preprocessing step computes SHAP values for MeanLeafCount documents.

    (algorithm from https://arxiv.org/abs/1802.03888)

    With preprocessing, the final complexity was O(NT(D+F)) + O(TL^2 D^2), where N is the number of documents (objects), T the number of trees, D the average tree depth, F the average number of features per tree, and L the average number of leaves per tree. But if the batch is small we can use the direct algorithm with complexity O(NTLD^2), which is better when N < L.

    Example: on the gisette dataset (https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) with the first 100 features, train CatBoostRegressor(iterations=500, depth=6, random_seed=42) and then use get_feature_importance to compute SHAP values for the first object of the test set.

    Old:

    • 0.32 s

    New:

    • shap_mode="Auto" or "NoPreCalc" - 0.015 s
    • shap_mode="UsePreCalc" - 0.32 s (this is like it was before)

    I hereby agree to the terms of the CLA available at: link

  • Tutorial for ranking modes in CatBoost

    Hello.

    Looks like the current version of CatBoost supports learning to rank. There are some clues about it in the documentation, but I couldn't find any minimal working examples. I wonder which methods should be considered a baseline approach, and what the prerequisites are.

    Should we use YetiRank as the training metric and just provide a query id as the Pool group_id parameter? What other CatBoost parameters should be taken into account specifically for a ranking problem?

    Thank you!
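
    A minimal sketch of the baseline the question describes, i.e. YetiRank as the loss plus a query id passed as the Pool group_id parameter (synthetic data; objects of the same group must be contiguous, hence the sort):

    import numpy as np
    from catboost import CatBoost, Pool

    rng = np.random.RandomState(0)
    X, y = rng.rand(100, 10), rng.rand(100)
    queries = np.sort(rng.randint(0, 20, 100))  # one query id per object

    train = Pool(X, y, group_id=queries)
    model = CatBoost({'loss_function': 'YetiRank', 'iterations': 100, 'verbose': False})
    model.fit(train)
    scores = model.predict(train)  # relevance scores, ranked within each group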

  • GPU yields worse metric than CPU

    Problem: various measurements become worse when I switch from CPU to GPU
    catboost version: 0.22
    Operating System: Linux 4.4.0-1100-aws x86_64
    CPU: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz

    GPU: Tesla M60

    I wanted to reduce the training time, so I set 'task_type' to 'GPU'. I immediately noticed that the metrics got worse. The only change I made was setting task_type to GPU; the rest is the same.

    The training dataset has 1.2M rows and 218 columns. Among these 218 columns, 42 are categorical features. The rest are floats or integers, no text features. The validation dataset has 120K rows.

    The following are the parameters for the CPU version: {'nan_mode': 'Min', 'eval_metric': 'Logloss', 'combinations_ctr': ['Borders:CtrBorderCount=15:CtrBorderType=Uniform:TargetBorderCount=1:TargetBorderType=MinEntropy:Prior=0/1:Prior=0.5/1:Prior=1/1', 'Counter:CtrBorderCount=15:CtrBorderType=Uniform:Prior=0/1'], 'iterations': 1000, 'sampling_frequency': 'PerTree', 'fold_permutation_block': 0, 'leaf_estimation_method': 'Newton', 'od_pval': 0, 'counter_calc_method': 'SkipTest', 'grow_policy': 'SymmetricTree', 'boosting_type': 'Plain', 'model_shrink_mode': 'Constant', 'feature_border_type': 'GreedyLogSum', 'ctr_leaf_count_limit': 18446744073709551615, 'bayesian_matrix_reg': 0.10000000149011612, 'one_hot_max_size': 2, 'l2_leaf_reg': 3, 'random_strength': 1, 'od_type': 'Iter', 'rsm': 1, 'boost_from_average': False, 'max_ctr_complexity': 4, 'model_size_reg': 0.5, 'simple_ctr': ['Borders:CtrBorderCount=15:CtrBorderType=Uniform:TargetBorderCount=1:TargetBorderType=MinEntropy:Prior=0/1:Prior=0.5/1:Prior=1/1', 'Counter:CtrBorderCount=15:CtrBorderType=Uniform:Prior=0/1'], 'subsample': 0.800000011920929, 'use_best_model': True, 'od_wait': 35, 'class_names': [0, 1], 'random_seed': 42, 'depth': 6, 'ctr_target_border_count': 1, 'has_time': False, 'store_all_simple_ctr': False, 'border_count': 254, 'classes_count': 0, 'sparse_features_conflict_fraction': 0, 'leaf_estimation_backtracking': 'AnyImprovement', 'best_model_min_trees': 1, 'model_shrink_rate': 0, 'min_data_in_leaf': 1, 'loss_function': 'Logloss', 'learning_rate': 0.30000001192092896, 'score_function': 'Cosine', 'task_type': 'CPU', 'leaf_estimation_iterations': 10, 'bootstrap_type': 'MVS', 'max_leaves': 64, 'permutation_count': 4}

    The following are the parameters for the GPU version: {'nan_mode': 'Min', 'gpu_ram_part': 0.95, 'eval_metric': 'Logloss', 'combinations_ctr': ['Borders:CtrBorderCount=15:CtrBorderType=Uniform:TargetBorderCount=1:TargetBorderType=MinEntropy:Prior=0/1:Prior=0.5/1:Prior=1/1', 'FeatureFreq:CtrBorderCount=15:CtrBorderType=Median:Prior=0/1'], 'iterations': 1000, 'fold_permutation_block': 64, 'leaf_estimation_method': 'Newton', 'observations_to_bootstrap': 'TestOnly', 'od_pval': 0, 'counter_calc_method': 'SkipTest', 'grow_policy': 'SymmetricTree', 'boosting_type': 'Plain', 'ctr_history_unit': 'Sample', 'feature_border_type': 'GreedyLogSum', 'bayesian_matrix_reg': 0.10000000149011612, 'one_hot_max_size': 2, 'devices': '-1', 'pinned_memory_bytes': '104857600', 'l2_leaf_reg': 3, 'random_strength': 1, 'od_type': 'Iter', 'rsm': 1, 'boost_from_average': False, 'fold_size_loss_normalization': False, 'max_ctr_complexity': 4, 'gpu_cat_features_storage': 'GpuRam', 'simple_ctr': ['Borders:CtrBorderCount=15:CtrBorderType=Uniform:TargetBorderCount=1:TargetBorderType=MinEntropy:Prior=0/1:Prior=0.5/1:Prior=1/1', 'FeatureFreq:CtrBorderCount=15:CtrBorderType=MinEntropy:Prior=0/1'], 'use_best_model': True, 'od_wait': 35, 'class_names': [0, 1], 'random_seed': 42, 'depth': 6, 'ctr_target_border_count': 1, 'has_time': False, 'border_count': 128, 'min_fold_size': 100, 'data_partition': 'FeatureParallel', 'bagging_temperature': 1, 'classes_count': 0, 'leaf_estimation_backtracking': 'AnyImprovement', 'best_model_min_trees': 1, 'min_data_in_leaf': 1, 'add_ridge_penalty_to_loss_function': False, 'loss_function': 'Logloss', 'learning_rate': 0.30000001192092896, 'score_function': 'Cosine', 'task_type': 'GPU', 'leaf_estimation_iterations': 10, 'bootstrap_type': 'Bayesian', 'max_leaves': 64, 'permutation_count': 4}
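
    For what it's worth, the two dumps above diverge on several defaults, not just task_type: border_count (254 vs 128), bootstrap_type (MVS vs Bayesian), and Counter vs FeatureFreq ctrs. A sketch that pins the parameter that can be aligned directly (whether the other CPU defaults are supported on GPU in 0.22 is an assumption to verify):

    from catboost import CatBoostClassifier

    model = CatBoostClassifier(
        task_type='GPU',
        border_count=254,  # CPU default; the GPU run above used 128
        iterations=1000,
        depth=6,
        learning_rate=0.3,
        eval_metric='Logloss',
        random_seed=42,
    )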

  • Flag not copied unnecessarily with blank and whitespace

    Before submitting a pull request, please do the following steps:

    1. Read instructions for contributors here.
    2. Run ya make in catboost folder to make sure the code builds.
    3. Add tests that test your change.
    4. Run tests using ya make -t -A command.
    5. If you haven't already, complete the CLA.

    I hereby agree to the terms of the CLA available at https://yandex.ru/legal/cla/?lang=en.

  • Issue trying to compile with specified gcc version

    I'm trying to compile the CatBoost Python wheel on my system. The default gcc version I have is 8, but I also have 7 installed, so I'm trying to use that by setting the CC and CXX environment variables. However, when running:

    python mk_wheel.py -DCUDA_ROOT="/opt/cuda"
    

    I get the message:

    Info: Attention! Using system user-defined compiler: g++-7 (check CC and CXX env vars).
    Cross compilation with system CXX is not supported
    

    catboost version: git master
    Operating System: Linux
    CPU: i7
    GPU: GTX 1080

    Thanks!

  • error: no such file or directory: 'PYTHON-NOT-FOUND' ubuntu

    I am installing CatBoost for Python 3 and get the following error: error: no such file or directory: 'PYTHON-NOT-FOUND'

    Operating System: Ubuntu 16.04
    Python: 3.6.4 (Anaconda)
    CUDA: 8
    GPU: GeForce GTX 480
    Arch: amd64

    CatBoost was compiled as follows:

    find ./ -type f -name ya.make -exec sed -i 's/compute_30/compute_20/g' {} +

    then

    sudo ../../../ya make -r -DUSE_ARCADIA_PYTHON=no -DPYTHON_CONFIG=python3-config -DCUDA_ROOT=/usr/local/cuda-8.0

    ending with error error: no such file or directory: 'PYTHON-NOT-FOUND'

    Note: with -DPYTHON_CONFIG=python2-config the compilation succeeded, but the resulting library couldn't be imported from a Python 3.6 script.

  • [Spark] Prediction results differ between non-Spark CatBoostClassifier and Spark CatBoostClassifier

    Problem:

    • We are comparing the binary models trained by non-spark CatBoost Classifier and Spark CatBoostClassifier
    • Features are all float32 numbers, with the same values for both models
    • Labels are float type, 2 values (0.0 and 1.0)
    • We are using Python PySpark CatBoost Classifier
    • Implementation in non-spark CatBoost Classifier
    model = CatBoostClassifier(
        allow_writing_files=False, class_weights=None, od_wait=200, od_type="Iter"
    )
    model.fit(
        x_train,
        y_train,
        eval_set=(x_validation, y_validation),
        cat_features=None,
    )
    
    • Implementation in PySpark CatBoost Classifier
    classifier = catboost_spark.CatBoostClassifier(
        allowWritingFiles=False,
        odWait=200,
        odType=catboost_spark.EOverfittingDetectorType.Iter,
    )
    
    df_train_mllib = self.__create_vectors(dataset_pd=dataset_train_pd)
    df_validation_mllib = self.__create_vectors(dataset_pd=dataset_validation_pd)
    
    train_pool = catboost_spark.Pool(df_train_mllib)
    eval_pool = catboost_spark.Pool(df_validation_mllib)
    
    model = classifier.fit(train_pool, [eval_pool])
    

    To reiterate: the train pool and the eval pool are the same for both models.

    • When we printed out the iterations, we could see the difference between the two models:

      • Spark Classifier
      0:	learn: 0.6912789	test: 0.6935195	best: 0.6935195 (0)	total: 55.8ms	remaining: 55.8s
      1:	learn: 0.6894027	test: 0.6923814	best: 0.6923814 (1)	total: 102ms	remaining: 50.9s
      2:	learn: 0.6875403	test: 0.6911004	best: 0.6911004 (2)	total: 148ms	remaining: 49.1s
      3:	learn: 0.6856659	test: 0.6908598	best: 0.6908598 (3)	total: 190ms	remaining: 47.2s
      4:	learn: 0.6837205	test: 0.6894618	best: 0.6894618 (4)	total: 242ms	remaining: 48.1s
      5:	learn: 0.6823006	test: 0.6893671	best: 0.6893671 (5)	total: 283ms	remaining: 46.9s
      6:	learn: 0.6807997	test: 0.6884971	best: 0.6884971 (6)	total: 325ms	remaining: 46s
      7:	learn: 0.6790929	test: 0.6892320	best: 0.6884971 (6)	total: 366ms	remaining: 45.4s
      8:	learn: 0.6777336	test: 0.6883773	best: 0.6883773 (8)	total: 408ms	remaining: 44.9s
      9:	learn: 0.6763152	test: 0.6877831	best: 0.6877831 (9)	total: 449ms	remaining: 44.5s
      10:	learn: 0.6754351	test: 0.6871970	best: 0.6871970 (10)	total: 491ms	remaining: 44.1s
      11:	learn: 0.6740250	test: 0.6876101	best: 0.6871970 (10)	total: 531ms	remaining: 43.7s
      12:	learn: 0.6725701	test: 0.6869153	best: 0.6869153 (12)	total: 571ms	remaining: 43.3s
      13:	learn: 0.6714343	test: 0.6877582	best: 0.6869153 (12)	total: 616ms	remaining: 43.4s
      14:	learn: 0.6702436	test: 0.6877021	best: 0.6869153 (12)	total: 663ms	remaining: 43.5s
      15:	learn: 0.6688952	test: 0.6873304	best: 0.6869153 (12)	total: 705ms	remaining: 43.4s
      16:	learn: 0.6675385	test: 0.6871014	best: 0.6869153 (12)	total: 750ms	remaining: 43.4s
      17:	learn: 0.6664372	test: 0.6870376	best: 0.6869153 (12)	total: 791ms	remaining: 43.2s
      18:	learn: 0.6654951	test: 0.6867115	best: 0.6867115 (18)	total: 832ms	remaining: 42.9s
      19:	learn: 0.6640508	test: 0.6868121	best: 0.6867115 (18)	total: 878ms	remaining: 43s
      20:	learn: 0.6629780	test: 0.6863598	best: 0.6863598 (20)	total: 921ms	remaining: 42.9s
      21:	learn: 0.6616907	test: 0.6861838	best: 0.6861838 (21)	total: 961ms	remaining: 42.7s
      22:	learn: 0.6606452	test: 0.6839722	best: 0.6839722 (22)	total: 1s	remaining: 42.7s
      23:	learn: 0.6594586	test: 0.6844369	best: 0.6839722 (22)	total: 1.05s	remaining: 42.6s
      24:	learn: 0.6581857	test: 0.6805598	best: 0.6805598 (24)	total: 1.09s	remaining: 42.6s
      25:	learn: 0.6572553	test: 0.6803274	best: 0.6803274 (25)	total: 1.14s	remaining: 42.6s
      26:	learn: 0.6564329	test: 0.6798682	best: 0.6798682 (26)	total: 1.18s	remaining: 42.6s
      27:	learn: 0.6555215	test: 0.6791039	best: 0.6791039 (27)	total: 1.23s	remaining: 42.5s
      28:	learn: 0.6546388	test: 0.6788199	best: 0.6788199 (28)	total: 1.27s	remaining: 42.7s
      29:	learn: 0.6539215	test: 0.6787342	best: 0.6787342 (29)	total: 1.32s	remaining: 42.7s
      30:	learn: 0.6529801	test: 0.6803639	best: 0.6787342 (29)	total: 1.37s	remaining: 42.9s
      31:	learn: 0.6521993	test: 0.6795314	best: 0.6787342 (29)	total: 1.41s	remaining: 42.8s
      32:	learn: 0.6512033	test: 0.6790517	best: 0.6787342 (29)	total: 1.46s	remaining: 42.7s
      33:	learn: 0.6503687	test: 0.6787654	best: 0.6787342 (29)	total: 1.5s	remaining: 42.6s
      34:	learn: 0.6494875	test: 0.6778926	best: 0.6778926 (34)	total: 1.54s	remaining: 42.5s
      35:	learn: 0.6485735	test: 0.6780201	best: 0.6778926 (34)	total: 1.58s	remaining: 42.4s
      36:	learn: 0.6478175	test: 0.6751499	best: 0.6751499 (36)	total: 1.62s	remaining: 42.3s
      37:	learn: 0.6467330	test: 0.6768365	best: 0.6751499 (36)	total: 1.67s	remaining: 42.3s
      38:	learn: 0.6458817	test: 0.6767081	best: 0.6751499 (36)	total: 1.71s	remaining: 42.3s
      39:	learn: 0.6450321	test: 0.6764039	best: 0.6751499 (36)	total: 1.76s	remaining: 42.2s
      40:	learn: 0.6443497	test: 0.6760467	best: 0.6751499 (36)	total: 1.8s	remaining: 42.2s
      41:	learn: 0.6437456	test: 0.6760822	best: 0.6751499 (36)	total: 1.85s	remaining: 42.1s
      42:	learn: 0.6430526	test: 0.6765914	best: 0.6751499 (36)	total: 1.89s	remaining: 42.1s
      43:	learn: 0.6422143	test: 0.6759258	best: 0.6751499 (36)	total: 1.93s	remaining: 42s
      44:	learn: 0.6413319	test: 0.6758742	best: 0.6751499 (36)	total: 1.97s	remaining: 41.9s
      45:	learn: 0.6403083	test: 0.6750221	best: 0.6750221 (45)	total: 2.02s	remaining: 41.8s
      46:	learn: 0.6395981	test: 0.6749273	best: 0.6749273 (46)	total: 2.06s	remaining: 41.8s
      47:	learn: 0.6388226	test: 0.6746229	best: 0.6746229 (47)	total: 2.1s	remaining: 41.7s
      48:	learn: 0.6380817	test: 0.6736904	best: 0.6736904 (48)	total: 2.14s	remaining: 41.6s
      49:	learn: 0.6373135	test: 0.6740067	best: 0.6736904 (48)	total: 2.19s	remaining: 41.5s
      50:	learn: 0.6363781	test: 0.6735399	best: 0.6735399 (50)	total: 2.23s	remaining: 41.6s
      51:	learn: 0.6354556	test: 0.6737733	best: 0.6735399 (50)	total: 2.28s	remaining: 41.6s
      52:	learn: 0.6344730	test: 0.6734196	best: 0.6734196 (52)	total: 2.32s	remaining: 41.5s
      53:	learn: 0.6339920	test: 0.6715583	best: 0.6715583 (53)	total: 2.37s	remaining: 41.4s
      54:	learn: 0.6333584	test: 0.6708338	best: 0.6708338 (54)	total: 2.41s	remaining: 41.4s
      55:	learn: 0.6326832	test: 0.6714800	best: 0.6708338 (54)	total: 2.45s	remaining: 41.3s
      56:	learn: 0.6318443	test: 0.6718966	best: 0.6708338 (54)	total: 2.49s	remaining: 41.3s
      57:	learn: 0.6312278	test: 0.6721472	best: 0.6708338 (54)	total: 2.54s	remaining: 41.3s
      58:	learn: 0.6302898	test: 0.6729916	best: 0.6708338 (54)	total: 2.58s	remaining: 41.2s
      59:	learn: 0.6296935	test: 0.6728686	best: 0.6708338 (54)	total: 2.62s	remaining: 41.1s
      60:	learn: 0.6289042	test: 0.6730011	best: 0.6708338 (54)	total: 2.67s	remaining: 41.1s
      61:	learn: 0.6279936	test: 0.6727784	best: 0.6708338 (54)	total: 2.72s	remaining: 41.1s
      62:	learn: 0.6271193	test: 0.6724544	best: 0.6708338 (54)	total: 2.76s	remaining: 41.1s
      63:	learn: 0.6263708	test: 0.6741395	best: 0.6708338 (54)	total: 2.81s	remaining: 41.2s
      64:	learn: 0.6255529	test: 0.6751161	best: 0.6708338 (54)	total: 2.85s	remaining: 41.1s
      65:	learn: 0.6246584	test: 0.6764730	best: 0.6708338 (54)	total: 2.9s	remaining: 41s
      66:	learn: 0.6237307	test: 0.6783710	best: 0.6708338 (54)	total: 2.94s	remaining: 41s
      67:	learn: 0.6228742	test: 0.6785862	best: 0.6708338 (54)	total: 2.99s	remaining: 41s
      68:	learn: 0.6223962	test: 0.6789516	best: 0.6708338 (54)	total: 3.03s	remaining: 40.9s
      69:	learn: 0.6217433	test: 0.6810806	best: 0.6708338 (54)	total: 3.09s	remaining: 41s
      70:	learn: 0.6211139	test: 0.6817986	best: 0.6708338 (54)	total: 3.13s	remaining: 40.9s
      71:	learn: 0.6201478	test: 0.6819012	best: 0.6708338 (54)	total: 3.17s	remaining: 40.9s
      72:	learn: 0.6192426	test: 0.6799481	best: 0.6708338 (54)	total: 3.21s	remaining: 40.8s
      73:	learn: 0.6186729	test: 0.6797435	best: 0.6708338 (54)	total: 3.25s	remaining: 40.7s
      74:	learn: 0.6181005	test: 0.6823492	best: 0.6708338 (54)	total: 3.3s	remaining: 40.7s
      75:	learn: 0.6174807	test: 0.6823164	best: 0.6708338 (54)	total: 3.35s	remaining: 40.7s
      76:	learn: 0.6167961	test: 0.6821867	best: 0.6708338 (54)	total: 3.39s	remaining: 40.7s
      77:	learn: 0.6164334	test: 0.6820092	best: 0.6708338 (54)	total: 3.44s	remaining: 40.7s
      78:	learn: 0.6159383	test: 0.6823203	best: 0.6708338 (54)	total: 3.49s	remaining: 40.6s
      79:	learn: 0.6150851	test: 0.6820539	best: 0.6708338 (54)	total: 3.53s	remaining: 40.6s
      80:	learn: 0.6140526	test: 0.6765233	best: 0.6708338 (54)	total: 3.57s	remaining: 40.6s
      81:	learn: 0.6132080	test: 0.6757866	best: 0.6708338 (54)	total: 3.62s	remaining: 40.5s
      82:	learn: 0.6125713	test: 0.6759641	best: 0.6708338 (54)	total: 3.66s	remaining: 40.4s
      83:	learn: 0.6119246	test: 0.6767981	best: 0.6708338 (54)	total: 3.71s	remaining: 40.5s
      84:	learn: 0.6113541	test: 0.6768011	best: 0.6708338 (54)	total: 3.75s	remaining: 40.4s
      85:	learn: 0.6106838	test: 0.6772000	best: 0.6708338 (54)	total: 3.79s	remaining: 40.3s
      86:	learn: 0.6102525	test: 0.6769882	best: 0.6708338 (54)	total: 3.85s	remaining: 40.4s
      87:	learn: 0.6097658	test: 0.6756917	best: 0.6708338 (54)	total: 3.88s	remaining: 40.3s
      88:	learn: 0.6091941	test: 0.6759529	best: 0.6708338 (54)	total: 3.93s	remaining: 40.2s
      89:	learn: 0.6083162	test: 0.6767683	best: 0.6708338 (54)	total: 3.97s	remaining: 40.1s
      90:	learn: 0.6074754	test: 0.6760534	best: 0.6708338 (54)	total: 4.01s	remaining: 40.1s
      91:	learn: 0.6067836	test: 0.6762996	best: 0.6708338 (54)	total: 4.06s	remaining: 40s
      92:	learn: 0.6060776	test: 0.6799431	best: 0.6708338 (54)	total: 4.11s	remaining: 40s
      93:	learn: 0.6053025	test: 0.6789991	best: 0.6708338 (54)	total: 4.15s	remaining: 40s
      94:	learn: 0.6045803	test: 0.6786682	best: 0.6708338 (54)	total: 4.19s	remaining: 39.9s
      95:	learn: 0.6038557	test: 0.6788973	best: 0.6708338 (54)	total: 4.26s	remaining: 40.1s
      96:	learn: 0.6032559	test: 0.6784953	best: 0.6708338 (54)	total: 4.3s	remaining: 40s
      97:	learn: 0.6028120	test: 0.6783369	best: 0.6708338 (54)	total: 4.36s	remaining: 40.1s
      98:	learn: 0.6020351	test: 0.6782910	best: 0.6708338 (54)	total: 4.4s	remaining: 40s
      99:	learn: 0.6014072	test: 0.6781848	best: 0.6708338 (54)	total: 4.44s	remaining: 40s
      100:	learn: 0.6008861	test: 0.6785125	best: 0.6708338 (54)	total: 4.49s	remaining: 39.9s
      101:	learn: 0.6004698	test: 0.6786498	best: 0.6708338 (54)	total: 4.53s	remaining: 39.9s
      102:	learn: 0.5998648	test: 0.6766817	best: 0.6708338 (54)	total: 4.57s	remaining: 39.8s
      103:	learn: 0.5992708	test: 0.6768716	best: 0.6708338 (54)	total: 4.63s	remaining: 39.9s
      104:	learn: 0.5984763	test: 0.6766307	best: 0.6708338 (54)	total: 4.67s	remaining: 39.8s
      105:	learn: 0.5977721	test: 0.6764687	best: 0.6708338 (54)	total: 4.71s	remaining: 39.7s
      106:	learn: 0.5972141	test: 0.6823747	best: 0.6708338 (54)	total: 4.75s	remaining: 39.7s
      107:	learn: 0.5967131	test: 0.6814171	best: 0.6708338 (54)	total: 4.79s	remaining: 39.6s
      108:	learn: 0.5959124	test: 0.6817456	best: 0.6708338 (54)	total: 4.84s	remaining: 39.6s
      109:	learn: 0.5953595	test: 0.6820769	best: 0.6708338 (54)	total: 4.88s	remaining: 39.5s
      110:	learn: 0.5946695	test: 0.6823466	best: 0.6708338 (54)	total: 4.93s	remaining: 39.5s
      111:	learn: 0.5939315	test: 0.6789164	best: 0.6708338 (54)	total: 4.97s	remaining: 39.4s
      112:	learn: 0.5935362	test: 0.6786802	best: 0.6708338 (54)	total: 5.01s	remaining: 39.4s
      113:	learn: 0.5931459	test: 0.6786031	best: 0.6708338 (54)	total: 5.06s	remaining: 39.3s
      114:	learn: 0.5923800	test: 0.6790277	best: 0.6708338 (54)	total: 5.11s	remaining: 39.3s
      115:	learn: 0.5919186	test: 0.6788697	best: 0.6708338 (54)	total: 5.15s	remaining: 39.3s
      116:	learn: 0.5911993	test: 0.6790558	best: 0.6708338 (54)	total: 5.19s	remaining: 39.2s
      117:	learn: 0.5905563	test: 0.6784888	best: 0.6708338 (54)	total: 5.24s	remaining: 39.1s
      118:	learn: 0.5899400	test: 0.6777929	best: 0.6708338 (54)	total: 5.27s	remaining: 39.1s
      119:	learn: 0.5893317	test: 0.6790057	best: 0.6708338 (54)	total: 5.32s	remaining: 39s
      120:	learn: 0.5887807	test: 0.6790496	best: 0.6708338 (54)	total: 5.38s	remaining: 39.1s
      121:	learn: 0.5884151	test: 0.6779099	best: 0.6708338 (54)	total: 5.42s	remaining: 39s
      122:	learn: 0.5878021	test: 0.6770564	best: 0.6708338 (54)	total: 5.46s	remaining: 39s
      123:	learn: 0.5871252	test: 0.6774903	best: 0.6708338 (54)	total: 5.51s	remaining: 38.9s
      124:	learn: 0.5865454	test: 0.6771139	best: 0.6708338 (54)	total: 5.55s	remaining: 38.9s
      125:	learn: 0.5859662	test: 0.6766828	best: 0.6708338 (54)	total: 5.59s	remaining: 38.8s
      126:	learn: 0.5854305	test: 0.6774108	best: 0.6708338 (54)	total: 5.64s	remaining: 38.8s
      127:	learn: 0.5848778	test: 0.6798829	best: 0.6708338 (54)	total: 5.68s	remaining: 38.7s
      128:	learn: 0.5844061	test: 0.6803586	best: 0.6708338 (54)	total: 5.72s	remaining: 38.7s
      129:	learn: 0.5837886	test: 0.6802275	best: 0.6708338 (54)	total: 5.77s	remaining: 38.6s
      130:	learn: 0.5832358	test: 0.6801507	best: 0.6708338 (54)	total: 5.81s	remaining: 38.6s
      131:	learn: 0.5825646	test: 0.6800255	best: 0.6708338 (54)	total: 5.86s	remaining: 38.5s
      132:	learn: 0.5819655	test: 0.6797985	best: 0.6708338 (54)	total: 5.91s	remaining: 38.5s
      133:	learn: 0.5812139	test: 0.6793095	best: 0.6708338 (54)	total: 5.95s	remaining: 38.5s
      134:	learn: 0.5807122	test: 0.6792111	best: 0.6708338 (54)	total: 5.99s	remaining: 38.4s
      135:	learn: 0.5801429	test: 0.6809774	best: 0.6708338 (54)	total: 6.04s	remaining: 38.4s
      136:	learn: 0.5797015	test: 0.6855537	best: 0.6708338 (54)	total: 6.09s	remaining: 38.4s
      137:	learn: 0.5792533	test: 0.6856699	best: 0.6708338 (54)	total: 6.14s	remaining: 38.3s
      138:	learn: 0.5786684	test: 0.6859470	best: 0.6708338 (54)	total: 6.18s	remaining: 38.3s
      139:	learn: 0.5780678	test: 0.6849164	best: 0.6708338 (54)	total: 6.23s	remaining: 38.3s
      140:	learn: 0.5776351	test: 0.6863465	best: 0.6708338 (54)	total: 6.28s	remaining: 38.3s
      141:	learn: 0.5771406	test: 0.6845452	best: 0.6708338 (54)	total: 6.32s	remaining: 38.2s
      142:	learn: 0.5768380	test: 0.6844341	best: 0.6708338 (54)	total: 6.37s	remaining: 38.1s
      143:	learn: 0.5760396	test: 0.6820937	best: 0.6708338 (54)	total: 6.41s	remaining: 38.1s
      144:	learn: 0.5753353	test: 0.6820442	best: 0.6708338 (54)	total: 6.45s	remaining: 38s
      145:	learn: 0.5748246	test: 0.6799547	best: 0.6708338 (54)	total: 6.49s	remaining: 38s
      146:	learn: 0.5741409	test: 0.6798879	best: 0.6708338 (54)	total: 6.54s	remaining: 37.9s
      147:	learn: 0.5732806	test: 0.6798573	best: 0.6708338 (54)	total: 6.58s	remaining: 37.9s
      148:	learn: 0.5729663	test: 0.6795704	best: 0.6708338 (54)	total: 6.62s	remaining: 37.8s
      149:	learn: 0.5724919	test: 0.6796541	best: 0.6708338 (54)	total: 6.67s	remaining: 37.8s
      150:	learn: 0.5719755	test: 0.6796964	best: 0.6708338 (54)	total: 6.71s	remaining: 37.7s
      151:	learn: 0.5715089	test: 0.6798997	best: 0.6708338 (54)	total: 6.75s	remaining: 37.7s
      152:	learn: 0.5709906	test: 0.6807696	best: 0.6708338 (54)	total: 6.8s	remaining: 37.7s
      153:	learn: 0.5703478	test: 0.6807600	best: 0.6708338 (54)	total: 6.85s	remaining: 37.6s
      154:	learn: 0.5698456	test: 0.6806626	best: 0.6708338 (54)	total: 6.89s	remaining: 37.6s
      155:	learn: 0.5692716	test: 0.6804102	best: 0.6708338 (54)	total: 6.94s	remaining: 37.5s
      156:	learn: 0.5687896	test: 0.6805487	best: 0.6708338 (54)	total: 6.99s	remaining: 37.5s
      157:	learn: 0.5682218	test: 0.6803679	best: 0.6708338 (54)	total: 7.03s	remaining: 37.5s
      158:	learn: 0.5675544	test: 0.6806191	best: 0.6708338 (54)	total: 7.07s	remaining: 37.4s
      159:	learn: 0.5669630	test: 0.6807238	best: 0.6708338 (54)	total: 7.12s	remaining: 37.4s
      160:	learn: 0.5662936	test: 0.6803131	best: 0.6708338 (54)	total: 7.16s	remaining: 37.3s
      161:	learn: 0.5658069	test: 0.6800763	best: 0.6708338 (54)	total: 7.2s	remaining: 37.3s
      162:	learn: 0.5654265	test: 0.6814816	best: 0.6708338 (54)	total: 7.25s	remaining: 37.2s
      163:	learn: 0.5648960	test: 0.6812570	best: 0.6708338 (54)	total: 7.29s	remaining: 37.2s
      164:	learn: 0.5644253	test: 0.6817415	best: 0.6708338 (54)	total: 7.35s	remaining: 37.2s
      165:	learn: 0.5638221	test: 0.6810606	best: 0.6708338 (54)	total: 7.39s	remaining: 37.2s
      166:	learn: 0.5631722	test: 0.6811122	best: 0.6708338 (54)	total: 7.44s	remaining: 37.1s
      167:	learn: 0.5627281	test: 0.6811131	best: 0.6708338 (54)	total: 7.49s	remaining: 37.1s
      168:	learn: 0.5622415	test: 0.6811405	best: 0.6708338 (54)	total: 7.53s	remaining: 37s
      169:	learn: 0.5616149	test: 0.6814074	best: 0.6708338 (54)	total: 7.58s	remaining: 37s
      170:	learn: 0.5611135	test: 0.6822038	best: 0.6708338 (54)	total: 7.62s	remaining: 37s
      171:	learn: 0.5604589	test: 0.6824128	best: 0.6708338 (54)	total: 7.67s	remaining: 36.9s
      172:	learn: 0.5600334	test: 0.6846501	best: 0.6708338 (54)	total: 7.71s	remaining: 36.9s
      173:	learn: 0.5594031	test: 0.6848046	best: 0.6708338 (54)	total: 7.76s	remaining: 36.8s
      174:	learn: 0.5585984	test: 0.6858990	best: 0.6708338 (54)	total: 7.8s	remaining: 36.8s
      175:	learn: 0.5580189	test: 0.6873289	best: 0.6708338 (54)	total: 7.84s	remaining: 36.7s
      176:	learn: 0.5575972	test: 0.6876593	best: 0.6708338 (54)	total: 7.89s	remaining: 36.7s
      177:	learn: 0.5570794	test: 0.6875783	best: 0.6708338 (54)	total: 7.93s	remaining: 36.6s
      178:	learn: 0.5566993	test: 0.6878010	best: 0.6708338 (54)	total: 7.98s	remaining: 36.6s
      179:	learn: 0.5561192	test: 0.6871698	best: 0.6708338 (54)	total: 8.02s	remaining: 36.5s
      180:	learn: 0.5555289	test: 0.6869620	best: 0.6708338 (54)	total: 8.07s	remaining: 36.5s
      181:	learn: 0.5550720	test: 0.6871535	best: 0.6708338 (54)	total: 8.11s	remaining: 36.5s
      182:	learn: 0.5547376	test: 0.6872892	best: 0.6708338 (54)	total: 8.15s	remaining: 36.4s
      183:	learn: 0.5542799	test: 0.6879634	best: 0.6708338 (54)	total: 8.2s	remaining: 36.4s
      184:	learn: 0.5537989	test: 0.6880317	best: 0.6708338 (54)	total: 8.25s	remaining: 36.3s
      185:	learn: 0.5534086	test: 0.6885426	best: 0.6708338 (54)	total: 8.29s	remaining: 36.3s
      186:	learn: 0.5527962	test: 0.6886879	best: 0.6708338 (54)	total: 8.35s	remaining: 36.3s
      187:	learn: 0.5522852	test: 0.6889461	best: 0.6708338 (54)	total: 8.39s	remaining: 36.2s
      188:	learn: 0.5518738	test: 0.6891663	best: 0.6708338 (54)	total: 8.44s	remaining: 36.2s
      189:	learn: 0.5514225	test: 0.6892149	best: 0.6708338 (54)	total: 8.48s	remaining: 36.2s
      190:	learn: 0.5508098	test: 0.6887317	best: 0.6708338 (54)	total: 8.52s	remaining: 36.1s
      191:	learn: 0.5503110	test: 0.6887860	best: 0.6708338 (54)	total: 8.56s	remaining: 36s
      192:	learn: 0.5497937	test: 0.6886964	best: 0.6708338 (54)	total: 8.61s	remaining: 36s
      193:	learn: 0.5492096	test: 0.6866649	best: 0.6708338 (54)	total: 8.65s	remaining: 35.9s
      194:	learn: 0.5486162	test: 0.6864622	best: 0.6708338 (54)	total: 8.69s	remaining: 35.9s
      195:	learn: 0.5480402	test: 0.6857167	best: 0.6708338 (54)	total: 8.74s	remaining: 35.9s
      196:	learn: 0.5473759	test: 0.6860206	best: 0.6708338 (54)	total: 8.79s	remaining: 35.8s
      197:	learn: 0.5469958	test: 0.6858485	best: 0.6708338 (54)	total: 8.82s	remaining: 35.7s
      198:	learn: 0.5464737	test: 0.6861737	best: 0.6708338 (54)	total: 8.87s	remaining: 35.7s
      199:	learn: 0.5460784	test: 0.6862507	best: 0.6708338 (54)	total: 8.91s	remaining: 35.7s
      200:	learn: 0.5454849	test: 0.6861803	best: 0.6708338 (54)	total: 8.96s	remaining: 35.6s
      201:	learn: 0.5448646	test: 0.6863811	best: 0.6708338 (54)	total: 9.01s	remaining: 35.6s
      202:	learn: 0.5443217	test: 0.6857599	best: 0.6708338 (54)	total: 9.05s	remaining: 35.5s
      203:	learn: 0.5433801	test: 0.6854647	best: 0.6708338 (54)	total: 9.09s	remaining: 35.5s
      204:	learn: 0.5426812	test: 0.6857160	best: 0.6708338 (54)	total: 9.14s	remaining: 35.5s
      205:	learn: 0.5422495	test: 0.6859606	best: 0.6708338 (54)	total: 9.18s	remaining: 35.4s
      206:	learn: 0.5419361	test: 0.6850063	best: 0.6708338 (54)	total: 9.22s	remaining: 35.3s
      207:	learn: 0.5416025	test: 0.6849040	best: 0.6708338 (54)	total: 9.28s	remaining: 35.3s
      208:	learn: 0.5410440	test: 0.6841608	best: 0.6708338 (54)	total: 9.33s	remaining: 35.3s
      209:	learn: 0.5405185	test: 0.6847786	best: 0.6708338 (54)	total: 9.38s	remaining: 35.3s
      210:	learn: 0.5400153	test: 0.6850153	best: 0.6708338 (54)	total: 9.42s	remaining: 35.2s
      211:	learn: 0.5395379	test: 0.6873379	best: 0.6708338 (54)	total: 9.46s	remaining: 35.2s
      212:	learn: 0.5391725	test: 0.6866267	best: 0.6708338 (54)	total: 9.5s	remaining: 35.1s
      213:	learn: 0.5386132	test: 0.6868718	best: 0.6708338 (54)	total: 9.55s	remaining: 35.1s
      214:	learn: 0.5381395	test: 0.6866683	best: 0.6708338 (54)	total: 9.59s	remaining: 35s
      215:	learn: 0.5377527	test: 0.6867055	best: 0.6708338 (54)	total: 9.64s	remaining: 35s
      216:	learn: 0.5373577	test: 0.6842922	best: 0.6708338 (54)	total: 9.68s	remaining: 34.9s
      217:	learn: 0.5367204	test: 0.6838881	best: 0.6708338 (54)	total: 9.72s	remaining: 34.9s
      218:	learn: 0.5361804	test: 0.6874548	best: 0.6708338 (54)	total: 9.77s	remaining: 34.8s
      219:	learn: 0.5356838	test: 0.6862339	best: 0.6708338 (54)	total: 9.81s	remaining: 34.8s
      220:	learn: 0.5352508	test: 0.6863552	best: 0.6708338 (54)	total: 9.85s	remaining: 34.7s
      221:	learn: 0.5346342	test: 0.6890256	best: 0.6708338 (54)	total: 9.9s	remaining: 34.7s
      222:	learn: 0.5342269	test: 0.6892772	best: 0.6708338 (54)	total: 9.94s	remaining: 34.6s
      223:	learn: 0.5338861	test: 0.6893114	best: 0.6708338 (54)	total: 9.98s	remaining: 34.6s
      224:	learn: 0.5334012	test: 0.6899953	best: 0.6708338 (54)	total: 10s	remaining: 34.6s
      225:	learn: 0.5329204	test: 0.6899721	best: 0.6708338 (54)	total: 10.1s	remaining: 34.5s
      226:	learn: 0.5321826	test: 0.6943232	best: 0.6708338 (54)	total: 10.1s	remaining: 34.5s
      227:	learn: 0.5314597	test: 0.6935079	best: 0.6708338 (54)	total: 10.2s	remaining: 34.4s
      228:	learn: 0.5307959	test: 0.6930993	best: 0.6708338 (54)	total: 10.2s	remaining: 34.4s
      229:	learn: 0.5304366	test: 0.6937990	best: 0.6708338 (54)	total: 10.3s	remaining: 34.4s
      230:	learn: 0.5302056	test: 0.6935523	best: 0.6708338 (54)	total: 10.3s	remaining: 34.3s
      231:	learn: 0.5298294	test: 0.6962496	best: 0.6708338 (54)	total: 10.4s	remaining: 34.3s
      232:	learn: 0.5295184	test: 0.6956543	best: 0.6708338 (54)	total: 10.4s	remaining: 34.3s
      233:	learn: 0.5290360	test: 0.6958039	best: 0.6708338 (54)	total: 10.5s	remaining: 34.2s
      234:	learn: 0.5282906	test: 0.6965831	best: 0.6708338 (54)	total: 10.5s	remaining: 34.2s
      235:	learn: 0.5278897	test: 0.6979110	best: 0.6708338 (54)	total: 10.5s	remaining: 34.1s
      236:	learn: 0.5272961	test: 0.6976967	best: 0.6708338 (54)	total: 10.6s	remaining: 34.1s
      237:	learn: 0.5269725	test: 0.6973229	best: 0.6708338 (54)	total: 10.6s	remaining: 34s
      238:	learn: 0.5261907	test: 0.6974910	best: 0.6708338 (54)	total: 10.7s	remaining: 34s
      239:	learn: 0.5256204	test: 0.6972928	best: 0.6708338 (54)	total: 10.7s	remaining: 33.9s
      240:	learn: 0.5250955	test: 0.6972942	best: 0.6708338 (54)	total: 10.8s	remaining: 33.9s
      241:	learn: 0.5243552	test: 0.6969447	best: 0.6708338 (54)	total: 10.8s	remaining: 33.9s
      242:	learn: 0.5238102	test: 0.6969795	best: 0.6708338 (54)	total: 10.9s	remaining: 33.9s
      243:	learn: 0.5233838	test: 0.6969380	best: 0.6708338 (54)	total: 10.9s	remaining: 33.8s
      244:	learn: 0.5229325	test: 0.6971260	best: 0.6708338 (54)	total: 11s	remaining: 33.8s
      245:	learn: 0.5224451	test: 0.6964847	best: 0.6708338 (54)	total: 11s	remaining: 33.7s
      246:	learn: 0.5220226	test: 0.6961412	best: 0.6708338 (54)	total: 11s	remaining: 33.7s
      247:	learn: 0.5212209	test: 0.6960978	best: 0.6708338 (54)	total: 11.1s	remaining: 33.6s
      248:	learn: 0.5207914	test: 0.6947099	best: 0.6708338 (54)	total: 11.1s	remaining: 33.6s
      249:	learn: 0.5202801	test: 0.6942288	best: 0.6708338 (54)	total: 11.2s	remaining: 33.6s
      250:	learn: 0.5196362	test: 0.6969195	best: 0.6708338 (54)	total: 11.2s	remaining: 33.5s
      251:	learn: 0.5192439	test: 0.6970638	best: 0.6708338 (54)	total: 11.3s	remaining: 33.4s
      252:	learn: 0.5188561	test: 0.6972076	best: 0.6708338 (54)	total: 11.3s	remaining: 33.4s
      253:	learn: 0.5180696	test: 0.6969640	best: 0.6708338 (54)	total: 11.4s	remaining: 33.4s
      254:	learn: 0.5175975	test: 0.6974117	best: 0.6708338 (54)	total: 11.4s	remaining: 33.4s
      Stopped by overfitting detector  (200 iterations wait)
      
      bestTest = 0.6708337826
      bestIteration = 54
      
      Shrink model to first 55 iterations.
      QueryFullTime: 9.931531
      QueryExecutionTime: 5.46783
      Skipping test eval output
      0.1928848812 min passed
      
      • Non-spark Classifier
      Learning rate set to 0.04234
      0:	learn: 0.6907333	test: 0.6934055	best: 0.6934055 (0)	total: 69.9ms	remaining: 1m 9s
      1:	learn: 0.6883369	test: 0.6918864	best: 0.6918864 (1)	total: 72.4ms	remaining: 36.1s
      2:	learn: 0.6859388	test: 0.6909591	best: 0.6909591 (2)	total: 76.1ms	remaining: 25.3s
      3:	learn: 0.6835561	test: 0.6927643	best: 0.6909591 (2)	total: 78.2ms	remaining: 19.5s
      4:	learn: 0.6813031	test: 0.6915998	best: 0.6909591 (2)	total: 80.8ms	remaining: 16.1s
      5:	learn: 0.6791838	test: 0.6965053	best: 0.6909591 (2)	total: 83.6ms	remaining: 13.8s
      6:	learn: 0.6766805	test: 0.6928038	best: 0.6909591 (2)	total: 86.3ms	remaining: 12.2s
      7:	learn: 0.6750271	test: 0.6915810	best: 0.6909591 (2)	total: 88.9ms	remaining: 11s
      8:	learn: 0.6730072	test: 0.6937443	best: 0.6909591 (2)	total: 92.4ms	remaining: 10.2s
      9:	learn: 0.6711396	test: 0.6930318	best: 0.6909591 (2)	total: 95.8ms	remaining: 9.48s
      10:	learn: 0.6695155	test: 0.6932250	best: 0.6909591 (2)	total: 98.5ms	remaining: 8.85s
      11:	learn: 0.6680512	test: 0.6917149	best: 0.6909591 (2)	total: 101ms	remaining: 8.32s
      12:	learn: 0.6660460	test: 0.6899282	best: 0.6899282 (12)	total: 104ms	remaining: 7.87s
      13:	learn: 0.6643208	test: 0.6909595	best: 0.6899282 (12)	total: 106ms	remaining: 7.46s
      14:	learn: 0.6632039	test: 0.6910474	best: 0.6899282 (12)	total: 109ms	remaining: 7.18s
      15:	learn: 0.6622477	test: 0.6909892	best: 0.6899282 (12)	total: 112ms	remaining: 6.88s
      16:	learn: 0.6603088	test: 0.6912450	best: 0.6899282 (12)	total: 115ms	remaining: 6.63s
      17:	learn: 0.6591247	test: 0.6916949	best: 0.6899282 (12)	total: 117ms	remaining: 6.38s
      18:	learn: 0.6579200	test: 0.6921221	best: 0.6899282 (12)	total: 119ms	remaining: 6.17s
      19:	learn: 0.6567383	test: 0.6917684	best: 0.6899282 (12)	total: 122ms	remaining: 5.98s
      20:	learn: 0.6550785	test: 0.6900009	best: 0.6899282 (12)	total: 126ms	remaining: 5.86s
      21:	learn: 0.6540419	test: 0.6900928	best: 0.6899282 (12)	total: 128ms	remaining: 5.7s
      22:	learn: 0.6530462	test: 0.6897946	best: 0.6897946 (22)	total: 131ms	remaining: 5.57s
      23:	learn: 0.6517368	test: 0.6888947	best: 0.6888947 (23)	total: 134ms	remaining: 5.44s
      24:	learn: 0.6506819	test: 0.6887078	best: 0.6887078 (24)	total: 137ms	remaining: 5.34s
      25:	learn: 0.6493603	test: 0.6897206	best: 0.6887078 (24)	total: 140ms	remaining: 5.23s
      26:	learn: 0.6483546	test: 0.6887405	best: 0.6887078 (24)	total: 142ms	remaining: 5.12s
      27:	learn: 0.6471033	test: 0.6888086	best: 0.6887078 (24)	total: 145ms	remaining: 5.03s
      28:	learn: 0.6455769	test: 0.6820825	best: 0.6820825 (28)	total: 148ms	remaining: 4.94s
      29:	learn: 0.6442585	test: 0.6851333	best: 0.6820825 (28)	total: 151ms	remaining: 4.89s
      30:	learn: 0.6431458	test: 0.6867496	best: 0.6820825 (28)	total: 154ms	remaining: 4.81s
      31:	learn: 0.6416989	test: 0.6867359	best: 0.6820825 (28)	total: 159ms	remaining: 4.8s
      32:	learn: 0.6410456	test: 0.6850177	best: 0.6820825 (28)	total: 162ms	remaining: 4.75s
      33:	learn: 0.6398840	test: 0.6843240	best: 0.6820825 (28)	total: 165ms	remaining: 4.7s
      34:	learn: 0.6388452	test: 0.6842851	best: 0.6820825 (28)	total: 168ms	remaining: 4.63s
      35:	learn: 0.6377152	test: 0.6848173	best: 0.6820825 (28)	total: 171ms	remaining: 4.57s
      36:	learn: 0.6369207	test: 0.6858828	best: 0.6820825 (28)	total: 175ms	remaining: 4.56s
      37:	learn: 0.6356414	test: 0.6866025	best: 0.6820825 (28)	total: 178ms	remaining: 4.51s
      38:	learn: 0.6345916	test: 0.6891812	best: 0.6820825 (28)	total: 180ms	remaining: 4.44s
      39:	learn: 0.6335415	test: 0.6878497	best: 0.6820825 (28)	total: 183ms	remaining: 4.4s
      40:	learn: 0.6318602	test: 0.6877215	best: 0.6820825 (28)	total: 186ms	remaining: 4.36s
      41:	learn: 0.6311923	test: 0.6875604	best: 0.6820825 (28)	total: 189ms	remaining: 4.32s
      42:	learn: 0.6301560	test: 0.6898103	best: 0.6820825 (28)	total: 192ms	remaining: 4.28s
      43:	learn: 0.6291824	test: 0.6907460	best: 0.6820825 (28)	total: 195ms	remaining: 4.24s
      44:	learn: 0.6281965	test: 0.6884538	best: 0.6820825 (28)	total: 197ms	remaining: 4.19s
      45:	learn: 0.6270622	test: 0.6887461	best: 0.6820825 (28)	total: 200ms	remaining: 4.15s
      46:	learn: 0.6264081	test: 0.6875784	best: 0.6820825 (28)	total: 203ms	remaining: 4.11s
      47:	learn: 0.6247239	test: 0.6868999	best: 0.6820825 (28)	total: 205ms	remaining: 4.07s
      48:	learn: 0.6232683	test: 0.6910637	best: 0.6820825 (28)	total: 208ms	remaining: 4.04s
      49:	learn: 0.6224106	test: 0.6888958	best: 0.6820825 (28)	total: 212ms	remaining: 4.02s
      50:	learn: 0.6213637	test: 0.6887125	best: 0.6820825 (28)	total: 215ms	remaining: 4s
      51:	learn: 0.6204001	test: 0.6903890	best: 0.6820825 (28)	total: 217ms	remaining: 3.96s
      52:	learn: 0.6195087	test: 0.6908570	best: 0.6820825 (28)	total: 220ms	remaining: 3.93s
      53:	learn: 0.6184032	test: 0.6909350	best: 0.6820825 (28)	total: 223ms	remaining: 3.9s
      54:	learn: 0.6173451	test: 0.6904791	best: 0.6820825 (28)	total: 226ms	remaining: 3.88s
      55:	learn: 0.6163927	test: 0.6901005	best: 0.6820825 (28)	total: 228ms	remaining: 3.85s
      56:	learn: 0.6155238	test: 0.6903411	best: 0.6820825 (28)	total: 231ms	remaining: 3.82s
      57:	learn: 0.6144956	test: 0.6909613	best: 0.6820825 (28)	total: 234ms	remaining: 3.79s
      58:	learn: 0.6135984	test: 0.6912862	best: 0.6820825 (28)	total: 236ms	remaining: 3.77s
      59:	learn: 0.6126525	test: 0.6905357	best: 0.6820825 (28)	total: 240ms	remaining: 3.75s
      60:	learn: 0.6119720	test: 0.6925464	best: 0.6820825 (28)	total: 243ms	remaining: 3.73s
      61:	learn: 0.6110449	test: 0.6940346	best: 0.6820825 (28)	total: 246ms	remaining: 3.72s
      62:	learn: 0.6101492	test: 0.6928039	best: 0.6820825 (28)	total: 249ms	remaining: 3.7s
      63:	learn: 0.6094447	test: 0.6933093	best: 0.6820825 (28)	total: 252ms	remaining: 3.68s
      64:	learn: 0.6085363	test: 0.6935958	best: 0.6820825 (28)	total: 256ms	remaining: 3.68s
      65:	learn: 0.6076747	test: 0.6909924	best: 0.6820825 (28)	total: 258ms	remaining: 3.66s
      66:	learn: 0.6069710	test: 0.6908047	best: 0.6820825 (28)	total: 261ms	remaining: 3.63s
      67:	learn: 0.6062350	test: 0.6907710	best: 0.6820825 (28)	total: 265ms	remaining: 3.63s
      68:	learn: 0.6053901	test: 0.6907428	best: 0.6820825 (28)	total: 267ms	remaining: 3.6s
      69:	learn: 0.6049370	test: 0.6914058	best: 0.6820825 (28)	total: 269ms	remaining: 3.58s
      70:	learn: 0.6040745	test: 0.6904328	best: 0.6820825 (28)	total: 274ms	remaining: 3.58s
      71:	learn: 0.6032473	test: 0.6907755	best: 0.6820825 (28)	total: 277ms	remaining: 3.56s
      72:	learn: 0.6025703	test: 0.6908070	best: 0.6820825 (28)	total: 279ms	remaining: 3.54s
      73:	learn: 0.6017591	test: 0.6971870	best: 0.6820825 (28)	total: 282ms	remaining: 3.53s
      74:	learn: 0.6008522	test: 0.6962943	best: 0.6820825 (28)	total: 285ms	remaining: 3.51s
      75:	learn: 0.6003696	test: 0.6957674	best: 0.6820825 (28)	total: 287ms	remaining: 3.49s
      76:	learn: 0.5997602	test: 0.6959874	best: 0.6820825 (28)	total: 289ms	remaining: 3.47s
      77:	learn: 0.5990430	test: 0.6961270	best: 0.6820825 (28)	total: 292ms	remaining: 3.45s
      78:	learn: 0.5980603	test: 0.6964653	best: 0.6820825 (28)	total: 294ms	remaining: 3.43s
      79:	learn: 0.5971378	test: 0.6972346	best: 0.6820825 (28)	total: 297ms	remaining: 3.41s
      80:	learn: 0.5961766	test: 0.6967014	best: 0.6820825 (28)	total: 299ms	remaining: 3.39s
      81:	learn: 0.5953091	test: 0.6977083	best: 0.6820825 (28)	total: 301ms	remaining: 3.37s
      82:	learn: 0.5946787	test: 0.6979004	best: 0.6820825 (28)	total: 304ms	remaining: 3.36s
      83:	learn: 0.5939259	test: 0.6985828	best: 0.6820825 (28)	total: 307ms	remaining: 3.35s
      84:	learn: 0.5930713	test: 0.6995604	best: 0.6820825 (28)	total: 310ms	remaining: 3.34s
      85:	learn: 0.5923855	test: 0.7005564	best: 0.6820825 (28)	total: 313ms	remaining: 3.32s
      86:	learn: 0.5917679	test: 0.7010898	best: 0.6820825 (28)	total: 315ms	remaining: 3.31s
      87:	learn: 0.5911494	test: 0.7020453	best: 0.6820825 (28)	total: 318ms	remaining: 3.29s
      88:	learn: 0.5903601	test: 0.7016751	best: 0.6820825 (28)	total: 321ms	remaining: 3.28s
      89:	learn: 0.5895654	test: 0.7013664	best: 0.6820825 (28)	total: 323ms	remaining: 3.27s
      90:	learn: 0.5886291	test: 0.7006702	best: 0.6820825 (28)	total: 326ms	remaining: 3.25s
      91:	learn: 0.5878539	test: 0.6998758	best: 0.6820825 (28)	total: 328ms	remaining: 3.23s
      92:	learn: 0.5872643	test: 0.7052477	best: 0.6820825 (28)	total: 330ms	remaining: 3.22s
      93:	learn: 0.5865489	test: 0.7040952	best: 0.6820825 (28)	total: 333ms	remaining: 3.21s
      94:	learn: 0.5854644	test: 0.7043313	best: 0.6820825 (28)	total: 335ms	remaining: 3.19s
      95:	learn: 0.5846715	test: 0.7047481	best: 0.6820825 (28)	total: 339ms	remaining: 3.19s
      96:	learn: 0.5835842	test: 0.7048192	best: 0.6820825 (28)	total: 341ms	remaining: 3.18s
      97:	learn: 0.5828492	test: 0.7011359	best: 0.6820825 (28)	total: 344ms	remaining: 3.17s
      98:	learn: 0.5821135	test: 0.7006202	best: 0.6820825 (28)	total: 347ms	remaining: 3.16s
      99:	learn: 0.5815861	test: 0.7040518	best: 0.6820825 (28)	total: 349ms	remaining: 3.15s
      100:	learn: 0.5806286	test: 0.7023472	best: 0.6820825 (28)	total: 352ms	remaining: 3.13s
      101:	learn: 0.5802173	test: 0.7024793	best: 0.6820825 (28)	total: 355ms	remaining: 3.12s
      102:	learn: 0.5794499	test: 0.7028918	best: 0.6820825 (28)	total: 357ms	remaining: 3.11s
      103:	learn: 0.5784966	test: 0.7008130	best: 0.6820825 (28)	total: 360ms	remaining: 3.1s
      104:	learn: 0.5776136	test: 0.7052093	best: 0.6820825 (28)	total: 363ms	remaining: 3.1s
      105:	learn: 0.5766977	test: 0.7048313	best: 0.6820825 (28)	total: 366ms	remaining: 3.09s
      106:	learn: 0.5759125	test: 0.7050794	best: 0.6820825 (28)	total: 369ms	remaining: 3.08s
      107:	learn: 0.5751291	test: 0.7047060	best: 0.6820825 (28)	total: 372ms	remaining: 3.07s
      108:	learn: 0.5746224	test: 0.7047158	best: 0.6820825 (28)	total: 375ms	remaining: 3.06s
      109:	learn: 0.5738598	test: 0.7013574	best: 0.6820825 (28)	total: 378ms	remaining: 3.06s
      110:	learn: 0.5730496	test: 0.7010558	best: 0.6820825 (28)	total: 380ms	remaining: 3.04s
      111:	learn: 0.5727233	test: 0.6992313	best: 0.6820825 (28)	total: 383ms	remaining: 3.04s
      112:	learn: 0.5718348	test: 0.6974199	best: 0.6820825 (28)	total: 386ms	remaining: 3.03s
      113:	learn: 0.5715423	test: 0.6970987	best: 0.6820825 (28)	total: 388ms	remaining: 3.01s
      114:	learn: 0.5711113	test: 0.6957460	best: 0.6820825 (28)	total: 390ms	remaining: 3s
      115:	learn: 0.5701694	test: 0.6946096	best: 0.6820825 (28)	total: 393ms	remaining: 2.99s
      116:	learn: 0.5691910	test: 0.6940757	best: 0.6820825 (28)	total: 396ms	remaining: 2.98s
      117:	learn: 0.5686365	test: 0.6938791	best: 0.6820825 (28)	total: 398ms	remaining: 2.98s
      118:	learn: 0.5676760	test: 0.6960747	best: 0.6820825 (28)	total: 401ms	remaining: 2.97s
      119:	learn: 0.5667890	test: 0.6961724	best: 0.6820825 (28)	total: 404ms	remaining: 2.96s
      120:	learn: 0.5656830	test: 0.6989301	best: 0.6820825 (28)	total: 407ms	remaining: 2.95s
      121:	learn: 0.5650255	test: 0.6991684	best: 0.6820825 (28)	total: 409ms	remaining: 2.94s
      122:	learn: 0.5643491	test: 0.6998029	best: 0.6820825 (28)	total: 412ms	remaining: 2.94s
      123:	learn: 0.5634402	test: 0.6994904	best: 0.6820825 (28)	total: 415ms	remaining: 2.93s
      124:	learn: 0.5628026	test: 0.7007443	best: 0.6820825 (28)	total: 418ms	remaining: 2.92s
      125:	learn: 0.5619433	test: 0.7009373	best: 0.6820825 (28)	total: 421ms	remaining: 2.92s
      126:	learn: 0.5611121	test: 0.7010796	best: 0.6820825 (28)	total: 423ms	remaining: 2.91s
      127:	learn: 0.5606990	test: 0.7009158	best: 0.6820825 (28)	total: 426ms	remaining: 2.9s
      128:	learn: 0.5600109	test: 0.7007270	best: 0.6820825 (28)	total: 429ms	remaining: 2.89s
      129:	learn: 0.5594155	test: 0.7006392	best: 0.6820825 (28)	total: 432ms	remaining: 2.89s
      130:	learn: 0.5586232	test: 0.6998949	best: 0.6820825 (28)	total: 435ms	remaining: 2.88s
      131:	learn: 0.5578833	test: 0.7003671	best: 0.6820825 (28)	total: 438ms	remaining: 2.88s
      132:	learn: 0.5570367	test: 0.6983423	best: 0.6820825 (28)	total: 441ms	remaining: 2.87s
      133:	learn: 0.5559726	test: 0.6979099	best: 0.6820825 (28)	total: 443ms	remaining: 2.86s
      134:	learn: 0.5555009	test: 0.6977401	best: 0.6820825 (28)	total: 446ms	remaining: 2.85s
      135:	learn: 0.5545237	test: 0.6977384	best: 0.6820825 (28)	total: 448ms	remaining: 2.85s
      136:	learn: 0.5539702	test: 0.6975929	best: 0.6820825 (28)	total: 454ms	remaining: 2.86s
      137:	learn: 0.5532214	test: 0.6968573	best: 0.6820825 (28)	total: 457ms	remaining: 2.85s
      138:	learn: 0.5527601	test: 0.6962040	best: 0.6820825 (28)	total: 460ms	remaining: 2.85s
      139:	learn: 0.5522160	test: 0.6969309	best: 0.6820825 (28)	total: 463ms	remaining: 2.84s
      140:	learn: 0.5516769	test: 0.6965893	best: 0.6820825 (28)	total: 466ms	remaining: 2.84s
      141:	learn: 0.5509960	test: 0.6959529	best: 0.6820825 (28)	total: 469ms	remaining: 2.84s
      142:	learn: 0.5504708	test: 0.6956021	best: 0.6820825 (28)	total: 473ms	remaining: 2.83s
      143:	learn: 0.5498621	test: 0.6954586	best: 0.6820825 (28)	total: 475ms	remaining: 2.82s
      144:	learn: 0.5493028	test: 0.6955069	best: 0.6820825 (28)	total: 478ms	remaining: 2.82s
      145:	learn: 0.5486227	test: 0.6957255	best: 0.6820825 (28)	total: 481ms	remaining: 2.81s
      146:	learn: 0.5478340	test: 0.6960894	best: 0.6820825 (28)	total: 484ms	remaining: 2.81s
      147:	learn: 0.5467598	test: 0.6936223	best: 0.6820825 (28)	total: 488ms	remaining: 2.81s
      148:	learn: 0.5462445	test: 0.6929327	best: 0.6820825 (28)	total: 490ms	remaining: 2.8s
      149:	learn: 0.5458343	test: 0.6927735	best: 0.6820825 (28)	total: 493ms	remaining: 2.79s
      150:	learn: 0.5454337	test: 0.6920412	best: 0.6820825 (28)	total: 496ms	remaining: 2.79s
      151:	learn: 0.5447957	test: 0.6918695	best: 0.6820825 (28)	total: 498ms	remaining: 2.78s
      152:	learn: 0.5440001	test: 0.6915209	best: 0.6820825 (28)	total: 501ms	remaining: 2.77s
      153:	learn: 0.5433346	test: 0.6910254	best: 0.6820825 (28)	total: 504ms	remaining: 2.77s
      154:	learn: 0.5423738	test: 0.6904851	best: 0.6820825 (28)	total: 506ms	remaining: 2.76s
      155:	learn: 0.5416471	test: 0.6906824	best: 0.6820825 (28)	total: 509ms	remaining: 2.75s
      156:	learn: 0.5408823	test: 0.6921022	best: 0.6820825 (28)	total: 512ms	remaining: 2.75s
      157:	learn: 0.5403773	test: 0.6931616	best: 0.6820825 (28)	total: 515ms	remaining: 2.74s
      158:	learn: 0.5394566	test: 0.6930207	best: 0.6820825 (28)	total: 518ms	remaining: 2.74s
      159:	learn: 0.5388343	test: 0.6910640	best: 0.6820825 (28)	total: 521ms	remaining: 2.73s
      160:	learn: 0.5382452	test: 0.6907257	best: 0.6820825 (28)	total: 524ms	remaining: 2.73s
      161:	learn: 0.5372114	test: 0.6921950	best: 0.6820825 (28)	total: 527ms	remaining: 2.72s
      162:	learn: 0.5365585	test: 0.6921951	best: 0.6820825 (28)	total: 529ms	remaining: 2.72s
      163:	learn: 0.5360099	test: 0.6923580	best: 0.6820825 (28)	total: 532ms	remaining: 2.71s
      164:	learn: 0.5353113	test: 0.6953615	best: 0.6820825 (28)	total: 534ms	remaining: 2.7s
      165:	learn: 0.5345420	test: 0.6949696	best: 0.6820825 (28)	total: 537ms	remaining: 2.7s
      166:	learn: 0.5339014	test: 0.6946505	best: 0.6820825 (28)	total: 541ms	remaining: 2.7s
      167:	learn: 0.5332565	test: 0.6929266	best: 0.6820825 (28)	total: 548ms	remaining: 2.71s
      168:	learn: 0.5322938	test: 0.6927435	best: 0.6820825 (28)	total: 552ms	remaining: 2.71s
      169:	learn: 0.5316334	test: 0.6927825	best: 0.6820825 (28)	total: 568ms	remaining: 2.77s
      170:	learn: 0.5310227	test: 0.6929270	best: 0.6820825 (28)	total: 579ms	remaining: 2.81s
      171:	learn: 0.5304531	test: 0.6930307	best: 0.6820825 (28)	total: 593ms	remaining: 2.86s
      172:	learn: 0.5297225	test: 0.6934924	best: 0.6820825 (28)	total: 599ms	remaining: 2.86s
      173:	learn: 0.5292559	test: 0.6936669	best: 0.6820825 (28)	total: 604ms	remaining: 2.87s
      174:	learn: 0.5287439	test: 0.6898421	best: 0.6820825 (28)	total: 607ms	remaining: 2.86s
      175:	learn: 0.5282159	test: 0.6898464	best: 0.6820825 (28)	total: 610ms	remaining: 2.86s
      176:	learn: 0.5277093	test: 0.6894578	best: 0.6820825 (28)	total: 613ms	remaining: 2.85s
      177:	learn: 0.5269364	test: 0.6902541	best: 0.6820825 (28)	total: 628ms	remaining: 2.9s
      178:	learn: 0.5266259	test: 0.6899436	best: 0.6820825 (28)	total: 638ms	remaining: 2.93s
      179:	learn: 0.5261969	test: 0.6905175	best: 0.6820825 (28)	total: 643ms	remaining: 2.93s
      180:	learn: 0.5257401	test: 0.6904667	best: 0.6820825 (28)	total: 646ms	remaining: 2.92s
      181:	learn: 0.5252072	test: 0.6903978	best: 0.6820825 (28)	total: 656ms	remaining: 2.95s
      182:	learn: 0.5242076	test: 0.6906026	best: 0.6820825 (28)	total: 672ms	remaining: 3s
      183:	learn: 0.5237014	test: 0.6942173	best: 0.6820825 (28)	total: 679ms	remaining: 3.01s
      184:	learn: 0.5229451	test: 0.6942846	best: 0.6820825 (28)	total: 685ms	remaining: 3.02s
      185:	learn: 0.5226447	test: 0.6943438	best: 0.6820825 (28)	total: 688ms	remaining: 3.01s
      186:	learn: 0.5221322	test: 0.6944431	best: 0.6820825 (28)	total: 695ms	remaining: 3.02s
      187:	learn: 0.5218055	test: 0.6943754	best: 0.6820825 (28)	total: 704ms	remaining: 3.04s
      188:	learn: 0.5209297	test: 0.6980298	best: 0.6820825 (28)	total: 708ms	remaining: 3.04s
      189:	learn: 0.5202584	test: 0.6964868	best: 0.6820825 (28)	total: 711ms	remaining: 3.03s
      190:	learn: 0.5196308	test: 0.6969298	best: 0.6820825 (28)	total: 717ms	remaining: 3.03s
      191:	learn: 0.5187765	test: 0.6966370	best: 0.6820825 (28)	total: 721ms	remaining: 3.03s
      192:	learn: 0.5179999	test: 0.6950249	best: 0.6820825 (28)	total: 725ms	remaining: 3.03s
      193:	learn: 0.5171991	test: 0.6949122	best: 0.6820825 (28)	total: 728ms	remaining: 3.03s
      194:	learn: 0.5162728	test: 0.6950643	best: 0.6820825 (28)	total: 732ms	remaining: 3.02s
      195:	learn: 0.5155336	test: 0.6960303	best: 0.6820825 (28)	total: 737ms	remaining: 3.02s
      196:	learn: 0.5148652	test: 0.6947453	best: 0.6820825 (28)	total: 742ms	remaining: 3.03s
      197:	learn: 0.5142374	test: 0.6947048	best: 0.6820825 (28)	total: 745ms	remaining: 3.02s
      198:	learn: 0.5134916	test: 0.6952227	best: 0.6820825 (28)	total: 754ms	remaining: 3.04s
      199:	learn: 0.5127784	test: 0.6942819	best: 0.6820825 (28)	total: 760ms	remaining: 3.04s
      200:	learn: 0.5122417	test: 0.6944406	best: 0.6820825 (28)	total: 763ms	remaining: 3.03s
      201:	learn: 0.5113561	test: 0.6939148	best: 0.6820825 (28)	total: 770ms	remaining: 3.04s
      202:	learn: 0.5106719	test: 0.6963778	best: 0.6820825 (28)	total: 773ms	remaining: 3.04s
      203:	learn: 0.5098049	test: 0.6963791	best: 0.6820825 (28)	total: 777ms	remaining: 3.03s
      204:	learn: 0.5088517	test: 0.6968276	best: 0.6820825 (28)	total: 784ms	remaining: 3.04s
      205:	learn: 0.5081794	test: 0.6964851	best: 0.6820825 (28)	total: 786ms	remaining: 3.03s
      206:	learn: 0.5072833	test: 0.6966282	best: 0.6820825 (28)	total: 790ms	remaining: 3.02s
      207:	learn: 0.5063986	test: 0.6966250	best: 0.6820825 (28)	total: 802ms	remaining: 3.05s
      208:	learn: 0.5054927	test: 0.6966309	best: 0.6820825 (28)	total: 806ms	remaining: 3.05s
      209:	learn: 0.5044721	test: 0.6968884	best: 0.6820825 (28)	total: 810ms	remaining: 3.04s
      210:	learn: 0.5037375	test: 0.6965957	best: 0.6820825 (28)	total: 817ms	remaining: 3.06s
      211:	learn: 0.5029714	test: 0.6982931	best: 0.6820825 (28)	total: 821ms	remaining: 3.05s
      212:	learn: 0.5023642	test: 0.6979011	best: 0.6820825 (28)	total: 824ms	remaining: 3.04s
      213:	learn: 0.5017572	test: 0.6978887	best: 0.6820825 (28)	total: 831ms	remaining: 3.05s
      214:	learn: 0.5010461	test: 0.6973608	best: 0.6820825 (28)	total: 836ms	remaining: 3.05s
      215:	learn: 0.5002560	test: 0.7013019	best: 0.6820825 (28)	total: 840ms	remaining: 3.05s
      216:	learn: 0.4996972	test: 0.7016139	best: 0.6820825 (28)	total: 843ms	remaining: 3.04s
      217:	learn: 0.4990953	test: 0.7007054	best: 0.6820825 (28)	total: 850ms	remaining: 3.05s
      218:	learn: 0.4982966	test: 0.6997111	best: 0.6820825 (28)	total: 852ms	remaining: 3.04s
      219:	learn: 0.4978683	test: 0.6992566	best: 0.6820825 (28)	total: 856ms	remaining: 3.03s
      220:	learn: 0.4974437	test: 0.6995211	best: 0.6820825 (28)	total: 858ms	remaining: 3.02s
      221:	learn: 0.4970631	test: 0.7030353	best: 0.6820825 (28)	total: 864ms	remaining: 3.03s
      222:	learn: 0.4966227	test: 0.7028744	best: 0.6820825 (28)	total: 868ms	remaining: 3.02s
      223:	learn: 0.4956044	test: 0.6982349	best: 0.6820825 (28)	total: 870ms	remaining: 3.02s
      224:	learn: 0.4952185	test: 0.6983255	best: 0.6820825 (28)	total: 873ms	remaining: 3.01s
      225:	learn: 0.4945476	test: 0.6986151	best: 0.6820825 (28)	total: 876ms	remaining: 3s
      226:	learn: 0.4936436	test: 0.6979039	best: 0.6820825 (28)	total: 878ms	remaining: 2.99s
      227:	learn: 0.4928331	test: 0.6976089	best: 0.6820825 (28)	total: 881ms	remaining: 2.98s
      228:	learn: 0.4926211	test: 0.6978966	best: 0.6820825 (28)	total: 884ms	remaining: 2.98s
      Stopped by overfitting detector  (200 iterations wait)
      
      bestTest = 0.6820825216
      bestIteration = 28
      
      Shrink model to first 29 iterations.
      
    • The prediction results for identical inputs also differ; I think this is caused by the difference above.

    Could you give some advice, or some points to check, on why the two models built by the Spark and non-Spark classifiers differ?

    catboost version:

    • Non-Spark Version: catboost = "1.0.4"
    • Spark Version: catboost-spark_3.1_2.12-1.0.4

    Operating System: CentOS 7 CPU: both models run on CPU GPU: No
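
    One detail the log above makes visible: the non-Spark run auto-selects its learning rate ("Learning rate set to 0.04234"). Below is a minimal sketch (synthetic data; the pinned values come from the log, everything else is an assumption) of fixing the auto-selected settings explicitly, so that both implementations can at least be compared under an identical configuration:

    from catboost import CatBoostClassifier
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in data; the real issue uses the reporter's own dataset.
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

    model = CatBoostClassifier(
        iterations=1000,
        learning_rate=0.04234,  # pinned explicitly; the log shows it was auto-selected
        random_seed=0,
        od_type="Iter",
        od_wait=200,            # matches the "200 iterations wait" overfitting detector
        verbose=100,
    )
    model.fit(X_train, y_train, eval_set=(X_val, y_val))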

  • Catboost 1.0.5 for JVM not available in Maven

    Problem: The Java artifacts have not been published to Maven Central. catboost version: 1.0.5

    Check https://mvnrepository.com/artifact/ai.catboost/catboost-prediction to see that only versions up to 1.0.4 are available.

    BTW, I'm looking into updating the version to also have it working on M1. Would the 1.0.5 Java artifacts have the M1 dylibs?

  • CatBoost GPU Gridsearch very slow in comparison to XGBOOST

    Problem: When comparing Catboost on GPU with other algorithms like XGBoost in a grid search with many candidates (>= 50), CatBoost severely underperforms in terms of computing time.

    This means that, for the same number of candidates, Catboost (on GPU) takes at least 10x more time than XGBoost (on GPU) and LGBM (on CPU).

    The boosting type is set to Plain and iterations to 1000. I also set the scikit-learn grid search to n_jobs=1, since computation happens on the GPU and this is advised in the docs.

    When monitoring catboost I saw that, after all iterations are completed, the algorithm stalls for a significant amount of time, which might be one of the reasons for the bad performance. The training set starts at 10,000 samples and 50 features in round 1, and doubles every round until it reaches 300k samples (HalvingGridSearch).

    The mean fit time for each fold (cv=5) is around 60 seconds, independent of the size of the data, so it is roughly the same for 10k samples and 300k samples. My intuition is that catboost runs all 1000 iterations no matter how large the dataset is, which might lead to this behaviour?

    I also saw that, while GPU memory consumption was at its maximum at all times (12 GB per GPU), GPU utilization was fairly low, between 3% and 30%. Maybe catboost is not really optimized for such "small" datasets?

    You might want to look into this, or maybe you have a hint for me about what the reason could be, or how I can improve the performance in terms of computation time?

    By the way, this also happens with catboost's internal grid search; a sketch of one thing to check follows after this issue.

    catboost version: 1.0.5 Operating System: Ubuntu Linux x64 CPU: Intel Xeon 28 Cores GPU: 2x Nvidia Titan X
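
    A quick thing to check before anything else: whether each grid-search candidate really pays for the full 1000 iterations. A minimal sketch (synthetic data; parameter values are assumptions, and task_type="GPU" requires a CUDA-capable GPU) of giving each fit an eval set and early stopping, so training ends once the validation metric stalls:

    from catboost import CatBoostClassifier
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=10_000, n_features=50, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

    model = CatBoostClassifier(
        task_type="GPU",
        boosting_type="Plain",
        iterations=1000,
        verbose=False,
    )
    # With an eval set and early stopping, a candidate no longer runs all
    # 1000 iterations once the validation metric has stopped improving.
    model.fit(X_train, y_train, eval_set=(X_val, y_val), early_stopping_rounds=50)
    print(model.tree_count_)  # typically far fewer than 1000 trees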

  • YetiRank fails with 'Targets are required for YetiRank loss function.'

    Problem:

    catboost (in R) training with YetiRank fails with the error "Targets are required for YetiRank loss function." But it works with any other loss function on the same dataset.

    Reproducible example:

    library(catboost)
    
    data = data.frame(
      "target" = rnorm(10000),
      "x" = rnorm(10000),
      "x_2" = rnorm(10000),
      "group" = rep(1:200, each = 50)
    )
    
    data_pool <- catboost.load_pool(
      data[, c("x", "x_2")],
      label = data$target,
      group_id = as.integer(data$group)
    )
    
    
    # Works
    catboost_cv <- catboost.cv(
      pool = data_pool,
      params = list(
        "has_time" = TRUE,
        "iterations" = 20,
        "loss_function" = 'PairLogit',
        "od_type" = "Iter",
        "od_wait" = 5,
        "learning_rate" = 0.005,
        "random_seed" = 123L,
        "verbose" = 0
      )
    )
    
    # Works
    catboost_cv <- catboost.cv(
      pool = data_pool,
      params = list(
        "has_time" = TRUE,
        "iterations" = 20,
        "loss_function" = 'StochasticFilter',
        "od_type" = "Iter",
        "od_wait" = 5,
        "learning_rate" = 0.005,
        "random_seed" = 123L,
        "verbose" = 0
      )
    )
    
    # Fails
    catboost_cv <- catboost.cv(
      pool = data_pool,
      params = list(
        "has_time" = TRUE,
        "iterations" = 20,
        "loss_function" = 'YetiRank',
        "od_type" = "Iter",
        "od_wait" = 5,
        "learning_rate" = 0.005,
        "random_seed" = 123L,
        "verbose" = 0
      )
    )
    

    Error:

    Error in catboost.cv(pool = data_pool, params = list(has_time = TRUE,  : 
      catboost/libs/train_lib/options_helper.cpp:88: Targets are required for YetiRank loss function.
    

    But, strangely, this (train rather than cv with YetiRank) works:

    # Works
    catboost.train(
      data_pool,
      params = list(
        "has_time" = TRUE,
        "iterations" = 20,
        "loss_function" = 'YetiRank',
        "od_type" = "Iter",
        "od_wait" = 5,
        "learning_rate" = 0.005,
        "random_seed" = 123L,
        "verbose" = 0
      )
    )
    

    Session info:

    R version 4.1.3 (2022-03-10)
    Platform: x86_64-redhat-linux-gnu (64-bit)
    Running under: CentOS Stream 8
    
    Matrix products: default
    BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.15.so
    
    locale:
     [1] LC_CTYPE=C.utf8       LC_NUMERIC=C          LC_TIME=C.utf8       
     [4] LC_COLLATE=C.utf8     LC_MONETARY=C.utf8    LC_MESSAGES=C.utf8   
     [7] LC_PAPER=C.utf8       LC_NAME=C             LC_ADDRESS=C         
    [10] LC_TELEPHONE=C        LC_MEASUREMENT=C.utf8 LC_IDENTIFICATION=C  
    
    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base     
    
    other attached packages:
    [1] catboost_1.0.5
    
    loaded via a namespace (and not attached):
    [1] compiler_4.1.3 tools_4.1.3    jsonlite_1.8.0
    

    catboost version: catboost_1.0.5 Operating System: CentOS Stream 8 R version 4.1.3 (2022-03-10)
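
    For cross-checking, here is a minimal Python analogue of the failing R call (synthetic data mirroring the example above; whether cv behaves the same way here is exactly the open question):

    import numpy as np
    from catboost import Pool, cv

    rng = np.random.default_rng(123)
    pool = Pool(
        data=rng.normal(size=(10_000, 2)),
        label=rng.normal(size=10_000),
        group_id=np.repeat(np.arange(200), 50),  # 200 groups of 50, as in the R example
    )
    # Mirrors the failing catboost.cv call; if the check lives in the shared
    # C++ core, the same "Targets are required for YetiRank" error should appear.
    results = cv(
        pool=pool,
        params={"loss_function": "YetiRank", "iterations": 20, "learning_rate": 0.005},
    )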

  • provide constant link for latest version

    Hey, thanks for the great package

    catboost is included in our R package mlr3extralearners, and in our CI we have to install two packages just to grep the latest version of catboost. Is it possible to provide a permanent link to the latest catboost version? (A sketch of a possible workaround follows.)
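
    GitHub itself exposes a stable redirect at https://github.com/catboost/catboost/releases/latest, and the release metadata behind it can be fetched in a single API call. A minimal sketch of a hypothetical CI helper (what exactly the CI needs from the release is an assumption):

    import requests

    # Resolve the latest CatBoost release tag and its asset download URLs
    # via the public GitHub API.
    resp = requests.get(
        "https://api.github.com/repos/catboost/catboost/releases/latest",
        timeout=10,
    )
    resp.raise_for_status()
    release = resp.json()
    print(release["tag_name"])            # e.g. "v1.0.5"
    for asset in release["assets"]:
        print(asset["browser_download_url"])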

  • model.score with parametrized NDCG metric

    Problem: The training setup was the following:

    from catboost import CatBoostRanker  # import added for completeness

    params = {
        "iterations": 200,
        "loss_function": "YetiRank",
        "custom_metric": ["NDCG:top=5;type=Base;denominator=LogPosition", "MAP:top=5"],
    }
    model = CatBoostRanker(**params)
    # train_pool / val_pool are catboost.Pool objects built with group_id
    model.fit(train_pool, eval_set=val_pool)
    

    When I try to evaluate metrics on the test set:

    model.score(X_test, y_test, group_id=X_test[group_col],
                top=5, type='Base', denominator='LogPosition')
    
    (screenshot of the resulting error omitted)

    It looks like, internally, the metric is called without the parameters that were passed either during training or during the score call. (A possible workaround is sketched below.)

    catboost version: 1.0.5 Operating System: macOS Monterey 12.2 CPU: 2,3 GHz Quad-Core Intel Core i7 GPU: None
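
    One possible workaround (reusing the variables from the snippet above) is to evaluate the fully parametrized metric string directly with eval_metrics instead of going through model.score; that the returned dict is keyed by the exact metric string is an assumption to verify:

    from catboost import Pool

    test_pool = Pool(X_test, label=y_test, group_id=X_test[group_col])
    scores = model.eval_metrics(
        test_pool,
        metrics=["NDCG:top=5;type=Base;denominator=LogPosition"],
    )
    # eval_metrics returns per-iteration values; take the last one.
    metric_name = "NDCG:top=5;type=Base;denominator=LogPosition"
    print(scores[metric_name][-1])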

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

eXtreme Gradient Boosting Community | Documentation | Resources | Contributors | Release Notes XGBoost is an optimized distributed gradient boosting l

May 11, 2022

An open source machine learning library for performing regression tasks using the RVM technique.

Introduction neonrvm is an open source machine learning library for performing regression tasks using RVM technique. It is written in C programming la

Apr 8, 2022

Learning embeddings for classification, retrieval and ranking.

StarSpace StarSpace is a general-purpose neural model for efficient learning of entity embeddings for solving a wide variety of problems: Learning wor

May 19, 2022

High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.

What is xLearn? xLearn is a high performance, easy-to-use, and scalable machine learning package that contains linear model (LR), factorization machin

May 11, 2022

Python Inference Script is a Python package that enables developers to author machine learning workflows in Python and deploy without Python.

Python Inference Script (PyIS) Python Inference Script is a Python package that enables developers to author machine learning workflows in Python and d

Feb 23, 2022

Heuristically and dynamically sample (more) uniformly from large decision trees of unknown shape.

PROBLEM STATEMENT When writing a randomized generator for some file format in a general-purpose programming language, we can view the resulting progra

Feb 15, 2022

Deep Scalable Sparse Tensor Network Engine (DSSTNE) is an Amazon-developed library for building Deep Learning (DL) machine learning (ML) models

Amazon DSSTNE: Deep Scalable Sparse Tensor Network Engine DSSTNE (pronounced "Destiny") is an open source software library for training and deploying

May 17, 2022

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

This is the Vowpal Wabbit fast online learning code. Why Vowpal Wabbit? Vowpal Wabbit is a machine learning system which pushes the frontier of machin

May 18, 2022

SecMML: Secure MPC (multi-party computation) Machine Learning Framework

SecMML Introduction: SecMML is a branch of FudanMPL (Multi-Party Computation + Machine Learning), an efficient and scalable secure multi-party computation (MPC) framework for training machine learning models, built on the BGW protocol. It can be applied to joint-training scenarios with three or more participating parties.

May 2, 2022

Fast, differentiable sorting and ranking in PyTorch

Torchsort Fast, differentiable sorting and ranking in PyTorch. Pure PyTorch implementation of Fast Differentiable Sorting and Ranking (Blondel et al.)

May 4, 2022

Edge ML Library - High-performance Compute Library for On-device Machine Learning Inference

Edge ML Library (EMLL) offers optimized basic routines like general matrix multiplications (GEMM) and quantizations, to speed up machine learning (ML) inference on ARM-based devices. EMLL supports fp32, fp16 and int8 data types. EMLL accelerates on-device NMT, ASR and OCR engines of Youdao, Inc.

May 10, 2022

A lightweight 2D pose model that can be deployed on Linux/Windows/Android, supports CPU/GPU inference acceleration, and can run detection in real time on ordinary mobile phones.

May 6, 2022

A flexible, high-performance serving system for machine learning models

XGBoost Serving This is a fork of TensorFlow Serving, extended with the support for XGBoost, alphaFM and alphaFM_softmax frameworks. For more informat

May 19, 2022

Toy path tracer for my own learning purposes (CPU/GPU, C++/C#, Win/Mac/Wasm, DX11/Metal, also Unity)

Toy Path Tracer Toy path tracer for my own learning purposes, using various approaches/techs. Somewhat based on Peter Shirley's Ray Tracing in One Wee

May 18, 2022

A program developed using MPI for distributed computation of histograms for large data and their performance analysis on multi-core systems

mpi-histo A program developed using MPI for distributed computation of histograms for large data and their performance analysis on multi-core systems. T

Dec 21, 2021

CTranslate2 is a fast inference engine for OpenNMT-py and OpenNMT-tf models supporting both CPU and GPU execution

CTranslate2 is a fast inference engine for OpenNMT-py and OpenNMT-tf models supporting both CPU and GPU execution. The goal is to provide comprehensive inference features and be the most efficient and cost-effective solution to deploy standard neural machine translation systems such as Transformer models.

May 15, 2022

A fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc) on CPU and GPU.

May 13, 2022

Fast face classification

Fast Face Classification (F²C): An Efficient Training Approach for Very Large Scale Face Recognition Training on ultra-large-scale datasets is time-c

May 20, 2022

An open-source, low-code machine learning library in Python

An open-source, low-code machine learning library in Python. Version 2.3.6 out now! Check out the release notes here. Official • Docs • Install • Tu

May 17, 2022