Open Source Computer Vision Library

OpenCV: Open Source Computer Vision Library

Resources

Contributing

Please read the contribution guidelines before starting work on a pull request.

Summary of the guidelines:

  • One pull request per issue;
  • Choose the right base branch;
  • Include tests and documentation;
  • Clean up "oops" commits before submitting;
  • Follow the coding style guide.
Comments
  • CUDA backend for the DNN module

    CUDA backend for the DNN module

    More up-to-date info available here (unofficial)


    How to build and use the CUDA backend?

    How to use multiple GPUs?

    There are many ways to make use of multiple GPUs. Here is one which I think is the safest and the least complex solution. It makes use of the fact that the CUDA runtime library maintains a separate CUDA context for each CPU thread.

    Suppose you have N devices.

    1. Create N threads.
    2. Assign a CUDA device to each thread by calling cudaSetDevice or cv::cuda::setDevice in that thread. Each thread is now associated with a device.
    3. Create any number of cv::dnn::Net objects in any of those threads; each network will use the device associated with its thread for memory and computation.
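
    A minimal Python sketch of this pattern (not code from this PR; the model file name and the frames_per_gpu partitioning are placeholders):

    import threading
    import cv2

    def worker(device_id, model_path, frames):
        # Bind this thread to one CUDA device; the CUDA runtime keeps a separate
        # context per CPU thread, so the Net created below stays on that device.
        cv2.cuda.setDevice(device_id)
        net = cv2.dnn.readNet(model_path)
        net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
        net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
        for frame in frames:
            blob = cv2.dnn.blobFromImage(frame, 1.0 / 255, (416, 416), swapRB=True)
            net.setInput(blob)
            out = net.forward()
            # ... post-process `out` here ...

    num_devices = cv2.cuda.getCudaEnabledDeviceCount()
    # frames_per_gpu is assumed to be a list of N lists of frames, one per device.
    threads = [threading.Thread(target=worker, args=(i, "model.onnx", frames_per_gpu[i]))
               for i in range(num_devices)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()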
    

    Benchmarks

    Demo Video: https://www.youtube.com/watch?v=ljCfluWYymM

    Project summary/benchmarks: https://gist.github.com/YashasSamaga/a84cf2826ab2dc755005321fe17cd15d

    Support Matrix for this PR

    Current Support Matrix: (not updated)

    Blip | Meaning
    ---- | ---------
    ✔️ | supports all the configurations that are supported by all the existing backends (and might support more than what's currently supported)
    🔵 | partially supported (fallback to CPU for unsupported configurations)
    :x: | not supported (fallback to CPU)

    Layer | Status | Constraints | Notes
    ---------------------------------------- | ------ | ------------- | --------------
    Activations | ✔️ | |
    Batch Normalization | ✔️ | |
    Blank Layer | ✔️ | |
    Concat Layer | ✔️ | |
    Const Layer | ✔️ | |
    Convolution 2d | ✔️ | | asymmetric padding is disabled in layer constructor but the backend supports it
    Convolution 3d | ✔️ | | asymmetric padding is disabled in layer constructor but the backend supports it
    Crop and resize | :x: | |
    Crop Layer | ✔️ | | forwarded to Slice Layer
    Detection Output Layer | :x: | |
    Deconvolution 2d | 🔵 | padding configuration should not lead to extra uneven padding |
    Deconvolution 3d | 🔵 | padding configuration should not lead to extra uneven padding |
    Elementwise Layers | ✔️ | |
    Eltwise Layer | ✔️ | |
    Flatten Layer | ✔️ | |
    Fully Connected Layer | ✔️ | |
    Input Layer | :x: | |
    Interp Layer | ✔️ | |
    Local Response Normalization | ✔️ | |
    Max Unpooling 2d | ✔️ | |
    Max Unpooling 3d | ✔️ | |
    MVN Layer | :x: | |
    Normalize Layer | 🔵 | Only L1 and L2 norm supported |
    Padding Layer | ✔️ | |
    Permute Layer | ✔️ | |
    Pooling 2d | 🔵 | Only max and average pooling supported | supports asymmetric padding
    Pooling 3d | 🔵 | Only max and average pooling supported | supports asymmetric padding
    Prior Box Layer | ✔️ | |
    Proposal Layer | :x: | |
    Region Layer | ✔️ | NMS performed using CPU |
    Reorg Layer | ✔️ | |
    Reshape Layer | ✔️ | |
    Resize Layer | ✔️ | |
    Scale Layer | ✔️ | |
    Shift Layer | ✔️ | | forwarded to Scale Layer
    Shuffle Channel Layer | ✔️ | |
    Slice Layer | ✔️ | |
    Softmax Layer | ✔️ | |
    Split Layer | ✔️ | |
    LSTM Layer | :x: | |

    Known issues:

    1. Tests for some of the SSD based networks fail on Jetson Nano

    References: #14585

    Results:

    • https://github.com/opencv/opencv/pull/14827#issuecomment-522229894
    • https://github.com/opencv/opencv/pull/14827#issuecomment-523456312
    force_builders_only=Custom,linux,docs
    buildworker:Custom=linux-4
    docker_image:Custom=ubuntu-cuda:18.04
    
  • Issues with recognition whilst using IP Stream only

    Issues with recognition whilst using IP Stream only

    System information (version)
    • OpenCV => 3.1
    • Operating System / Platform => Linux
    Detailed description

    I will try asking here as I got no response on the forum. I have been working with OpenCV in an application since last year. In the first version I was capturing frames from a webcam and using Haar cascades, and it would recognise a face nearly every time without an issue.

    I ran into some issues with getting a stable web-based stream going; after trying multiple solutions I moved to a new approach. This still uses the exact same webcam, except that Linux Motion is now accessing it and OpenCV connects to the MJPEG stream from Linux Motion through a secure Nginx server; the stream is on the same device as the OpenCV script.

    Since doing this the quality of the stream has increased massively, but it now hardly detects faces at all. I have compared screenshots of frames from when OpenCV was accessing the webcam with frames from when OpenCV is accessing the stream, and apart from the improved quality there really is no difference, yet OpenCV refuses to identify a face. It is literally a case where I have to move the camera around and hold a position for it to identify a face, whereas before I could be walking past on the other side of the room and it would detect my face.

    After trying everything I could think of and find on Google, I went back to the webcam; instantly it was detecting my face in whatever position and in whatever light. I have tried multiple other ways of streaming to the web but still had no success, so I have moved back to the Motion stream to try to work this out.

    Can anyone shed any light on this? It does not make sense to me that an improvement in quality suddenly breaks facial identification. I have tried playing with the frame settings (resolution, contrast, hue, brightness) and nothing I do seems to work.

    Steps to reproduce
    self.OpenCVCapture.open('http://MOTION_STREAM_ADDRESS/stream.mjpg')
    self.OpenCVCapture.set(5, 30)     # property 5 = CAP_PROP_FPS: request 30 FPS
    self.OpenCVCapture.set(3, 1280)   # property 3 = CAP_PROP_FRAME_WIDTH
    self.OpenCVCapture.set(4, 720)    # property 4 = CAP_PROP_FRAME_HEIGHT
    self.OpenCVCapture.set(10, 1)     # property 10 = CAP_PROP_BRIGHTNESS
    

    The frames are then run through Haar cascades for detection.
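
    For reference, a minimal sketch of that detection step (the cascade path and detectMultiScale parameters are illustrative, not the exact values from my application):

    import cv2

    cap = cv2.VideoCapture('http://MOTION_STREAM_ADDRESS/stream.mjpg')
    face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

    while True:
        ret, frame = cap.read()
        if not ret:
            continue
        # Haar cascades work on grayscale; equalizing the histogram can help with
        # the different exposure/contrast of the Motion stream.
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        gray = cv2.equalizeHist(gray)
        faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1,
                                              minNeighbors=5, minSize=(30, 30))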

  • OpenCV 3.1.0 simple VideoCapture and waitKey crashes after a while on OS X 10.11.2

    OpenCV 3.1.0 simple VideoCapture and waitKey crashes after a while on OS X 10.11.2

    OpenCV 3.1.0 is installed through brew install opencv3 --with-contrib --with-qt5 and the following program crashes after a while:

    #include <opencv2/core.hpp>
    #include <opencv2/highgui.hpp>
    #include <opencv2/videoio.hpp>
    
    int main(int argc, const char * argv[]) {
        cv::VideoCapture cap(0);
        cv::Mat frame;
        while (cap.read(frame)) {
            imshow("Frame", frame);
            if (cv::waitKey(1) == 'q') {
                break;
            }
        }
        return 0;
    }
    

    The stack trace is the following:

    2015-12-24 09:54:22.297 basic-capture[86100:4481590] -[CaptureDelegate doFireTimer:]: unrecognized selector sent to instance 0x103600680
    2015-12-24 09:54:22.313 basic-capture[86100:4481590] An uncaught exception was raised
    2015-12-24 09:54:22.313 basic-capture[86100:4481590] -[CaptureDelegate doFireTimer:]: unrecognized selector sent to instance 0x103600680
    2015-12-24 09:54:22.313 basic-capture[86100:4481590] (
        0   CoreFoundation                      0x00007fff95766ae2 __exceptionPreprocess + 178
        1   libobjc.A.dylib                     0x00007fff90699f7e objc_exception_throw + 48
        2   CoreFoundation                      0x00007fff95769b9d -[NSObject(NSObject) doesNotRecognizeSelector:] + 205
        3   CoreFoundation                      0x00007fff956a2601 ___forwarding___ + 1009
        4   CoreFoundation                      0x00007fff956a2188 _CF_forwarding_prep_0 + 120
        5   Foundation                          0x00007fff9c7d385b __NSFireTimer + 95
        6   CoreFoundation                      0x00007fff956acbc4 __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__ + 20
        7   CoreFoundation                      0x00007fff956ac853 __CFRunLoopDoTimer + 1075
        8   CoreFoundation                      0x00007fff9572ae6a __CFRunLoopDoTimers + 298
        9   CoreFoundation                      0x00007fff95667cd1 __CFRunLoopRun + 1841
        10  CoreFoundation                      0x00007fff95667338 CFRunLoopRunSpecific + 296
        11  HIToolbox                           0x00007fff8f2f3935 RunCurrentEventLoopInMode + 235
        12  HIToolbox                           0x00007fff8f2f3677 ReceiveNextEventCommon + 184
        13  HIToolbox                           0x00007fff8f2f35af _BlockUntilNextEventMatchingListInModeWithFilter + 71
        14  AppKit                              0x00007fff967d10ee _DPSNextEvent + 1067
        15  AppKit                              0x00007fff96b9d943 -[NSApplication _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 454
        16  libqcocoa.dylib                     0x000000010555ae5a _ZN21QCocoaEventDispatcher13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE + 1034
        17  libopencv_highgui.3.1.dylib         0x000000010083c596 cvWaitKey + 178
        18  basic-capture                       0x0000000100001666 main + 246
        19  libdyld.dylib                       0x00007fff8a0335ad start + 1
        20  ???                                 0x0000000000000001 0x0 + 1
    )
    2015-12-24 09:54:22.314 basic-capture[86100:4481590] *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '-[CaptureDelegate doFireTimer:]: unrecognized selector sent to instance 0x103600680'
    *** First throw call stack:
    (
        0   CoreFoundation                      0x00007fff95766ae2 __exceptionPreprocess + 178
        1   libobjc.A.dylib                     0x00007fff90699f7e objc_exception_throw + 48
        2   CoreFoundation                      0x00007fff95769b9d -[NSObject(NSObject) doesNotRecognizeSelector:] + 205
        3   CoreFoundation                      0x00007fff956a2601 ___forwarding___ + 1009
        4   CoreFoundation                      0x00007fff956a2188 _CF_forwarding_prep_0 + 120
        5   Foundation                          0x00007fff9c7d385b __NSFireTimer + 95
        6   CoreFoundation                      0x00007fff956acbc4 __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__ + 20
        7   CoreFoundation                      0x00007fff956ac853 __CFRunLoopDoTimer + 1075
        8   CoreFoundation                      0x00007fff9572ae6a __CFRunLoopDoTimers + 298
        9   CoreFoundation                      0x00007fff95667cd1 __CFRunLoopRun + 1841
        10  CoreFoundation                      0x00007fff95667338 CFRunLoopRunSpecific + 296
        11  HIToolbox                           0x00007fff8f2f3935 RunCurrentEventLoopInMode + 235
        12  HIToolbox                           0x00007fff8f2f3677 ReceiveNextEventCommon + 184
        13  HIToolbox                           0x00007fff8f2f35af _BlockUntilNextEventMatchingListInModeWithFilter + 71
        14  AppKit                              0x00007fff967d10ee _DPSNextEvent + 1067
        15  AppKit                              0x00007fff96b9d943 -[NSApplication _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 454
        16  libqcocoa.dylib                     0x000000010555ae5a _ZN21QCocoaEventDispatcher13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE + 1034
        17  libopencv_highgui.3.1.dylib         0x000000010083c596 cvWaitKey + 178
        18  basic-capture                       0x0000000100001666 main + 246
        19  libdyld.dylib                       0x00007fff8a0335ad start + 1
        20  ???                                 0x0000000000000001 0x0 + 1
    )
    libc++abi.dylib: terminating with uncaught exception of type NSException
    
  • C++ cv::VideoCapture.open(0) always returns false on Android

    C++ cv::VideoCapture.open(0) always returns false on Android

    System information (version)
    • OpenCV => 3.4.1
    • Operating System macOS High Sierra 10.13.6 / Platform => Android 7.1.1 API 25
    • Compiler => Android NDK
    Detailed description
    Context

    I'm writing a cross-platform OpenCV-based C++ library. The consuming code is a React Native application, via a React Native native module.

    To be perfectly clear, there is no access from Java code to C++ OpenCV on Android. Events with the result of the OpenCV C++ code are sent to JavaScript through the React Native bridge.

    My native library is compiled on Android as a SHARED library. It is dynamically linked to the libopencv_world.so that is produced by the compilation of OpenCV C++ for Android.

    What it does

    Basically, it opens the device's default camera and takes snapshots.

    The outcome

    This code is then run on iOS and Android.

    This is working perfectly well on iOS. It fails on Android.

    Here is the failing part of the C++ code on Android:

    Steps to reproduce
    // cap is a cv::VideoCapture object
    if (cap.open(0))
    {
        cap.set(cv::CAP_PROP_FRAME_WIDTH, CAM_WIDTH);
        cap.set(cv::CAP_PROP_FRAME_HEIGHT, CAM_HEIGHT);
    }
    else
    {
        reject("false", "cap.open(0) returned false");
    }
    
  • OpenCV3 python calls to FlannBasedMatcher::knnMatch fail with error

    OpenCV3 python calls to FlannBasedMatcher::knnMatch fail with error

    The following code returns an error.

        import cv2
        x2d = cv2.xfeatures2d  # SIFT lives in the opencv_contrib xfeatures2d module

        sift = x2d.SIFT_create(1000)
        features_l, des_l = sift.detectAndCompute(im_l, None)
        features_r, des_r = sift.detectAndCompute(im_r, None)

        FLANN_INDEX_KDTREE = 1
        index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
        search_params = dict(checks=50)
        flann = cv2.FlannBasedMatcher(index_params, search_params)

        # The knnMatch call below is where the error is raised
        matches = flann.knnMatch(des_l, des_r, k=2)

    The error returned:

    opencv/modules/python/src2/cv2.cpp:161: error: (-215) The data should normally be NULL! in function allocate
    

    I tried this with Python 2.7. FLANN_INDEX_KDTREE is set to 1 unlike here, since in modules/flann/include/opencv2/flann/defines.h I found it set to 1 on line 84.

  • fixing cap_pvpapi interface

    fixing cap_pvpapi interface

    This PR tries to fix the following 3 reported issues

    • http://code.opencv.org/issues/3946
    • http://code.opencv.org/issues/3947
    • http://code.opencv.org/issues/3948

    A complete remake of the PvAPI capture interface, fixing the reported bugs but also

    • allowing the use of MANTA type cameras
    • allowing multiple AVT cameras at the same time
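
    As a usage sketch of the multi-camera support above (not part of this PR's diff; assumes the two-argument VideoCapture constructor available in later OpenCV releases and its Python bindings):

    import cv2

    # Open two AVT cameras through the PvAPI backend at the same time.
    cam0 = cv2.VideoCapture(0, cv2.CAP_PVAPI)
    cam1 = cv2.VideoCapture(1, cv2.CAP_PVAPI)

    ok0, frame0 = cam0.read()
    ok1, frame1 = cam1.read()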
  • fixing waitKey commands to be universal

    fixing waitKey commands to be universal

    As a follow-up to the discussion held in PR #7098, and the discussion started in this topic, we decided to implement a waitChar function that always returns the correct value when the returned value is needed for further comparison against ASCII codes of char inputs.
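
    For context, the usual workaround today is to mask the return value before the comparison; a minimal sketch (the placeholder image is only there to make the snippet self-contained):

    import cv2
    import numpy as np

    frame = np.zeros((240, 320, 3), np.uint8)  # placeholder image
    while True:
        cv2.imshow('frame', frame)
        # Keep only the low byte so the comparison against ASCII codes of char
        # inputs behaves the same across platforms and GUI backends.
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    cv2.destroyAllWindows()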

  • AttributeError: 'module' object has no attribute 'face'

    AttributeError: 'module' object has no attribute 'face'

    I got an error when running OpenCV in Python on a Raspberry Pi.

    I tried to find a fix and apply it, but it did not work out. I also confirmed that the "face" module is present in opencv_contrib-3.3.0, so I do not know why this happens.

    error 1

    Traceback (most recent call last):
      File "training.py", line 13, in <module>
        recognizer = cv2.face.createLBPHFaceRecognizer()
    AttributeError: 'module' object has no attribute 'face'

    error 2

    Traceback (most recent call last):
      File "training.py", line 13, in <module>
        help(cv2.face)
    AttributeError: 'module' object has no attribute 'face'

    error 3

    Traceback (most recent call last):
      File "training.py", line 13, in <module>
        help(cv2.face.createLBPHFaceRecognizer)
    AttributeError: 'module' object has no attribute 'face'

    python : 3.5.3 opencv-3.3.0 opencv_contrib-3.3.0

    source code

    # Import OpenCV2 for image processing
    # Import os for file paths
    import cv2, os

    # Import numpy for matrix calculation
    import numpy as np

    # Import Python Image Library (PIL)
    from PIL import Image

    # Create Local Binary Patterns Histograms for face recognition
    recognizer = cv2.face.createLBPHFaceRecognizer()

    # Using prebuilt frontal face training model for face detection
    detector = cv2.CascadeClassifier("haarcascade_frontalface_default.xml");

    # Create method to get the images and label data
    def getImagesAndLabels(path):

        # Get all file paths
        imagePaths = [os.path.join(path,f) for f in os.listdir(path)]

        # Initialize empty face sample list
        faceSamples=[]

        # Initialize empty id list
        ids = []

        # Loop over all the file paths
        for imagePath in imagePaths:

            # Get the image and convert it to grayscale
            PIL_img = Image.open(imagePath).convert('L')

            # PIL image to numpy array
            img_numpy = np.array(PIL_img,'uint8')

            # Get the image id
            id = int(os.path.split(imagePath)[-1].split(".")[1])
            print(id)

            # Get the faces from the training images
            faces = detector.detectMultiScale(img_numpy)

            # Loop over each face, append to the respective ID
            for (x,y,w,h) in faces:

                # Add the image to the face samples
                faceSamples.append(img_numpy[y:y+h,x:x+w])

                # Add the ID to IDs
                ids.append(id)

        # Return the face array and the IDs array
        return faceSamples,ids

    # Get the faces and IDs
    faces,ids = getImagesAndLabels('dataset')

    # Train the model using the faces and IDs
    recognizer.train(faces, np.array(ids))

    # Save the model into trainer.yml
    recognizer.save('trainer/trainer.yml')
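
    For completeness, a minimal sketch of how the saved model would be used afterwards, assuming the face module imports correctly (in OpenCV 3.3 the factory is typically cv2.face.LBPHFaceRecognizer_create; the test image name is a placeholder):

    import cv2

    recognizer = cv2.face.LBPHFaceRecognizer_create()
    recognizer.read('trainer/trainer.yml')   # load the model trained above

    detector = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
    gray = cv2.cvtColor(cv2.imread('test.jpg'), cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in detector.detectMultiScale(gray):
        label, confidence = recognizer.predict(gray[y:y+h, x:x+w])
        print(label, confidence)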

  • [GSOC] New camera model for stitching pipeline

    [GSOC] New camera model for stitching pipeline

    Merge with extra: https://github.com/opencv/opencv_extra/pull/303

    This PR contains all work for New camera model for stitching pipeline GSoC 2016 project.

    GSoC Proposal

    The stitching pipeline is well-established code in OpenCV. It provides good results for creating panoramas from camera-captured images. The main limitation of the stitching pipeline is its assumed camera model (perspective transformation). Although this model is fine for many applications working with camera-captured images, there are applications which aren't covered by the current stitching pipeline.

    New camera model

    Due to physical constraints, some applications can expect a much simpler transform with fewer degrees of freedom. These are situations where the input data are not subject to a perspective transform; the transformation can be much simpler, such as an affine transformation. Datasets considered here include images captured by special hardware (such as book scanners [0] that try hard to eliminate perspective), maps from laser scanning (produced from different starting points), and preprocessed images (where perspective was compensated by other robust means, taking advantage of the physical situation; e.g. for book scanners we would use calibration data to compensate for the remaining perspective). In all those situations we would like to obtain an image mosaic under an affine transformation.

    I'd like to introduce a new camera model based on affine transformation to the stitching pipeline. This would include:

    • New Matcher using affine transformation (cv::estimateRigidTransform) to estimate H
    • New Estimator aware of affine model.
    • Defining and documenting this new model for CameraParams (e.g. now translation is always expected to be zero, this might not be true for affine transformation)
    • Integration works in compositing part of pipeline (there might be changes necessary depending how we would decide to represent affine model in CameraParams)
    • New options for the high-level API to be able to use the affine model instead of the current one easily
    • Producing new sample code for stitching pipeline
    • Improving current documentation (current documentation does not mention details about current camera model, this might need some clarification)

    I used an approach based on affine transformation to merge maps produced by multiple robots [1] for my robotics project. It showed good results. However, as mentioned earlier, the applications for this model are much broader than that.
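
    As a rough sketch of the intended high-level usage (assuming Python bindings of a release where the affine/scans stitching mode is available; file names are placeholders):

    import cv2

    # SCANS mode uses the affine camera model described above instead of the
    # default perspective (PANORAMA) model.
    images = [cv2.imread(f) for f in ('scan1.png', 'scan2.png')]
    stitcher = cv2.Stitcher.create(cv2.Stitcher_SCANS)
    status, mosaic = stitcher.stitch(images)
    if status == cv2.Stitcher_OK:
        cv2.imwrite('mosaic.png', mosaic)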

    Parallelism for FeaturesFinder

    To make usage of the stitching pipeline more comfortable and performant for a large number of images, I'd also like to improve FeaturesFinder to allow finding features in parallel. All camera models and other users of FeaturesFinder may benefit from that. The API could be similar to FeaturesMatcher::operator ()(features, pairwise_matches, mask).

    This could be done with TBB in a similar manner to the aforementioned method in FeaturesMatcher, which is already used in the stitching pipeline, so there would be almost no additional overhead in starting new threads in typical scenarios, because those threads already exist for FeaturesMatcher. This change would be fully integrated into the high-level stitching interface.

    There might be some changes necessary in finders to ensure thread-safety. Where thread-safety can't be ensured or parallelism does not make sense (GPU finders), parallelization would be disabled and all images would be processed serially, so this method would always be safe to use regardless of the underlying finder. This approach is also similar to FeaturesMatcher.

    Benefits to OpenCV

    • New transform options for stitching pipeline
    • Performance improvements through parallel processing of image features

    implemented goals (all + extras)

    new camera model

    • [x] affine matcher
    • [x] affine estimator
    • [x] affine warper
    • [x] affine bundle adjusters
    • tests for affine stitching
      • [x] basic affine stitching integration test
      • [x] tests for affine matcher
      • [x] integration tests on real-world scans
      • [x] affine warper tests
      • [x] affine bundle adjusters tests
    • [x] integrating with high level API (Stitcher)
    • [x] stitching_detailed sample
    • [x] stitching simple sample

    parallel feature finding

    • [x] parallel API in feature finder
    • [x] tests (incl. perf tests)
    • [x] integrating with Stitcher

    implemented extras

    • [x] robust functions for affine transform estimation
      • [x] add support for least median robust method
      • [x] tests for LMEDS
      • [x] Levenberg–Marquardt algorithm-based refining for affine estimation functions
      • [x] tests and docs
      • [x] perf tests
    • [x] stitching tutorial for high level API
      • [x] add examples of running the samples on testdata in opencv_extra
    • [x] fix existing stitching tests with SURF (SURF was disabled)

    video

    short video presenting this project

    other work

    During this GSoC I have also written some related work that is not going to be included (mostly because we chose a different approach, or the work has been merged under this PR). It is listed here for completeness.

    PRs:

    • #6560
    • #6609
    • #6615
    • #6642

    commits:

    • eba30a89737d4ded755f07cff75fd861864cf09a
    • 150daa2dc57a258ba61a01e12901518b6b4d98e8
  • [GSoC] Add siamrpnpp.py

    [GSoC] Add siamrpnpp.py

    GSoC '20 : Real-time Single Object Tracking using Deep Learning (SiamRPN++)

    Overview

    Proposal: https://summerofcode.withgoogle.com/projects/#4979746967912448
    Mentors: Liubov Batanina @l-bat, Stefano Fabri @bhack, Ilya Elizarov @ieliz
    Student: Jin Yeob Chung @jinyup100

    Details of the Pull Request

    • Export of the torch implementation of the SiamRPN++ visual tracker to ONNX
      • Please refer to https://gist.github.com/jinyup100/7aa748686c5e234ed6780154141b4685 or the "Code to generate ONNX Models" section at the bottom of this PR description
    • Addition of siamrpnpp.py in the opencv/samples/dnn directory
      • SiamRPN++ visual tracker can be performed on a sample video input
      • Parser arguments include (see the example invocation after this list):
        • --input_video path to sample video input
        • --target_net path to target branch of the visual tracker
        • --search_net path to search branch of the visual tracker
        • --rpn_head path to head of the visual tracker
        • --backend selection of the computation backend
        • --target selection of the computation target device
    • Additional samples of the visual tracker performed on videos are available at:
      • https://drive.google.com/drive/folders/1k7Z_SHaBWK_4aEQPxJJCGm3P7y2IFCjY?usp=sharing
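
    A typical invocation of the sample might look like the following (file names are illustrative; the ONNX files are the ones exported below):

    python siamrpnpp.py --input_video sample.mp4 --target_net target_net.onnx --search_net search_net.onnx --rpn_head rpn_head.onnx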
    Examples

    Pull Request Readiness Checklist

    See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

    • [X] I agree to contribute to the project under OpenCV (BSD) License.
    • [X] To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
    • [X] The PR is proposed to the proper branch
    • [X] There is reference to original bug report and related work
    • [X] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name.
    • [X] The feature is well documented and sample code can be built with the project CMake
    Code to generate ONNX Models

    The code shown below to generate the ONNX models of SiamRPN++ is also available from: https://gist.github.com/jinyup100/7aa748686c5e234ed6780154141b4685

    (Demo GIF: ball_track)

    The final version of the pre-trained weights and the successfully converted ONNX models are available at:

    Pre-Trained Weights in pth Format https://drive.google.com/file/d/11bwgPFVkps9AH2NOD1zBDdpF_tQghAB-/view?usp=sharing

    Target Net : Import :heavy_check_mark: Export :heavy_check_mark: https://drive.google.com/file/d/1dw_Ne3UMcCnFsaD6xkZepwE4GEpqq7U_/view?usp=sharing

    Search Net : Import :heavy_check_mark: Export :heavy_check_mark: https://drive.google.com/file/d/1Lt4oE43ZSucJvze3Y-Z87CVDreO-Afwl/view?usp=sharing

    RPN_head : Import : :heavy_check_mark: Export :heavy_check_mark:
    https://drive.google.com/file/d/1zT1yu12mtj3JQEkkfKFJWiZ71fJ-dQTi/view?usp=sharing

    import os  # needed for os.getcwd() and os.path.join used below

    import numpy as np
    import onnx
    import torch
    import torch.nn as nn
    
    # Class for the Building Blocks required for ResNet
    class Bottleneck(nn.Module):
        expansion = 4
    
        def __init__(self, inplanes, planes, stride=1,
                     downsample=None, dilation=1):
            super(Bottleneck, self).__init__()
            self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
            self.bn1 = nn.BatchNorm2d(planes)
            padding = 2 - stride
            if downsample is not None and dilation > 1:
                dilation = dilation // 2
                padding = dilation
    
            assert stride == 1 or dilation == 1, \
                "stride and dilation must have one equals to zero at least"
    
            if dilation > 1:
                padding = dilation
            self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                                   padding=padding, bias=False, dilation=dilation)
            self.bn2 = nn.BatchNorm2d(planes)
            self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
            self.bn3 = nn.BatchNorm2d(planes * 4)
            self.relu = nn.ReLU(inplace=True)
            self.downsample = downsample
            self.stride = stride
    
        def forward(self, x):
            residual = x
    
            out = self.conv1(x)
            out = self.bn1(out)
            out = self.relu(out)
    
            out = self.conv2(out)
            out = self.bn2(out)
            out = self.relu(out)
    
            out = self.conv3(out)
            out = self.bn3(out)
    
            if self.downsample is not None:
                residual = self.downsample(x)
    
            out += residual
    
            out = self.relu(out)
    
            return out
        
    # End of Building Blocks
    
    # Class for ResNet - the Backbone neural network
    
    class ResNet(nn.Module):
        "ResNET"
        def __init__(self, block, layers, used_layers):
            self.inplanes = 64
            super(ResNet, self).__init__()
            self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=0,  # 3
                                   bias=False)
            self.bn1 = nn.BatchNorm2d(64)
            self.relu = nn.ReLU(inplace=True)
            self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
            self.layer1 = self._make_layer(block, 64, layers[0])
            self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
    
            self.feature_size = 128 * block.expansion
            self.used_layers = used_layers
            layer3 = True if 3 in used_layers else False
            layer4 = True if 4 in used_layers else False
    
            if layer3:
                self.layer3 = self._make_layer(block, 256, layers[2],
                                               stride=1, dilation=2)  # 15x15, 7x7
                self.feature_size = (256 + 128) * block.expansion
            else:
                self.layer3 = lambda x: x  # identity
    
            if layer4:
                self.layer4 = self._make_layer(block, 512, layers[3],
                                               stride=1, dilation=4)  # 7x7, 3x3
                self.feature_size = 512 * block.expansion
            else:
                self.layer4 = lambda x: x  # identity
    
            for m in self.modules():
                if isinstance(m, nn.Conv2d):
                    n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                    m.weight.data.normal_(0, np.sqrt(2. / n))
                elif isinstance(m, nn.BatchNorm2d):
                    m.weight.data.fill_(1)
                    m.bias.data.zero_()
    
        def _make_layer(self, block, planes, blocks, stride=1, dilation=1):
            downsample = None
            dd = dilation
            if stride != 1 or self.inplanes != planes * block.expansion:
                if stride == 1 and dilation == 1:
                    downsample = nn.Sequential(
                        nn.Conv2d(self.inplanes, planes * block.expansion,
                                  kernel_size=1, stride=stride, bias=False),
                        nn.BatchNorm2d(planes * block.expansion),
                    )
                else:
                    if dilation > 1:
                        dd = dilation // 2
                        padding = dd
                    else:
                        dd = 1
                        padding = 0
                    downsample = nn.Sequential(
                        nn.Conv2d(self.inplanes, planes * block.expansion,
                                  kernel_size=3, stride=stride, bias=False,
                                  padding=padding, dilation=dd),
                        nn.BatchNorm2d(planes * block.expansion),
                    )
    
            layers = []
            layers.append(block(self.inplanes, planes, stride,
                                downsample, dilation=dilation))
            self.inplanes = planes * block.expansion
            for i in range(1, blocks):
                layers.append(block(self.inplanes, planes, dilation=dilation))
    
            return nn.Sequential(*layers)
    
        def forward(self, x):
            x = self.conv1(x)
            x = self.bn1(x)
            x_ = self.relu(x)
            x = self.maxpool(x_)
    
            p1 = self.layer1(x)
            p2 = self.layer2(p1)
            p3 = self.layer3(p2)
            p4 = self.layer4(p3)
            out = [x_, p1, p2, p3, p4]
            out = [out[i] for i in self.used_layers]
            if len(out) == 1:
                return out[0]
            else:
                return out
            
    # End of ResNet
    
    # Class for Adjusting the layers of the neural net
    
    class AdjustLayer_1(nn.Module):
        def __init__(self, in_channels, out_channels, center_size=7):
            super(AdjustLayer_1, self).__init__()
            self.downsample = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
                nn.BatchNorm2d(out_channels),
                )
            self.center_size = center_size
    
        def forward(self, x):
            x = self.downsample(x)
            l = 4
            r = 11
            x = x[:, :, l:r, l:r]
            return x
    
    class AdjustAllLayer_1(nn.Module):
        def __init__(self, in_channels, out_channels, center_size=7):
            super(AdjustAllLayer_1, self).__init__()
            self.num = len(out_channels)
            if self.num == 1:
                self.downsample = AdjustLayer_1(in_channels[0],
                                              out_channels[0],
                                              center_size)
            else:
                for i in range(self.num):
                    self.add_module('downsample'+str(i+2),
                                    AdjustLayer_1(in_channels[i],
                                                out_channels[i],
                                                center_size))
    
        def forward(self, features):
            if self.num == 1:
                return self.downsample(features)
            else:
                out = []
                for i in range(self.num):
                    adj_layer = getattr(self, 'downsample'+str(i+2))
                    out.append(adj_layer(features[i]))
                return out
            
    class AdjustLayer_2(nn.Module):
        def __init__(self, in_channels, out_channels, center_size=7):
            super(AdjustLayer_2, self).__init__()
            self.downsample = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
                nn.BatchNorm2d(out_channels),
                )
            self.center_size = center_size
    
        def forward(self, x):
            x = self.downsample(x)
            return x
    
    class AdjustAllLayer_2(nn.Module):
        def __init__(self, in_channels, out_channels, center_size=7):
            super(AdjustAllLayer_2, self).__init__()
            self.num = len(out_channels)
            if self.num == 1:
                self.downsample = AdjustLayer_2(in_channels[0],
                                              out_channels[0],
                                              center_size)
            else:
                for i in range(self.num):
                    self.add_module('downsample'+str(i+2),
                                    AdjustLayer_2(in_channels[i],
                                                out_channels[i],
                                                center_size))
    
        def forward(self, features):
            if self.num == 1:
                return self.downsample(features)
            else:
                out = []
                for i in range(self.num):
                    adj_layer = getattr(self, 'downsample'+str(i+2))
                    out.append(adj_layer(features[i]))
                return out
            
    # End of Class for Adjusting the layers of the neural net
    
    # Class for Region Proposal Neural Network
    
    class RPN(nn.Module):
        "Region Proposal Network"
        def __init__(self):
            super(RPN, self).__init__()
    
        def forward(self, z_f, x_f):
            raise NotImplementedError
            
    class DepthwiseXCorr(nn.Module):
        "Depthwise Correlation Layer"
        def __init__(self, in_channels, hidden, out_channels, kernel_size=3, hidden_kernel_size=5):
            super(DepthwiseXCorr, self).__init__()
            self.conv_kernel = nn.Sequential(
                    nn.Conv2d(in_channels, hidden, kernel_size=kernel_size, bias=False),
                    nn.BatchNorm2d(hidden),
                    nn.ReLU(inplace=True),
                    )
            self.conv_search = nn.Sequential(
                    nn.Conv2d(in_channels, hidden, kernel_size=kernel_size, bias=False),
                    nn.BatchNorm2d(hidden),
                    nn.ReLU(inplace=True),
                    )
            self.head = nn.Sequential(
                    nn.Conv2d(hidden, hidden, kernel_size=1, bias=False),
                    nn.BatchNorm2d(hidden),
                    nn.ReLU(inplace=True),
                    nn.Conv2d(hidden, out_channels, kernel_size=1)
                    )
            
        def forward(self, kernel, search):    
            kernel = self.conv_kernel(kernel)
            search = self.conv_search(search)
            
            feature = xcorr_depthwise(search, kernel)
            
            out = self.head(feature)
            
            return out
    
    class DepthwiseRPN(RPN):
        def __init__(self, anchor_num=5, in_channels=256, out_channels=256):
            super(DepthwiseRPN, self).__init__()
            self.cls = DepthwiseXCorr(in_channels, out_channels, 2 * anchor_num)
            self.loc = DepthwiseXCorr(in_channels, out_channels, 4 * anchor_num)
    
        def forward(self, z_f, x_f):
            cls = self.cls(z_f, x_f)
            loc = self.loc(z_f, x_f)
            
            return cls, loc
    
    class MultiRPN(RPN):
        def __init__(self, anchor_num, in_channels):
            super(MultiRPN, self).__init__()
            for i in range(len(in_channels)):
                self.add_module('rpn'+str(i+2),
                        DepthwiseRPN(anchor_num, in_channels[i], in_channels[i]))
            self.weight_cls = nn.Parameter(torch.Tensor([0.38156851768108546, 0.4364767608115956,  0.18195472150731892]))
            self.weight_loc = nn.Parameter(torch.Tensor([0.17644893463361863, 0.16564198028417967, 0.6579090850822015]))
    
        def forward(self, z_fs, x_fs):
            cls = []
            loc = []
            
            rpn2 = self.rpn2
            z_f2 = z_fs[0]
            x_f2 = x_fs[0]
            c2,l2 = rpn2(z_f2, x_f2)
            
            cls.append(c2)
            loc.append(l2)
            
            rpn3 = self.rpn3
            z_f3 = z_fs[1]
            x_f3 = x_fs[1]
            c3,l3 = rpn3(z_f3, x_f3)
            
            cls.append(c3)
            loc.append(l3)
            
            rpn4 = self.rpn4
            z_f4 = z_fs[2]
            x_f4 = x_fs[2]
            c4,l4 = rpn4(z_f4, x_f4)
            
            cls.append(c4)
            loc.append(l4)
            
            def avg(lst):
                return sum(lst) / len(lst)
    
            def weighted_avg(lst, weight):
                s = 0
                fixed_len = 3
                for i in range(3):
                    s += lst[i] * weight[i]
                return s
    
            return weighted_avg(cls, self.weight_cls), weighted_avg(loc, self.weight_loc)
            
    # End of class for RPN
    
    def conv3x3(in_planes, out_planes, stride=1, dilation=1):
        "3x3 convolution with padding"
        return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                         padding=dilation, bias=False, dilation=dilation)
    
    def xcorr_depthwise(x, kernel):
        """
        Depthwise convolution for input and weights with different shapes
        """
        batch = kernel.size(0)
        channel = kernel.size(1)
        x = x.view(1, batch*channel, x.size(2), x.size(3))
        kernel = kernel.view(batch*channel, 1, kernel.size(2), kernel.size(3))
        conv = nn.Conv2d(batch*channel, batch*channel, kernel_size=(kernel.size(2), kernel.size(3)), bias=False, groups=batch*channel)
        conv.weight = nn.Parameter(kernel)
        out = conv(x) 
        out = out.view(batch, channel, out.size(2), out.size(3))
        out = out.detach()
        return out
    
    class TargetNetBuilder(nn.Module):
        def __init__(self):
            super(TargetNetBuilder, self).__init__()
            # Build Backbone Model
            self.backbone = ResNet(Bottleneck, [3,4,6,3], [2,3,4])
            # Build Neck Model
            self.neck = AdjustAllLayer_1([512,1024,2048], [256,256,256])
        
        def forward(self, frame):
            features = self.backbone(frame)
            output = self.neck(features)
            return output
    
    class SearchNetBuilder(nn.Module):
        def __init__(self):
            super(SearchNetBuilder, self).__init__()
            # Build Backbone Model
            self.backbone = ResNet(Bottleneck, [3,4,6,3], [2,3,4])
            # Build Neck Model
            self.neck = AdjustAllLayer_2([512,1024,2048], [256,256,256])
            
        def forward(self, frame):
            features = self.backbone(frame)
            output = self.neck(features)
            return output
     
    class RPNBuilder(nn.Module):
        def __init__(self):
            super(RPNBuilder, self).__init__()
    
            # Build Adjusted Layer Builder
            self.rpn_head = MultiRPN(anchor_num=5,in_channels=[256, 256, 256])
    
        def forward(self, zf, xf):
            # Get Feature
            cls, loc = self.rpn_head(zf, xf)
    
            return cls, loc
        
    """Load path should be the directory of the pre-trained siamrpn_r50_l234_dwxcorr.pth
     The download link to siamrpn_r50_l234_dwxcorr.pth is shown in the description"""
    
    current_path = os.getcwd()
    load_path = os.path.join(current_path, "siamrpn_r50_l234_dwxcorr.pth")
    pretrained_dict = torch.load(load_path,map_location=torch.device('cpu') )
    pretrained_dict_backbone = pretrained_dict
    pretrained_dict_neck_1 = pretrained_dict
    pretrained_dict_neck_2 = pretrained_dict
    pretrained_dict_head = pretrained_dict
    pretrained_dict_target = pretrained_dict
    pretrained_dict_search = pretrained_dict
    
    # The shape of the inputs to the Target Network and the Search Network
    target = torch.Tensor(np.random.rand(1,3,127,127))
    search = torch.Tensor(np.random.rand(1,3,125,125))
    
    # Build the torch backbone model
    target_net = TargetNetBuilder()
    target_net.eval()
    target_net.state_dict().keys()
    target_net_dict = target_net.state_dict()
    
    # Load the pre-trained weight to the torch target net model
    pretrained_dict_target = {k: v for k, v in pretrained_dict_target.items() if k in target_net_dict}
    target_net_dict.update(pretrained_dict_target)
    target_net.load_state_dict(target_net_dict)
    
    # Export the torch target net model to ONNX model
    torch.onnx.export(target_net, torch.Tensor(target), "target_net.onnx", export_params=True, opset_version=11,
                      do_constant_folding=True, input_names=['input'], output_names=['output_1', 'output_2', 'output_3'])
    
    # Load the saved torch target net model using ONNX
    onnx_target = onnx.load("target_net.onnx")
    
    # Check whether the ONNX target net model has been successfully imported
    onnx.checker.check_model(onnx_target)
    print(onnx.checker.check_model(onnx_target))
    onnx.helper.printable_graph(onnx_target.graph)
    print(onnx.helper.printable_graph(onnx_target.graph))
    
    # Build the torch backbone model
    search_net = SearchNetBuilder()
    search_net.eval()
    search_net.state_dict().keys()
    search_net_dict = search_net.state_dict()
    
    # Load the pre-trained weight to the torch target net model
    pretrained_dict_search = {k: v for k, v in pretrained_dict_search.items() if k in search_net_dict}
    search_net_dict.update(pretrained_dict_search)
    search_net.load_state_dict(search_net_dict)
    
    # Export the torch target net model to ONNX model
    torch.onnx.export(search_net, torch.Tensor(search), "search_net.onnx", export_params=True, opset_version=11,
                      do_constant_folding=True, input_names=['input'], output_names=['output_1', 'output_2', 'output_3'])
    
    # Load the saved torch target net model using ONNX
    onnx_search = onnx.load("search_net.onnx")
    
    # Check whether the ONNX target net model has been successfully imported
    onnx.checker.check_model(onnx_search)
    print(onnx.checker.check_model(onnx_search))
    onnx.helper.printable_graph(onnx_search.graph)
    print(onnx.helper.printable_graph(onnx_search.graph))
    
    # Outputs from the Target Net and Search Net
    zfs_1, zfs_2, zfs_3 = target_net(torch.Tensor(target))
    xfs_1, xfs_2, xfs_3 = search_net(torch.Tensor(search))
    
    # Adjustments to the outputs from each of the neck models to match to input shape of the torch rpn_head model
    zfs = np.stack([zfs_1.detach().numpy(), zfs_2.detach().numpy(), zfs_3.detach().numpy()])
    xfs = np.stack([xfs_1.detach().numpy(), xfs_2.detach().numpy(), xfs_3.detach().numpy()])
    
    # Build the torch rpn_head model
    rpn_head = RPNBuilder()
    rpn_head.eval()
    rpn_head.state_dict().keys()
    rpn_head_dict = rpn_head.state_dict()
    
    # Load the pre-trained weights to the rpn_head model
    pretrained_dict_head = {k: v for k, v in pretrained_dict_head.items() if k in rpn_head_dict}
    pretrained_dict_head.keys()
    rpn_head_dict.update(pretrained_dict_head)
    rpn_head.load_state_dict(rpn_head_dict)
    rpn_head.eval()
    
    # Export the torch rpn_head model to ONNX model
    torch.onnx.export(rpn_head, (torch.Tensor(np.random.rand(*zfs.shape)), torch.Tensor(np.random.rand(*xfs.shape))), "rpn_head.onnx", export_params=True, opset_version=11,
                      do_constant_folding=True, input_names = ['input_1', 'input_2'], output_names = ['output_1', 'output_2'])
    
    # Load the saved rpn_head model using ONNX
    onnx_rpn_head_model = onnx.load("rpn_head.onnx")
    
    # Check whether the rpn_head model has been successfully imported
    onnx.checker.check_model(onnx_rpn_head_model)
    print(onnx.checker.check_model(onnx_rpn_head_model))    
    onnx.helper.printable_graph(onnx_rpn_head_model.graph)
    print(onnx.helper.printable_graph(onnx_rpn_head_model.graph))
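
    For completeness, a minimal sketch of loading the exported models back with OpenCV's DNN module, roughly as the siamrpnpp.py sample does (file names match the exports above):

    import cv2

    # Load the three exported branches of the tracker with OpenCV DNN.
    target_net = cv2.dnn.readNetFromONNX('target_net.onnx')
    search_net = cv2.dnn.readNetFromONNX('search_net.onnx')
    rpn_head = cv2.dnn.readNetFromONNX('rpn_head.onnx')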
    
    
  • Refactor core module for type-safety

    Refactor core module for type-safety

    Merge with opencv/opencv_contrib#1768 and opencv/opencv_extra#518

    This pull request changes:

    • Provides a backward-compatible API based on enums for type-safety. Refer to #12288 for further details

    Utilized options/preprocessors:

    • CV_TYPE_SAFE_API with CMake option ENABLE_TYPE_SAFE_API to enable the enum-based type-safe API
    • CV_TYPE_COMPATIBLE_API with CMake option ENABLE_COMPATIBLE_API to enable the overloaded int-based API. Although it is enabled by default and using it raises deprecation warnings, it is still recommended to disable it in the build farm to enforce good practices. CV_COMPATIBLE_API is only available when ENABLE_TYPE_SAFE_API is set.
    • CV_TRANSNATIONAL_API is utilized internally and should be removed after the migration completes.
    force_builders=Custom,linux32,win32,windows_vc15,Android pack,ARMv7,ARMv8
    docker_image:Custom=ubuntu-cuda:16.04
    docker_image:Docs=docs-js
    
  • (5.x) Merge 4.x

    (5.x) Merge 4.x

    PRs (253):

    #20367 from augustinmanecy:features2d-rw #20379 from zihaomu:stackblur #21738 from rogday:gather #21745 from alalek:dnn_plugin_openvino #21934 from Yulv-git:3.4-typos2 #21942 from pglotov:add-blob-contours #21945 from driftee:fix-3rdparty_carotene_blur #21966 from Harvey-Huang:Unicode_Path #22017 from xiong-jie-y:py_onnx #22037 from xiong-jie-y:py_gapi_add_state_kernel #22040 from CNClareChen:4.x #22074 from bwang30:opencv-warpAffine-ippiw #22128 from ocpalo:multipage_img_decoder #22130 from catree:homography_tutorial_add_exercise #22145 from danopdev:issues-22141 #22164 from lamm45:hough-angles #22194 from heavyrain-lzy:fixbug_pyrup #22226 from ocpalo:libspng #22227 from danopdev:issue-22198 #22236 from mizo:v4l2-multi-planar-v2 #22248 from cudawarped:ffmpeg_rtsp_low_fps #22285 from asenyaev:asen/disabled_compiling_warnings_3.4 #22286 from asenyaev:asen/disabled_compiling_warnings_4.x #22290 from fengyuentau:naive_yolov7 #22306 from zihaomu:qgemm_and_squeeze_opset13_onnximporter #22329 from chinery:stitching-py-fixes #22332 from komakai:android-cam-stride #22333 from cudawarped:fix_for_21101 #22337 from zihaomu:load_ONNX_fp16_as_fp32 #22343 from komakai:android_cam_polling #22346 from fengyuentau:mat1d_part1 #22347 from bu3w:filter-camera-streaming-by-format #22353 from hanliutong:more-rvv-intrin #22358 from AleksandrPanov:qrcode_x86_arm #22362 from fengyuentau:conv_asym_pad_fuse #22368 from AleksandrPanov:move_contrib_aruco_to_main_objdetect #22371 from kianelbo:patch-1 #22372 from ocpalo:libjpegturbo_nasm #22382 from AleksandrPanov:qrcode_x86_arm_34 #22404 from Kumataro:3.4-fix22388_2 #22407 from Biswa96:cmake-pkgconfig-mingw #22410 from zihaomu:silu_support #22411 from zihaomu:remove_whitespace #22412 from asenyaev/asen/carotene_warnings_macos_arm64 #22425 from AleksandrPanov:qrcode_test_arm #22429 from hanliutong:more-rvv-intrin #22432 from dmatveev:dm/ade_012a #22436 from Harvey-Huang:4-bit_palette_color #22440 from zihaomu:fix_conv_bug #22443 from catree:feat_calibrate_camera_exe_initial_guess #22444 from catree:feat_calibrate_camera_exe_initial_guess_4.x #22448 from Ichini24:reshape-permutations-fix #22451 from dmatveev:dm/abstract_execs #22454 from zihaomu:bug_fix_22450 #22456 from TolyaTalamanov:at/onevpl-fixes-linux #22462 from Biswa96:fix-directx-check #22463 from hanliutong:rvv #22466 from sturkmen72:patch-4 #22474 from alalek:fix_msvs_warning_ffmpeg_test #22475 from cabelo:yolov4_models #22478 from WanliZhong:nary_eltwise_cuda #22479 from scottchou007:master #22480 from gr8jam/fix/pattern_tools:undesiarabe_behaviour_on_windows #22482 from Glutamat42:tutorial_generalized_hough_transform #22494 from TolyaTalamanov:at/expose-all-core-to-python #22495 from cpoerschke:4.x-issue-22483 #22497 from ocpalo:patch-1 #22498 from alalek:update_ffmpeg_4.x #22501 from cabelo:oak-devices-docs #22503 from danopdev:android-video-writter #22504 from hflemmen:fix-broken-link #22505 from asmorkalov:as/matcher_score_thresh #22507 from TolyaTalamanov:at/replace-mfx-major-version-assert-to-warning #22508 from sunshinemyson:TIMVX #22511 from alalek:dnn_build_warning_gcc12 #22512 from alalek:build_warning_gcc12_uninitialized #22516 from ocpalo:patch-1 #22518 from TolyaTalamanov:at/expand-modeling-tool-to-support-config-in-yml #22519 from stefan-spiss:stereo_calib_per_obj_extr_ret #22520 from hanliutong:hsv #22521 from asmorkalov/as:cuddn_version_non_cache #22523 from fengyuentau:update_mirrors_220916 #22526 from paroj:pyrect #22527 from paroj:misc #22529 from fengyuentau:scatter_scatternd #22531 
from zihaomu:stop_rely_name #22535 from sashashura:patch-2 #22539 from catree:feat_homography_tutorial_gif #22542 from TolyaTalamanov:at/expand-performance-report-with-metrics #22547 from dkurt:riscv_toolchain #22552 from alvoron:ocv_ov_instruction #22554 from WanliZhong:slice_axes_no_seq #22558 from hanliutong:signmask #22559 from smirnov-alexey:as/vpl_ocl #22560 from fengyuentau:enable_issue_template_chooser_with_templates #22562 from cudawarped:add_bindings_to_cuda_fastNlMeansDenoising #22566 from asmorkalov:as/libjpeg_turbo_linkage_warning #22568 from asmorkalov:as/webp_warning #22570 from alalek:fix_riscv_opt_path #22572 from catree:feat_improve_doc_calib3d #22577 from zihaomu:Disable_winograd_branch_in_tryquantize #22580 from seanm:Wextra-semi #22583 from TolyaTalamanov:at/add-cfg-output-precision-for-ie-backend #22584 from vrabaud:msan #22585 from opencv:zm/remove-code-1 #22588 from TolyaTalamanov:at/sync-ie-request-pool #22589 from ocpalo:imagecollection_warnings #22593 from zihaomu:optimize_wino #22594 from ZhaoChuyang:pr_test_for_22253 #22596 from TolyaTalamanov:at/add-num-iter #22600 from alalek:cmake_opt_force_targets #22601 from cpoerschke:4.x-issue-22595 #22606 from savuor:doc_fix_lmsolver #22611 from zihaomu:greaterOrEqual #22613 from erasta:patch-1 #22615 from cudawarped:nvcuvenc #22617 from changh95:4.x #22629 from asenyaev:asen/cuda_pipeline #22633 from cudawarped:fix_3361 #22635 from hzawary:4.x #22637 from alalek:docs_fix_links_generation_22572 #22639 from WanliZhong:issue#22625 #22643 from sturkmen72:cosmetic #22651 from mshabunin:script-doc #22652 from rogday:cuda_test_fixes #22653 from WanliZhong:issue22597 #22654 from asenyaev:asen/cuda_trigger_4.x #22656 from dkurt:halide_fixes #22659 from AleksandrPanov:qr_reduce_extra_adaptiveThreshold #22661 from catree:fix_AKAZE_bib_pages #22662 from catree:fix_chessboard_img #22666 from zihaomu:support_onnx_qdq_model #22667 from zihaomu:bug_fix_in_winograd #22672 from ramasilveyra:docs/remove-dup-v4 #22675 from CSBVision:patch-2 #22683 from alalek:android_activity_export #22684 from alalek:android_update #22687 from asmorkalov:as/yolo7_test #22689 from asmorkalov:as/ubuntu14_tk1_ffmpeg #22690 from alalek:android_config_ndk25 #22692 from asmorkalov:as/arm_debug_4x #22695 from AleksandrPanov:qr_improve_version_detect #22702 from kallaballa:ffmpeg_environment_variables #22706 from kallaballa:libavdevice_for_ffmpeg_v4l2 #22712 from dmatveev:dm/fix_va_headers #22717 from alalek:issue_22716 #22718 from zihaomu:improve_stackblur #22725 from zihaomu:fix_infinit_loop_in_tf #22726 from JopKnoppers:master #22727 from su77ungr:patch-1 #22735 from TolyaTalamanov:at/expose-all-imgproc-to-python #22737 from fwcd:activate-cocoa-window-on-top #22739 from zihaomu:remove_never_used_code #22744 from WanliZhong:fix_gitcode_mirror #22761 from reunanen:fix-floodFill-for-very-large-images #22771 from kallaballa:opencl_hls_and_hsv_conversions_bug #22775 from WanliZhong:issue22713 #22790 from reunanen:add-capability-to-set-DWA-compression-level-in-OpenEXR-encoding #22792 from tailsu:sd/avfoundation-orientation-meta #22796 from ClayXrex:patch-1 #22801 from alalek:update_zlib #22802 from zihaomu:fix_infinit_loop_in_tf_34 #22804 from dan-masek:fix_issue_22765 #22805 from dan-masek:fix_issue_22766 #22806 from dan-masek:fix_issue_22767 #22808 from zihaomu:nanotrack #22809 from fengyuentau:tile #22811 from alalek:core_check_bool #22814 from AleksandrPanov:log_qr_version #22816 from cudawarped/remove_windows_cuda_dll_warning #22828 from 
WanliZhong:improve_matmul #22830 from alalek:issue_22752 #22831 from mshabunin:fix-gapi-test-crash #22838 from dan-masek:fix_issue_22837 #22839 from zchrissirhcz:fix-typo-3.4 #22840 from zihaomu:optimze_conv_memory_usage #22842 from asmorkalov:as/pr_22737_backport #22855 from kallaballa:print_cl_status_on_fail #22857 from fengyuentau:batched_nms #22865 from cpoerschke:3.4-issue-22860 #22866 from asmorkalov:as/error_formatting #22873 from WanliZhong:issue22859 #22875 from asmorkalov:as/cl_error_code_fix #22880 from cudawarped:remember_cudacodec_lib_locations #22882 from zihaomu:gemm_first_const #22885 from asmorkalov:as/new_qt_icons #22888 from alalek:dnn_ov_fix_custom_layers #22891 from AleksandrPanov:qr_add_alignment #22894 from mshabunin:ffmpeg-16bit #22898 from fengyuentau:slice_neg_steps #22899 from mshabunin:fix-videoio-plugin #22905 from zihaomu:clean_up_conv3d_1d #22907 from partheee:patch-1 #22910 from alalek:cmake_pkg_config_ignore_atomic #22914 from tozanski:tomoz/ransac-bugfix #22919 from asmorkalov:as/gstreamer_read_timeout #22922 from alalek:fix_riscv_intrin_rvv #22924 from alalek:logger_strip_base_dir #22928 from alalek:riscv_toolchains #22930 from MaximMilashchenko:gstreamer_support #22932 from alalek:cmake_drop_libjpeg_simd_warning #22933 from alalek:fixup_22894 #22934 from alalek:fix_filestorage_binding #22935 from alalek:gapi_error #22936 from hzcyf:orbbec_new_cam_support #22937 from asmorkalov:as/issue_22893 #22939 from stopmosk:21826-python-bindings-for-videocapturewaitany #22940 from alalek:build_warnings_msvc #22942 from alalek:videoio_test_update_hw_checks #22946 from VadimLevin:dev/vlevin/avfoundation-stable-multicamera-index #22951 from zihaomu:update_nanotrack_comment #22954 from VadimLevin:dev/vlevin/fix-merge-artifacts-in-python-misc-tests #22955 from VadimLevin:dev/vlevin/handle-properties-with-keyword-names #22957 from dkurt:new_openvino_api #22958 from asmorkalov:as/ffmpeg_missing_include #22959 from feuerste:parallel_mertens #22962 from stopmosk:20465-dstchannels-does-not-cover-all-color-codes-1 #22963 from cudawarped:replace_texture_ref_with_texture_obj #22965 from vrabaud:numpy_fix #22966 from vrabaud:mm_pause_fix #22967 from stopmosk:usac-maxiters-bugfix #22971 from alalek:update_version_3.4.19-pre #22972 from alalek:update_version_4.7.0-pre #22979 from alalek:fix_videio_test_limit_threads #22980 from alalek:samples_python_3.11 #22981 from alalek:core_freeze_cache_dir_prefix_4.x #22986 from AleksandrPanov:move_contrib_charuco_to_main_objdetect #22988 from vrabaud:mm_pause_fix #22993 from alalek:fixup_21738 #22994 from alalek:gapi_build_issues #22995 from alalek:dnn_fix_opencl_matmul #22999 from mshabunin:ffmpeg-valgrind-supp #23001 from alalek:videoio_init_vars #23002 from alalek:issue_22206 #23004 from erasta:patch-2 #23005 from alalek:objdetect_cleanup_aruco_ptr_filestorage #23008 from mshabunin:fix-yolov4-tiny-hash #23009 from asmorkalov:as/aruco_js_update2 #23012 from cudawarped/fix_win32_cuda_warning #23013 from vrabaud:mertens_fix #23014 from alalek:ffmpeg_default_threads #23017 from asmorkalov:as/qrcode_valgrind #23028 from zihaomu:update_doc_nanotrack #23029 from savuor:backport3_fix_fisheye_aspect_ratio #23035 from alalek:update_ffmpeg_4.x #23036 from asmorkalov:as/blobdetect_range_fix #23037 from cudawarped:fix_for_cuda_12 #23039 from alalek:cmake_3.5_fix #23043 from AlejandroSilvestri:patch-1 #23049 from alalek:issue_23048 #23050 from zihaomu:fix_memory a2b3acfc6e dnn: add the CANN backend (#22634) bb64db98d8 Further optimization of Conv2D, 
fused Conv_Add_Activation, bring latest code from ficus OpConv.fx. (#22401)

    Previous "Merge 4.x": #22408

    Note: #22519 is not ported

    build_image:Docs=docs-js:18.04
    build_image:Custom=javascript
    buildworker:Custom=linux-1
    
  • DNN: fix possible segmentation fault error in winograd on x86

    DNN: fix possible segmentation fault error in winograd on x86

    Pull Request Readiness Checklist

    See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

    • [x] I agree to contribute to the project under Apache 2 License.
    • [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
    • [x] The PR is proposed to the proper branch
    • [ ] There is a reference to the original bug report and related work
    • [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name.
    • [ ] The feature is well documented and sample code can be built with the project CMake
  • Enable CV_CPU_NEON_DOTPROD on Apple silicon devices

    Enable CV_CPU_NEON_DOTPROD on Apple silicon devices

    Fixes https://github.com/opencv/opencv/issues/23085.

    After https://github.com/opencv/opencv/pull/22271 was merged, OpenCV could no longer initialize on M1 Macs because the required NEON_DOTPROD feature was reported as missing. This patch simply checks whether __ARM_FEATURE_DOTPROD is defined (which should be true when compiling for M1 Macs) and turns on CV_CPU_NEON_DOTPROD if the feature is supported at runtime (see the sketch after the checklist below).

    Pull Request Readiness Checklist

    See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

    • [x] I agree to contribute to the project under Apache 2 License.
    • [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
    • [x] The PR is proposed to the proper branch
    • [x] There is a reference to the original bug report and related work
    • [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name.
    • [ ] The feature is well documented and sample code can be built with the project CMake
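
    A minimal sketch of the idea, assuming the standard ARM ACLE macro __ARM_FEATURE_DOTPROD and OpenCV's public cv::checkHardwareSupport() API; the helper name is hypothetical and this is not the exact patch:

    #include <opencv2/core/utility.hpp>   // cv::checkHardwareSupport, CV_CPU_NEON_DOTPROD

    // Hypothetical helper: take the optional DOTPROD path only when the binary
    // was compiled with DOTPROD support AND the running CPU reports the feature.
    static bool canUseDotProd()
    {
    #ifdef __ARM_FEATURE_DOTPROD
        return cv::checkHardwareSupport(CV_CPU_NEON_DOTPROD);
    #else
        return false;
    #endif
    }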
  • Fix Winograd

    Fix Winograd

    Pull Request Readiness Checklist

    See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

    • [x] I agree to contribute to the project under Apache 2 License.
    • [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
    • [x] The PR is proposed to the proper branch
    • [x] There is a reference to the original bug report and related work
    • [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name.
    • [ ] The feature is well documented and sample code can be built with the project CMake

    This PR should fix #23072 and https://github.com/cocoa-xu/evision/issues/153.

    When compiling on GitHub's macOS runner, CV_TRY_AVX2 and CV_SIMD128 will both be set to 1, which leads to _FX_WINO_IBLOCK=6 and _FX_WINO_ATOM_F32=8.

    #if (CV_NEON && CV_NEON_AARCH64) || CV_TRY_AVX2
        _FX_WINO_IBLOCK = 6,
    #else
        _FX_WINO_IBLOCK = 3,
    #endif
    
    #if CV_TRY_AVX2
        _FX_WINO_ATOM_F32 = 8,
    #else
        _FX_WINO_ATOM_F32 = 4,
    #endif
    

    In runWinograd63, the following statements

    #if CV_TRY_AVX2
                            if (conv->useAVX2)
                                opt_AVX2::_fx_winograd_BtXB_8x8_f32(inptr, inpstep, inwptr, Cg);
                            else
    #endif
                            _fx_winograd_BtXB_8x8_f32(inptr, inpstep, inwptr, Cg);
    

    will be evaluated to

                            if (conv->useAVX2)
                                opt_AVX2::_fx_winograd_BtXB_8x8_f32(inptr, inpstep, inwptr, Cg);
                            else
                                 _fx_winograd_BtXB_8x8_f32(inptr, inpstep, inwptr, Cg);
    

    However, runtime CPU feature detection reports that this CPU does not support AVX2 (or the runner's CPU genuinely doesn't support AVX2; either way, this does not affect the conclusion and the changes below):

    (lldb) p cv::checkHardwareSupport(CpuFeatures::CPU_AVX2)
    (bool) $1 = false
    

    Therefore the code takes the false branch, cv::dnn::_fx_winograd_BtXB_8x8_f32, with the incorrect values of _FX_WINO_IBLOCK and _FX_WINO_ATOM_F32 that were computed at compile time, which eventually causes illegal memory access.

    Since we are detecting CPU features at runtime (if (conv->useAVX2)) and processing the data in the corresponding functions, it would be better to use variables (iblock, atom_f32 and natoms_f32) to store the correct sizes and pass them to these functions (see the sketch at the end of this comment). :)

    (IMHO, this is also why we couldn't reproduce this segmentation fault locally: the CPU on the local machine supports AVX2, so the faulty branch was never taken.)

    /cc @zihaomu
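
    A minimal sketch of that approach, assuming hypothetical names (WinoParams, chooseWinoParams) and that the runtime sizes mirror the compile-time constants above; this is an illustration, not the actual OpenCV patch:

    #include <opencv2/core/utility.hpp>   // cv::checkHardwareSupport, CV_CPU_AVX2

    // Hypothetical container for the Winograd tile sizes selected at runtime.
    struct WinoParams { int iblock, atom_f32, natoms_f32; };

    // Pick the sizes from the same runtime check used for dispatch and pass
    // them to the kernels, instead of relying on compile-time constants that
    // may not match the executing CPU.
    static WinoParams chooseWinoParams()
    {
        const bool useAVX2 = cv::checkHardwareSupport(CV_CPU_AVX2);
        WinoParams p;
        p.iblock     = useAVX2 ? 6 : 3;   // mirrors _FX_WINO_IBLOCK above
        p.atom_f32   = useAVX2 ? 8 : 4;   // mirrors _FX_WINO_ATOM_F32 above
        p.natoms_f32 = 64 / p.atom_f32;   // assumes an 8x8 transform tile
        return p;
    }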

  • Usage of imread(): magic number 0, unchecked result

    Usage of imread(): magic number 0, unchecked result

    Changes related to #23107 (an illustrative before/after sketch follows the checklist below).

    Pull Request Readiness Checklist

    See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

    • [x] I agree to contribute to the project under Apache 2 License.
    • [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
    • [x] The PR is proposed to the proper branch
    • [x] There is a reference to the original bug report and related work
    • [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name.
    • [x] The feature is well documented and sample code can be built with the project CMake
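
    An illustrative before/after sketch of the change described above (the file name is hypothetical): replace the magic number 0 with the named flag and check that imread() actually returned data.

    #include <opencv2/imgcodecs.hpp>
    #include <iostream>

    int main()
    {
        // Before: cv::Mat img = cv::imread("input.png", 0);  // magic number, result unchecked
        cv::Mat img = cv::imread("input.png", cv::IMREAD_GRAYSCALE);
        if (img.empty())
        {
            std::cerr << "Could not read input.png" << std::endl;
            return 1;
        }
        // ... use img ...
        return 0;
    }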