ScanNet

ScanNet is an RGB-D video dataset containing 2.5 million views in more than 1500 scans, annotated with 3D camera poses, surface reconstructions, and instance-level semantic segmentations.

ScanNet Data

If you would like to download the ScanNet data, please fill out an agreement to the ScanNet Terms of Use and send it to us at [email protected].

If you have not received a response within a week, it is likely that your email is bouncing - please check this before sending repeat requests.

Please check the changelog for updates to the data release.

Data Organization

The data in ScanNet is organized by RGB-D sequence. Each sequence is stored under a directory named scene<spaceId>_<scanId>, or scene%04d_%02d, where each space corresponds to a unique location (0-indexed). The raw data captured during scanning, camera poses and surface mesh reconstructions, and annotation metadata are all stored together for the given sequence. The directory has the following structure:

<scanId>
|-- <scanId>.sens
    RGB-D sensor stream containing color frames, depth frames, camera poses and other data
|-- <scanId>_vh_clean.ply
    High quality reconstructed mesh
|-- <scanId>_vh_clean_2.ply
    Cleaned and decimated mesh for semantic annotations
|-- <scanId>_vh_clean_2.0.010000.segs.json
    Over-segmentation of annotation mesh
|-- <scanId>.aggregation.json, <scanId>_vh_clean.aggregation.json
    Aggregated instance-level semantic annotations on lo-res, hi-res meshes, respectively
|-- <scanId>_vh_clean_2.0.010000.segs.json, <scanId>_vh_clean.segs.json
    Over-segmentation of lo-res, hi-res meshes, respectively (referenced by aggregated semantic annotations)
|-- <scanId>_vh_clean_2.labels.ply
    Visualization of aggregated semantic segmentation; colored by nyu40 labels (see img/legend; ply property 'label' denotes the nyu40 label id)
|-- <scanId>_2d-label.zip
    Raw 2d projections of aggregated annotation labels as 16-bit pngs with ScanNet label ids
|-- <scanId>_2d-instance.zip
    Raw 2d projections of aggregated annotation instances as 8-bit pngs
|-- <scanId>_2d-label-filt.zip
    Filtered 2d projections of aggregated annotation labels as 16-bit pngs with ScanNet label ids
|-- <scanId>_2d-instance-filt.zip
    Filtered 2d projections of aggregated annotation instances as 8-bit pngs

Data Formats

The following are overviews of the data formats used in ScanNet:

Reconstructed surface mesh file (*.ply): Binary PLY format mesh with +Z axis in upright orientation.
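
For example, a minimal sketch of loading a mesh with the third-party plyfile package (the scan id is a placeholder); for the *_vh_clean_2.labels.ply variant described above, the per-vertex 'label' property carries the nyu40 id:

import numpy as np
from plyfile import PlyData

# Load a reconstructed mesh; all meshes are binary PLY with +Z up.
ply = PlyData.read('scene0000_00_vh_clean_2.labels.ply')
verts = np.stack([np.asarray(ply['vertex'][a]) for a in ('x', 'y', 'z')], axis=1)  # N x 3
nyu40_ids = np.asarray(ply['vertex']['label'])  # per-vertex nyu40 label id (labels.ply only)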

RGB-D sensor stream (*.sens): Compressed binary format with per-frame color, depth, camera pose and other data. See ScanNet C++ Toolkit for more information and parsing code. See SensReader/python for a very basic python data exporter.
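
As a rough sketch of using the python exporter (class and method names as in SensReader/python/SensorData.py; exact arguments may differ, so check that file):

from SensorData import SensorData  # SensReader/python

sd = SensorData('scene0000_00/scene0000_00.sens')  # parse the compressed stream
sd.export_depth_images('out/depth')                # depth frames as 16-bit PNGs
sd.export_color_images('out/color')                # color frames
sd.export_poses('out/pose')                        # 4x4 camera-to-world matrices
sd.export_intrinsics('out/intrinsic')              # camera intrinsics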

Surface mesh segmentation file (*.segs.json):

{
  "params": {  // segmentation parameters
   "kThresh": "0.0001",
   "segMinVerts": "20",
   "minPoints": "750",
   "maxPoints": "30000",
   "thinThresh": "0.05",
   "flatThresh": "0.001",
   "minLength": "0.02",
   "maxLength": "1"
  },
  "sceneId": "...",  // id of segmented scene
  "segIndices": [1,1,1,1,3,3,15,15,15,15],  // per-vertex index of mesh segment
}
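
For instance, a minimal sketch (placeholder file name) of grouping mesh vertices by segment id from this file:

import json

with open('scene0000_00_vh_clean_2.0.010000.segs.json') as f:
    seg_indices = json.load(f)['segIndices']  # one segment id per mesh vertex

seg_to_verts = {}  # segment id -> list of vertex indices
for vert, seg in enumerate(seg_indices):
    seg_to_verts.setdefault(seg, []).append(vert)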

Aggregated semantic annotation file (*.aggregation.json):

{
  "sceneId": "...",  // id of annotated scene
  "appId": "...", // id + version of the tool used to create the annotation
  "segGroups": [
    {
      "id": 0,
      "objectId": 0,
      "segments": [1,4,3],
      "label": "couch"
    },
  ],
  "segmentsFile": "..." // id of the *.segs.json segmentation file referenced
}

BenchmarkScripts/util_3d.py gives examples for parsing the semantic instance information from the *.segs.json, *.aggregation.json, and *_vh_clean_2.ply mesh file, with an example semantic segmentation visualization in BenchmarkScripts/3d_helpers/visualize_labels_on_mesh.py.
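
In that spirit, a minimal sketch (placeholder paths) of deriving per-vertex instance ids and category labels from the two JSON files:

import json
import numpy as np

with open('scene0000_00_vh_clean_2.0.010000.segs.json') as f:
    seg_indices = np.array(json.load(f)['segIndices'])  # per-vertex segment id
with open('scene0000_00.aggregation.json') as f:
    seg_groups = json.load(f)['segGroups']

labels = np.full(len(seg_indices), '', dtype=object)          # per-vertex category name
instance_ids = np.full(len(seg_indices), -1, dtype=np.int32)  # per-vertex instance id
for group in seg_groups:
    mask = np.isin(seg_indices, group['segments'])  # vertices whose segment is in this group
    labels[mask] = group['label']
    instance_ids[mask] = group['objectId']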

2d annotation projections (*_2d-label.zip, *_2d-instance.zip, *_2d-label-filt.zip, *_2d-instance-filt.zip): Projection of 3d aggregated annotation of a scan into its RGB-D frames, according to the computed camera trajectory.
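
A minimal sketch of reading one of the projected label frames after unzipping (frame path is a placeholder; Pillow and numpy assumed):

import numpy as np
from PIL import Image

label_image = np.array(Image.open('label-filt/0.png'))  # one ScanNet label id per pixel
print(label_image.dtype, label_image.shape)             # 16-bit for labels, 8-bit for instances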

ScanNet C++ Toolkit

Tools for working with ScanNet data. SensReader loads the ScanNet .sens data of compressed RGB-D frames, camera intrinsics and extrinsics, and IMU data.

Camera Parameter Estimation Code

Code for estimating camera parameters and depth undistortion. Required to compute sensor calibration files which are used by the pipeline server to undistort depth. See CameraParameterEstimation for details.

Mesh Segmentation Code

Mesh supersegment computation code which we use to preprocess meshes and prepare them for semantic annotation. Refer to the Segmentator directory for instructions on building and using the code.

BundleFusion Reconstruction Code

ScanNet uses the BundleFusion code for reconstruction. Please refer to the BundleFusion repository at https://github.com/niessner/BundleFusion . If you use BundleFusion, please cite the original paper:

@article{dai2017bundlefusion,
  title={BundleFusion: Real-time Globally Consistent 3D Reconstruction using On-the-fly Surface Re-integration},
  author={Dai, Angela and Nie{\ss}ner, Matthias and Zollh{\"o}fer, Michael and Izadi, Shahram and Theobalt, Christian},
  journal={ACM Transactions on Graphics 2017 (TOG)},
  year={2017}
}

ScanNet Scanner iPad App

ScannerApp is designed for easy capture of RGB-D sequences using an iPad with attached Structure.io sensor.

ScanNet Scanner Data Server

Server contains the server code that receives RGB-D sequences from iPads running the Scanner app.

ScanNet Data Management UI

WebUI contains the web-based data management UI used for providing an overview of available scan data and controlling the processing and annotation pipeline.

ScanNet Semantic Annotation Tools

Code and documentation for the ScanNet semantic annotation web-based interfaces is provided as part of the SSTK library. Please refer to https://github.com/smartscenes/sstk/wiki/Scan-Annotation-Pipeline for an overview.

Benchmark Tasks

We provide code for several scene understanding benchmarks on ScanNet:

  • 3D object classification
  • 3D object retrieval
  • Semantic voxel labeling

Train/test splits are given at Tasks/Benchmark.
Label mappings and trained models can be downloaded with the ScanNet data release.

See Tasks.

Labels

The label mapping file (scannet-labels.combined.tsv) in the ScanNet task data release contains mappings from the labels provided in the ScanNet annotations (id) to the object category sets of NYUv2, ModelNet, ShapeNet, and WordNet synsets. Download it along with the task data (--task_data) or by itself (--label_map).
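
For example, a rough sketch of loading the mapping (the column names 'id' and 'nyu40id' are assumptions; check the header of the downloaded TSV):

import csv

with open('scannet-labels.combined.tsv') as f:
    rows = list(csv.DictReader(f, delimiter='\t'))

# Map ScanNet annotation label ids to NYUv2-40 class ids.
scannet_to_nyu40 = {row['id']: row['nyu40id'] for row in rows}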

Citation

If you use the ScanNet data or code please cite:

@inproceedings{dai2017scannet,
    title={ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes},
    author={Dai, Angela and Chang, Angel X. and Savva, Manolis and Halber, Maciej and Funkhouser, Thomas and Nie{\ss}ner, Matthias},
    booktitle = {Proc. Computer Vision and Pattern Recognition (CVPR), IEEE},
    year = {2017}
}

Help

If you have any questions, please contact us at [email protected]

Changelog

License

The data is released under the ScanNet Terms of Use, and the code is released under the MIT license.

Copyright (c) 2017

Comments
  • Possible bug in the instance segmentation evaluation script

    While examining evaluate_semantic_instance.py I found a possible bug:

    https://github.com/ScanNet/ScanNet/blob/fc255fefa3918a21930681ce7c52502a4447c228/BenchmarkScripts/3d_evaluation/evaluate_semantic_instance.py#L267

    I believe that the gt_ids are label_id*1000+inst_id, whereas the VALID_CLASS_IDS are just label_ids, so bool_void will always be all zeros. Besides, the current formula seems to compute the non-void mask. The right way to compute bool_void is probably:

    bool_void = np.logical_not(np.in1d((gt_ids/1000).astype(np.int32), VALID_CLASS_IDS))

    This bug may not affect the result too much but should be fixed.

  • aggregation index out of range

    I am reading _vh_clean.aggregation.json for per-point annotations; however, for train scene 0002_00, I got the following error: index 307925 is out of bounds for axis 1 with size 297362. It seems that the index stored in aggregation.json exceeds the total number of points.

  • Having trouble dumping depth from scene0207_02: zlib.error: Error -3 while decompressing data: invalid distance code

    I am using https://github.com/ScanNet/ScanNet/blob/1cc5149d30d248b776039c76ae8864e63df18831/SensReader/python/reader.py to read .sens files, but I ran into the following error for scene0207_02 only; other scenes read and dump fine.

    Traceback (most recent call last):
      File "./SensReader/python/reader.py", line 79, in <module>
        main()
      File "./SensReader/python/reader.py", line 35, in main
        sd.export_depth_images(os.path.join(opt.output_path, 'depth'))
      File "/home/zhongad/3D_workspace/dataset/ScanNet/SensReader/python/SensorData.py", line 78, in export_depth_images
        depth_data = self.frames[f].decompress_depth(self.depth_compression_type)
      File "/home/zhongad/3D_workspace/dataset/ScanNet/SensReader/python/SensorData.py", line 26, in decompress_depth
        return self.decompress_depth_zlib()
      File "/home/zhongad/3D_workspace/dataset/ScanNet/SensReader/python/SensorData.py", line 31, in decompress_depth_zlib
        return zlib.decompress(self.depth_data)
    zlib.error: Error -3 while decompressing data: invalid distance code
    

    Does anyone know how to solve this? P.S. I downloaded the dataset in Nov 2021.

  • Registering depth maps to world point cloud

    Hello,

    I am trying to compute a combined point cloud by aggregating depth maps across all cameras. After unprojecting depth, [X_i,Y_i,Z_i] = unproject(depth_i), each point cloud needs to be transformed by its camera pose to get world coordinates, [X_i,Y_i,Z_i].transform(camera_pose_i), and similarly for frame j: [X_j,Y_j,Z_j].transform(camera_pose_j). I was expecting that these two transformed point clouds would be perfectly aligned. Am I missing something?
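
    (For reference, a minimal numpy sketch of the procedure described above, assuming depth/pose/intrinsic files exported with SensReader/python; paths are placeholders.)

    import numpy as np
    from PIL import Image

    def depth_to_world(depth_png, intrinsic_txt, pose_txt, depth_shift=1000.0):
        depth = np.array(Image.open(depth_png)).astype(np.float32) / depth_shift  # meters
        K = np.loadtxt(intrinsic_txt)[:3, :3]               # depth intrinsics
        pose = np.loadtxt(pose_txt)                         # 4x4 camera-to-world
        v, u = np.nonzero(depth > 0)                        # pixel rows (y) and columns (x)
        z = depth[v, u]
        x = (u - K[0, 2]) * z / K[0, 0]
        y = (v - K[1, 2]) * z / K[1, 1]
        cam = np.stack([x, y, z, np.ones_like(z)], axis=1)  # N x 4 homogeneous camera coords
        return (cam @ pose.T)[:, :3]                        # N x 3 world coordinates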

  • Corrupted data

    I noticed that there are some camera poses in the dataset with invalid data, i.e. matrices containing only inf values. It’s not a big issue, because this only applies to a small portion of frames, but it should at least be documented (I haven’t seen it mentioned so far).

    Further, I think that many sequences don’t have timestamps, but sometimes random frames seem to have random stamps. Maybe the value wasn’t always initialised?

    Here is a small program that can highlight these issues:

    #include <fstream>
    #include <iostream>
    #include <string>
    #include <tuple>
    #include "../external/scannet/sensorData.h"
    
    using namespace std;
    
    tuple<bool,bool> analyze_sens(const string& input, bool verbose = false){
    
        // Input
        if(verbose){
            cout << "Loading data ... ";
            cout.flush();
        }
        ml::SensorData sd(input);
        if(verbose){
            cout << "done!" << endl;
            cout << sd << endl;
        }
    
        // Stats
        bool ts_d_monotonic = true;
        bool ts_c_monotonic = true;
        bool ts_d_available = false;
        bool ts_c_available = false;
        uint64_t ts_d_last = 0;
        uint64_t ts_c_last = 0;
    
        bool has_illegal_transformation = false;
    
        for (size_t i = 0; i < sd.m_frames.size(); i++) {
    
            // Test timestamps
            const ml::SensorData::RGBDFrame& frame = sd.m_frames[i];
            uint64_t t_d = frame.getTimeStampDepth();
            uint64_t t_c = frame.getTimeStampColor();
            if (t_d > 0) ts_d_available = true;
            if (t_c > 0) ts_c_available = true;
            if (t_d < ts_d_last) ts_d_monotonic = false;
            if (t_c < ts_c_last) ts_c_monotonic = false;
            ts_d_last = t_d;
            ts_c_last = t_c;
    
            // Test poses
            ml::mat4f t = frame.getCameraToWorld();
            if(t.matrix[15] != 1 || t.matrix[14] != 0 || t.matrix[13] != 0 || t.matrix[12] != 0){
                has_illegal_transformation = true;
                if(verbose)
                    cout << "Found illegal transformation at frame " << to_string(i) << ": ["
                         << t.matrix[0] << ", " << t.matrix[1] << ", " << t.matrix[2] << ", " <<t.matrix[3] << "]["
                         << t.matrix[4] << ", " << t.matrix[5] << ", " << t.matrix[6] << ", " <<t.matrix[7] << "]["
                         << t.matrix[8] << ", " << t.matrix[9] << ", " << t.matrix[10] << ", " <<t.matrix[11] << "]["
                         << t.matrix[12] << ", " << t.matrix[13] << ", " << t.matrix[14] << ", " <<t.matrix[15] << "]]" << endl;
            }
        }
    
        if(verbose){
            cout << "Depth timestamps are monotonic: " << (ts_d_monotonic ? "\x1B[32m yes" : "\x1B[31m no") << "\x1B[0m \n";
            cout << "RGB   timestamps are monotonic: " << (ts_c_monotonic ? "\x1B[32m yes" : "\x1B[31m no") << "\x1B[0m \n";
            cout << "Depth timestamps are available: " << (ts_d_available ? "\x1B[32m yes" : "\x1B[31m no") << "\x1B[0m \n";
            cout << "RGB   timestamps are available: " << (ts_c_available ? "\x1B[32m yes" : "\x1B[31m no") << "\x1B[0m \n";
            cout << "All  camera  poses  were legal: " << (!has_illegal_transformation ? "\x1B[32m yes" : "\x1B[31m no") << "\x1B[0m \n";
            cout << endl;
        }
        return make_tuple(!ts_d_monotonic || !ts_c_monotonic, has_illegal_transformation);
    }
    
    int main(int argc, char* argv[])
    {
        if(argc < 2 || argc > 3) {
            cout << "A tool to analyse scannet *.sens data.\n\n"
                    "Error, invalid arguments.\n"
                    "Mandatory: input *.sens file / input *.txt file\n"
                    "Optional path to dataset dir (if txt is provided)."
                 << endl;
            return 1;
        }
    
        // Input data
        string filename = argv[1];
        if(filename.substr(filename.find_last_of(".") + 1) == "txt"){
            // Analyse many sens files
            string sequence_name;
            string root = (argc == 3 ? argv[2] : "");
            ifstream in_stream(filename);
            while (getline(in_stream, sequence_name)){
                cout << "Checking " << sequence_name << "...";
                cout.flush();
                tuple<bool,bool> r = analyze_sens(root + "/" + sequence_name + "/" + sequence_name  + ".sens");
    
                if(get<0>(r))
                    cout << "\x1B[31m Timestamp issue \x1B[0m";
                else
                    cout << "\x1B[32m Timestamps good \x1B[0m";
    
                if(get<1>(r))
                    cout << "\x1B[31m Pose issue \x1B[0m" << endl;
                else
                    cout << "\x1B[32m Poses good \x1B[0m" << endl;
            }
            in_stream.close();
        } else {
            // Analyse single sens files
            analyze_sens(filename, true);
        }
    
    
        return 0;
    }
    
    

    If you run this script on sequence scene0003_01.sens for example, you get a list of invalid poses, like:

    Found illegal transformation at frame 1054: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
    Found illegal transformation at frame 1071: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
    Found illegal transformation at frame 1079: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
    Found illegal transformation at frame 1080: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
    Found illegal transformation at frame 1081: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
    Found illegal transformation at frame 1082: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
    Found illegal transformation at frame 1083: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
    Found illegal transformation at frame 1084: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
    Found illegal transformation at frame 1085: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
    Found illegal transformation at frame 1086: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
    Found illegal transformation at frame 1087: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
    Found illegal transformation at frame 1088: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]

    Sequence scene0066_00.sens on the other hand has issues with timestamps. You can also run the tool on a list of sequences, giving you output like:

    Checking scene0000_00... Timestamps good Poses good
    Checking scene0000_01... Timestamps good Poses good
    Checking scene0000_02... Timestamps good Poses good
    Checking scene0001_00... Timestamps good Pose issue
    Checking scene0001_01... Timestamps good Pose issue
    Checking scene0002_00... Timestamps good Poses good
    Checking scene0002_01... Timestamps good Pose issue
    Checking scene0003_00... Timestamps good Pose issue
    Checking scene0003_01... Timestamps good Pose issue
    Checking scene0003_02... Timestamps good Pose issue
    Checking scene0004_00... Timestamps good Poses good
    Checking scene0005_00... Timestamps good Poses good
    ...

    Thanks for providing this great dataset by the way!

  • Cannot download ScanNet datasets

    When I tried to download the ScanNet datasets, I ran into an error. Please help me, thanks very much.

    $ python -V
    Python 3.7.2
    $ python download-scannet.py -o ./Scannet_Dataset/
    By pressing any key to continue you confirm that you have agreed to
    the ScanNet terms of use as described at:
    http://kaldir.vc.in.tum.de/scannet/ScanNet_TOS.pdf

    Press any key to continue, or CTRL-C to exit.
    Traceback (most recent call last):
      File "download-scannet.py", line 233, in <module>
        if __name__ == "__main__": main()
      File "download-scannet.py", line 142, in main
        key = raw_input('')
    NameError: name 'raw_input' is not defined

  • Invalid ScanNet benchmark link

    Hi there, I want to access the ScanNet benchmark at http://kaldir.vc.in.tum.de/scannet_benchmark but failed. I tried accessing from different IPs, all without success, so could you check whether there is anything wrong on your server side?

  • Cannot download the data

    Dears,

    Thanks for your amazing work!

    While trying to download the data on a new machine, I got the following error: "IOError: [Errno socket error] [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)", even though I downloaded it successfully two days ago.

    Thanks in advance!

  • Cannot align multiple views

    I am trying to map the RGB-D images into a 3D point cloud, but I have struggled for a long time to align the views. The point cloud always looks like this (this point cloud contains two views, and they are not aligned):

    [image: the two views are visibly misaligned]

    Am I doing anything wrong? This is my code:

    import matplotlib.pyplot as plt
    import numpy as np
    import os
    import random
    from PIL import Image
    from mpl_toolkits.mplot3d import Axes3D
    from sensor_data import SensorData
    
    # data extracted using SensorData.py
    DATA_PATH = ...
    RGB_PATH = ... 
    DEPTH_PATH = ...  
    POSE_PATH = ... 
    MAX_INDEX = 100 # only load the first 100 for testing
    
    def load_matrix_from_txt(path, shape=(4, 4)):
        with open(path) as f:
            txt = f.readlines()
        txt = ''.join(txt).replace('\n', ' ')
        matrix = [float(v) for v in txt.split()]
        return np.array(matrix).reshape(shape)
    
    intrinsic_depth = load_matrix_from_txt(os.path.join(DATA_PATH, 'intrinsic_depth.txt'))
    poses = [load_matrix_from_txt(os.path.join(POSE_PATH, f'{i}.txt')) for i in range(MAX_INDEX)]
    
    def load_image(path):
        image = Image.open(path)
        return np.array(image)
    
    rgb_images = [load_image(os.path.join(RGB_PATH, f'{i}.jpg')) for i in range(MAX_INDEX)]
    depth_images = [load_image(os.path.join(DEPTH_PATH, f'{i}.png'))for i in range(MAX_INDEX)]
    
    # map a rgbd pixel into 3D space. 
    # should be the same as SensorReader
    def convert_from_uvd(u, v, d, intr, pose):
        
        # const mat4f intrinsicInv = m_calibrationDepth.m_intrinsic.getInverse();
        intrinsic_inv = np.linalg.inv(intr)
        
        # float d = (float)depth[i]/m_depthShift;
        d /= 1000
        
        # vec3f cameraPos = (intrinsicInv*vec4f((float)x*d, (float)y*d, d, 0.0f)).getVec3();
        camera_pos = (intrinsic_inv.dot(np.array([u*d, v*d, d, 0.0])))[:3]
        
        # vec3f worldPos = transform * cameraPos;
        xyzw = pose[:, :3].dot(camera_pos) + pose[:, 3]
        world_pos = xyzw[:3] / xyzw[3]
        
        return world_pos
    
    x_data, y_data, z_data, c_data = [], [], [], []
    
    for idx in [20, 50]: # take two views as a test
        d = depth_images[idx]
        c = rgb_images[idx]
        p = poses[idx]
        
        # map each pixel into 3D space
        for i in range(d.shape[0]):
            for j in range(d.shape[1]):
                x, y, z = convert_from_uvd(i, j, d[i, j], intrinsic_depth, p)
                    
                x_data.append(x)
                y_data.append(y)
                z_data.append(z)
                
                ci = int(i * c.shape[0] / d.shape[0])
                cj = int(j * c.shape[1] / d.shape[1])
                c_data.append(c[ci, cj] / 255.0)
    
    # plot the 3D point cloud
    def plot_3d(xdata, ydata, zdata, color=None, b_min=-3, b_max=3, view=(45, 45)):
        fig, ax = plt.subplots(subplot_kw={"projection": "3d"}, dpi=200)
        ax.view_init(view[0], view[1])
    
        ax.set_xlim(b_min, b_max)
        ax.set_ylim(b_min, b_max)
        ax.set_zlim(b_min, b_max)
    
        ax.scatter3D(xdata, ydata, zdata, c=color, cmap='rgb', s=0.1)
    
        plt.show()
        
    plot_3d(x_data, y_data, z_data, color=c_data)
    

    For the convert_from_uvd function, I also tried the out-of-the-box implementation from PyTorch3D, which resulted in the same problem.

    from pytorch3d.renderer import PerspectiveCameras
    import torch
    
    def convert_from_uvd(u, v, d, intr, pose):
        d /= 1000
        camera = PerspectiveCameras(
            focal_length=torch.Tensor([[intr[0, 0], intr[1, 1]]]),
            principal_point=torch.Tensor([[intr[0, 2], intr[1, 2]]]),
            R=torch.from_numpy(pose[:3, :3]).unsqueeze(0),
            T=torch.from_numpy(pose[:3, 3]).unsqueeze(0)
        )
        p = camera.unproject_points(torch.Tensor([u, v, d]).unsqueeze(0))
        
        return p.tolist()[0]
    
  • Align Depth and Color with resolution 640x480

    When I want to align depth to color using the intrinsics and the colorToDepth extrinsics, there are some problems: scene0001_00/scene0001_00.txt doesn't contain a depthToColor extrinsic.

  • Cannot build Alignment project

    I cannot compile the Alignment project from this repository.

    The following error messages occurred after the build started:

    Severity	Code	Description	Project	File	Line	Suppression State
    Error	C2244	'CGAL::Segment_3<R_>::vertex': unable to match function definition to an existing declaration	alignment	D:\Work\Projects\ScanNet\external\mLibExternal\include\CGAL\Segment_3.h	192	
    Error	C2244	'CGAL::Segment_2<R_>::max': unable to match function definition to an existing declaration	alignment	D:\Work\Projects\ScanNet\external\mLibExternal\include\CGAL\Segment_2.h	192	
    Error	C2244	'CGAL::Segment_2<R_>::min': unable to match function definition to an existing declaration	alignment	D:\Work\Projects\ScanNet\external\mLibExternal\include\CGAL\Segment_2.h	183	
    Error	C2244	'CGAL::Segment_2<R_>::operator []': unable to match function definition to an existing declaration	alignment	D:\Work\Projects\ScanNet\external\mLibExternal\include\CGAL\Segment_2.h	217	
    Error	C2244	'CGAL::Segment_2<R_>::point': unable to match function definition to an existing declaration	alignment	D:\Work\Projects\ScanNet\external\mLibExternal\include\CGAL\Segment_2.h	209	
    Error	C2244	'CGAL::Segment_2<R_>::vertex': unable to match function definition to an existing declaration	alignment	D:\Work\Projects\ScanNet\external\mLibExternal\include\CGAL\Segment_2.h	201	
    Error	C2244	'CGAL::Segment_3<R_>::max': unable to match function definition to an existing declaration	alignment	D:\Work\Projects\ScanNet\external\mLibExternal\include\CGAL\Segment_3.h	183	
    Error	C2244	'CGAL::Segment_3<R_>::min': unable to match function definition to an existing declaration	alignment	D:\Work\Projects\ScanNet\external\mLibExternal\include\CGAL\Segment_3.h	174	
    
    

    Microsoft Visual Studio 2019 with msbuild:

    Microsoft (R) Build Engine version 16.9.0+57a23d249 for .NET Framework
    Copyright (C) Microsoft Corporation. All rights reserved.

    16.9.0.11203
    

    C++ compiler version:

    Microsoft (R) C/C++ Optimizing Compiler Version 19.28.29913
    Copyright (C) Microsoft Corporation.  All rights reserved.
    

    All external libraries are from Dropbox.

  • Evaluation script 2d

    Hi, I tried to use the 2d semantic evaluation script on the validation set of the dataset in order to test my output before applying the method on the test set for the benchmark submission.

    However, since the ground truth does not have a valid label for each pixel (unknown value), I had to update the script to make it work.

    Is there any "good" practice for using the script on the validation set?

    Thanks for your answer.

  • Activation link does not work

    Hi! I want to register a new account, but the activation link does not work. I copied the link below into my browser, but it returns the following error:

    kaldir.vc.in.tum.de/scannet_benchmark//kaldir.vc.in.tum.de/scannet_benchmark//activate-account?token=289cd6b93929770bbf70f4874c4753b4

    Could you give some suggestions to resolve it?

    Thank you !

  • Align color and depth image

    There are issues #92 and #28 about this question. According to @CaptainTrunky and @angeladai, the images are aligned, so just resizing is OK.

    However, I noticed that the resolution of the color images is 1296 x 968, while the resolution of the depth images is 640 x 480. Resizing the former to the latter changes the width-to-height ratio. Won't there be a latent bug?

  • Where are the files that store ground truth?

    I downloaded some scenes from ScanNet v2 stored in .ply files but can't seem to find the corresponding text files that store the ground truth for semantic labeling. I mean the ones you need to specify for evaluate_semantic_label.py (https://github.com/ScanNet/ScanNet/blob/master/BenchmarkScripts/3d_evaluation/evaluate_semantic_label.py).

    Sorry if this is a stupid question, I just can't get my head around where to find them. Cheers, B
