ReUseX  0.0.1
3D Point Cloud Processing for Building Reuse
IDataset.hpp
// SPDX-FileCopyrightText: 2025 Povl Filip Sonne-Frederiksen
//
// SPDX-License-Identifier: GPL-3.0-or-later
#pragma once
#include "reusex/vision/IData.hpp"

#include <opencv2/core/mat.hpp>

#include <cstddef>
#include <filesystem>
#include <memory>
#include <span>
#include <utility>
#include <vector>

// Forward declaration of the core database class that wraps RTABMap's
// database functionality.
namespace ReUseX::io {
class RTABMapDatabase;
}

namespace ReUseX::vision {
/* Interface for datasets. A dataset is a collection of data samples, where each
 * sample consists of an image and a label.
 *
 * The dataset is stored in a SQLite database that is accessed through
 * io::RTABMapDatabase. Each sample is stored as a row in a table with the
 * following columns:
 *   - id:    an integer primary key that uniquely identifies the sample
 *   - image: a blob that contains the image data
 *   - label: an integer that represents the label of the sample
 *
 * get() retrieves a sample by its index and save() writes a batch of samples
 * to the database. The protected getImage() and saveImage() helpers handle
 * image I/O and are meant to be used by implementations of get() and save().
 *
 * The dataset is designed to be used with the IData interface, which
 * represents a single data sample. IData provides methods for accessing the
 * image and label of a sample, and for saving the sample to the database.
 *
 * The dataset is intended for machine learning applications, where it is used
 * to train and evaluate models on a collection of labeled images. */
class IDataset {
 public:
  /* A data sample paired with its index. The sample is held as a unique
   * pointer to an IData object; the index is the position of the sample in
   * the dataset. get() returns a Pair so that the caller has both the sample
   * and its index, and save() takes a span of Pairs so that a batch of
   * samples can be written in one call. */
  using Pair = std::pair<std::unique_ptr<IData>, std::size_t>;

  /* Constructs a new IDataset object with a shared database instance.
   *
   * This constructor allows multiple IDataset instances to share the same
   * database connection. The database is managed by shared_ptr, so it
   * remains open as long as any owner, including any IDataset instance,
   * still references it.
   *
   * @param database Shared pointer to the RTABMapDatabase instance.
   */
  explicit IDataset(std::shared_ptr<io::RTABMapDatabase> database);

  /* Constructs a new IDataset object by opening a database at the given path.
   *
   * This convenience constructor creates a new RTABMapDatabase instance
   * internally and stores it as a shared_ptr. The database connection is
   * closed when the last reference to it is destroyed.
   *
   * @param dbPath The path to the RTABMap database file.
   */
  explicit IDataset(std::filesystem::path dbPath);

  /* Virtual destructor to ensure proper cleanup of derived classes. */
  virtual ~IDataset() = default;

  /* Returns the number of samples in the dataset, which is equal to the size
   * of the ids_ vector. Callers use it to determine how many samples are
   * available and to iterate over the samples by index.
   *
   * @return The number of samples in the dataset.
   */
  std::size_t size() const;

  /* Retrieves a sample by its index. The index is used to look up the
   * corresponding sample ID in the ids_ vector; the image and label for that
   * sample are then read from the database. Pure virtual: derived classes
   * must implement it.
   *
   * @param index The index of the sample to retrieve.
   * @return A Pair containing a unique pointer to an IData object that
   * represents the sample, and the index of the sample in the dataset.
   */
  virtual Pair get(const std::size_t index) const = 0;

  /* Saves a batch of samples to the database. For each Pair in the span, the
   * image and label of the IData object are written to the database for the
   * sample at the Pair's index. Pure virtual: derived classes must implement
   * it.
   *
   * @param data A span of Pairs, where each Pair contains a unique pointer to
   * an IData object that represents a sample, and the index of the sample in
   * the dataset.
   * @return true if all samples were saved successfully, and false otherwise.
   */
  virtual bool save(const std::span<Pair> &data) = 0;

 protected:
  /* Retrieves the image data for a sample from the database. The index is
   * used to look up the corresponding sample ID in the ids_ vector, and the
   * image stored for that sample is returned as a cv::Mat. Used internally by
   * get() when constructing the IData object that represents a sample.
   *
   * @param index The index of the sample whose image data to retrieve.
   * @return A cv::Mat object containing the image data for the sample.
   */
  cv::Mat getImage(const std::size_t index) const;

  /* Saves the image data for a sample to the database. Used internally by
   * save() when writing a batch of samples.
   *
   * @param index The index of the sample whose image data to save.
   * @param image A cv::Mat object containing the image data to save for the
   * sample.
   * @return true if the image was saved successfully, and false otherwise.
   */
  bool saveImage(const std::size_t index, const cv::Mat &image);

  /* Access to the underlying database for subclasses.
   *
   * Subclasses can use this to access database functionality beyond the
   * basic getImage/saveImage interface if needed.
   *
   * @return Shared pointer to the RTABMapDatabase instance.
   */
  std::shared_ptr<io::RTABMapDatabase> getDatabase() const;

 private:
  /* Shared pointer to the RTABMap database. Multiple IDataset instances can
   * share the same database connection. The database connection is managed
   * via RAII and will be closed when the last reference is destroyed.
   */
  std::shared_ptr<io::RTABMapDatabase> db_;

  /* Cached list of node IDs in the dataset. This is populated once during
   * construction by querying the database. The IDs are used to map from
   * dataset indices (0, 1, 2, ...) to RTABMap node IDs.
   */
  std::vector<int> ids_;
};
} // namespace ReUseX::vision
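
The comments in the header describe a contract: node IDs are cached once at construction, a dataset index is mapped to a node ID, and samples travel as (unique pointer, index) pairs. The following is a minimal standalone sketch of that contract, not the ReUseX implementation. Every name in it (`Sample`, `FakeDatabase`, `SketchDataset`, `makeDemoDatabase`) is hypothetical. A `std::string` stands in for `cv::Mat`, an in-memory `std::map` stands in for `io::RTABMapDatabase`, and `save` takes a `std::vector` instead of a `std::span` so that the sketch also builds as C++17.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for IData: an image payload plus an integer label.
struct Sample {
  std::string image; // stands in for cv::Mat
  int label = 0;
};

// Hypothetical stand-in for io::RTABMapDatabase: rows keyed by node ID.
struct FakeDatabase {
  std::map<int, Sample> rows;
};

// Hypothetical dataset that mirrors the shape of IDataset without
// depending on OpenCV or RTABMap.
class SketchDataset {
public:
  using Pair = std::pair<std::unique_ptr<Sample>, std::size_t>;

  explicit SketchDataset(std::shared_ptr<FakeDatabase> db)
      : db_(std::move(db)) {
    if (!db_) throw std::invalid_argument("database must not be null");
    // Cache the node IDs once; index i in ids_ is dataset index i.
    for (const auto &row : db_->rows) ids_.push_back(row.first);
  }

  std::size_t size() const { return ids_.size(); }

  // Map the dataset index to a node ID, then copy that row into a new
  // Sample. ids_.at() throws std::out_of_range for an invalid index.
  Pair get(std::size_t index) const {
    const int id = ids_.at(index);
    return {std::make_unique<Sample>(db_->rows.at(id)), index};
  }

  // Write each sample back to the row its index maps to. Stops at the
  // first invalid entry, so earlier entries may already be written.
  bool save(const std::vector<Pair> &batch) {
    for (const auto &[sample, index] : batch) {
      if (!sample || index >= ids_.size()) return false;
      db_->rows[ids_[index]] = *sample;
    }
    return true;
  }

private:
  std::shared_ptr<FakeDatabase> db_;
  std::vector<int> ids_;
};

// Builds a small database whose node IDs are deliberately not 0, 1, 2.
inline std::shared_ptr<FakeDatabase> makeDemoDatabase() {
  auto db = std::make_shared<FakeDatabase>();
  db->rows[7] = Sample{"img7", 10};
  db->rows[12] = Sample{"img12", 20};
  db->rows[40] = Sample{"img40", 30};
  return db;
}
```

With the demo database, dataset index 1 refers to node ID 12 rather than to row 1, which is the reason the header keeps the `ids_` cache. In the real interface, `save` takes a `std::span<Pair>`; a non-const `std::vector<Pair>` converts to that span implicitly in C++20, so a batch built in a vector can be passed directly.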