SorterBot Cloud Documentation
Locator Module
locator.detectron class
This class uses Facebook Research’s Detectron2 platform to localize and classify objects on the provided image.
https://github.com/facebookresearch/detectron2
-
class
locator.detectron.
Detectron
(base_img_path, model_config, threshold=0.7) Bases:
object
This class sets up Detectron2 and provides a method to predict objects on a provided image.
- Parameters
base_img_path (str) – Root directory for saved images. Inside the provided folder, the image to be processed should be saved inside the original directory.
model_config (str) – Location of the Detectron config (.yaml) file inside the detectron2 repository’s config folder. The value should contain any subfolders and the extension as well.
threshold (float) – Object detection threshold for Detectron2.
-
predict
(session_id, image_name, img) This method predicts the locations of bounding boxes on the provided image.
- Parameters
session_id (str) – Datetime based unique identifier of the current session. It is generated by the Raspberry Pi and passed with the POST request.
image_name (str) – Name of the image saved in the images/original folder.
img (np.array) – Image to be processed as Numpy array.
- Returns
results – List of dicts containing the results of the object detection as absolute pixel values as well as the original image name, dimensions and the predicted class.
- Return type
list
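The shape of the returned list can be pictured with a small sketch. It assumes the result dicts carry the keys that insert_results lists (image_name, image_width, image_height, class, x1, y1, x2, y2). Whether predict emits exactly these keys is an assumption, and build_results with its inputs is a hypothetical helper that does not call Detectron2. The score filter only illustrates what a detection threshold does.

```python
def build_results(image_name, image_width, image_height,
                  boxes, classes, scores, threshold=0.7):
    """Hypothetical helper: turn raw detections into a list of result dicts."""
    results = []
    for (x1, y1, x2, y2), cls, score in zip(boxes, classes, scores):
        # Drop detections whose confidence is below the threshold.
        if score < threshold:
            continue
        results.append({
            "image_name": image_name,
            "image_width": image_width,
            "image_height": image_height,
            "class": cls,
            # Absolute pixel values, rounded to whole pixels.
            "x1": int(round(x1)),
            "y1": int(round(y1)),
            "x2": int(round(x2)),
            "y2": int(round(y2)),
        })
    return results


results = build_results(
    "1500.jpg", 640, 480,
    boxes=[(10.4, 20.0, 110.2, 220.6), (0.0, 0.0, 5.0, 5.0)],
    classes=[0, 1],
    scores=[0.9, 0.3],
)
```

Only the first detection survives here, because the second one scores below the default threshold of 0.7.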
Main Module
Main module responsible for orchestrating image processing.
-
class
main.
Main
(base_img_path) Bases:
object
Main class for controlling image processing. When instantiated, it loads all necessary environment variables and config files and instantiates all needed modules.
- Parameters
base_img_path (str) – Location where the downloaded images should be stored.
-
process_image
(arm_id, session_id, image_name, img_bytes) This method runs object recognition on the passed image and saves the result to the database.
- Parameters
arm_id (str) – Unique identifier of the robot arm.
session_id (str) – Datetime based unique identifier of the current session. It is generated by the Raspberry Pi and passed with the POST request.
image_name (str) – Name of the image to be processed. The image has to be uploaded to the s3 bucket. Value is passed with the POST request.
img_bytes (bytes) – Image to be processed as raw bytes.
- Returns
success – Boolean indicating if processing was successful.
- Return type
bool
-
save_and_upload_image
(img_path, s3_path, img_bytes, log_args) Takes an image as bytes and writes it to disk, then uploads it to s3. This functionality is not essential to generate the commands, so executing it on a separate thread can yield some speed-up.
- Parameters
img_path (str) – Path of the image where it should be saved to disk.
s3_path (str) – Path of the image where it should be uploaded to s3.
img_bytes (bytes) – The image as bytes.
log_args (dict) – Arguments to correctly place log entry on the control panel.
- Returns
success – Boolean representing if saving to disk and uploading to s3 was successful.
- Return type
bool
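Running the save and upload step on a separate thread could look like the following minimal sketch. The function save_and_upload is a hypothetical stand-in that only records its arguments instead of touching the disk or s3, and the path and bytes are made up for illustration.

```python
import threading


def save_and_upload(img_path, img_bytes, outcome):
    """Stand-in for the real method: only records what it was asked to store."""
    outcome["saved"] = (img_path, len(img_bytes))


outcome = {}
worker = threading.Thread(
    target=save_and_upload,
    args=("original/1500.jpg", b"\xff\xd8\xff", outcome),
)
worker.start()
# Object detection and command generation could proceed here in the meantime.
worker.join()  # Wait for the background work before relying on its result.
```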
-
stitch_images
(arm_id, session_id, stitch_type, images_with_objects) Stitches together overlapping images to provide an overview on the UI about the area of interest before and after the arm’s operation.
- Parameters
arm_id (str) – Unique identifier of the robot arm.
session_id (str) – Datetime based unique identifier of the current session. It is generated by the Raspberry Pi and passed with the POST request.
stitch_type (str) – Type of stitch which will be prepended to the file name. Possible values: original and after.
images_with_objects (list of dicts) – List of dicts, each dictionary containing ‘image_name’ and ‘objects’. There is one entry for each unique image in a session.
-
vectorize_session_images
(arm_constants, session_id, should_stitch=True) This method is to be executed after the last image of a session is processed. It gets a list of unique images in the current session, retrieves all the objects that belong to each image and runs the vectorizer on them.
- Parameters
arm_constants (dict) – Constants specific to the arm, defined in the arm’s config file.
session_id (str) – Unique identifier of the session.
should_stitch (bool) – Boolean value that enables or disables stitching a panorama image.
- Returns
pairings (list) – List of dicts containing pairs of objects and clusters.
session_id (str) – Datetime based unique identifier of the current session. It is generated by the Raspberry Pi and passed with the POST request.
stitching_process (process) – Process which is used for stitching. It should be joined after the commands are sent back to the Raspberry Pi, so that the Pi does not wait for it unnecessarily and the process does not become a zombie.
Server Module
WebSockets server that listens to messages from the Raspberry Pis and calls the appropriate functions.
-
class
server.
WebSockets
Bases:
object
-
async
listen
(websocket, path) Function that listens to new WebSocket messages. It can handle bytes and JSON messages. Supported message types: recv_img_proc, recv_img_after, get_commands_of_session, stitch_after_image.
- Parameters
websocket (WebSocket) – WebSocket instance provided by websockets.serve
path (str) – Unused.
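How a handler might tell binary frames from JSON messages and check the message type is sketched below. The function classify_message and the "type" key in the JSON payload are assumptions made for illustration. Only the four message type names come from the description above.

```python
import json

SUPPORTED_TYPES = {
    "recv_img_proc",
    "recv_img_after",
    "get_commands_of_session",
    "stitch_after_image",
}


def classify_message(message):
    """Return ("bytes", None) for binary frames, else (message type, payload)."""
    if isinstance(message, (bytes, bytearray)):
        return "bytes", None
    payload = json.loads(message)
    # The "type" key is a hypothetical name for the message type field.
    msg_type = payload.get("type")
    if msg_type not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported message type: {msg_type!r}")
    return msg_type, payload


kind, payload = classify_message('{"type": "recv_img_proc", "session_id": "s1"}')
```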
Utils Module
utils.S3 class
This class provides methods to interact with AWS s3 storage bucket.
-
class
utils.S3.
S3
(base_img_path, logger_instance) Bases:
object
This class sets up the s3 client (boto3).
- Parameters
base_img_path (str) – Absolute path where the downloaded images should be stored. Inside this folder, appropriate subfolders will be automatically created for the original images (named “original”) and the cropped images (named “cropped”).
logger_instance (logger) – Logger instance passed from main.
-
download_image
(arm_id, session_id, image_name) This method downloads images from s3. To avoid unnecessary downloads, images are only downloaded if they are missing or corrupted.
- Parameters
arm_id (str) – Unique identifier of the arm.
session_id (str) – Datetime based unique identifier of the current session.
image_name (str) – Name of the image in the s3 bucket to be downloaded.
- Returns
img_path – Absolute path of the downloaded image.
- Return type
str
-
upload_file
(bucket_name, file_path, s3_path) Uploads a file to s3.
- Parameters
bucket_name (str) – Name of the bucket to upload to.
file_path (str) – Path of the image to be uploaded.
s3_path (str) – Path (including filename) where the file should be saved in s3.
utils.coord_conversion class
Functions to convert bounding box coordinates from relative to absolute.
-
utils.coord_conversion.
filter_duplicates
(objects, threshold=150) Filters out the bounding boxes that belong to the same object, but showed up on a different image.
- Parameters
objects (list) – List of dicts, containing the absolute coordinates of the objects.
threshold (int) – Distance within which objects are considered the same.
- Returns
filtered_objs – List of the absolute coordinates of the unique objects.
- Return type
list
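One way to picture distance-based duplicate filtering is the sketch below. It assumes each object dict exposes its absolute position under hypothetical "x" and "y" keys, measures Euclidean distance, and keeps the first occurrence. The actual function may use different keys or a different distance measure.

```python
import math


def filter_duplicates_sketch(objects, threshold=150):
    """Keep an object only if no already kept object lies within threshold."""
    filtered = []
    for obj in objects:
        is_duplicate = any(
            math.hypot(obj["x"] - kept["x"], obj["y"] - kept["y"]) < threshold
            for kept in filtered
        )
        if not is_duplicate:
            filtered.append(obj)
    return filtered


unique = filter_duplicates_sketch([
    {"x": 0, "y": 0},
    {"x": 100, "y": 0},    # 100 units from the first object: treated as the same
    {"x": 400, "y": 300},  # 500 units from the first object: kept
])
```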
-
utils.coord_conversion.
object_to_polar
(arm_constants, image_name, obj) Converts bounding box coordinates relative to the image frame to absolute polar coordinates, relative to the robotic arm.
- Parameters
arm_constants (dict) – Dictionary containing the arm’s constants that are saved in the arm’s config file and sent with the request.
image_name (str) – Name of the image, which also corresponds to the robot arm’s rotation, expressed in pulse width.
obj (dict) – Dictionary containing the relative coordinates of the bounding box.
- Returns
abs_coords – Dictionary containing the computed absolute coordinates, the rotation as pulse width, the object ID and the bounding box dimensions.
- Return type
dict
utils.logger class
A Python logger to format the logs and send them to the Control Panel using an HTTPHandler.
utils.postgres class
Utility class to provide methods to interact with the PostgreSQL database.
-
class
utils.postgres.
Postgres
Bases:
object
Class to provide methods to interact with the PostgreSQL database. Uses connection pooling to avoid opening and closing connections every time a request comes in. It uses a single database, which is created when starting the service if it does not exist already. Each arm’s data is saved to a separate schema, while each session gets its own table.
-
create_table
(*args, **kwargs) This method creates a new table with a given name in the database if it does not exist yet. A separate table should be created for each session.
- Parameters
cursor (psycopg2.cursor) – Cursor to be used for SQL execution. Provided by add_connection decorator.
schema_name (str) – Name of the schema to be created. Corresponds to arm_id.
table_name (str) – Name of the table to be created. Corresponds to session_id.
- Returns
table_created – True if table was created, false if it already existed.
- Return type
bool
-
get_objects_of_image
(*args, **kwargs) This method retrieves the recognized objects from the database belonging to the provided image.
- Parameters
cursor (psycopg2.cursor) – Cursor to be used for SQL execution. Provided by add_connection decorator.
schema_name (str) – Name of the schema to be used. Corresponds to arm_id.
table_name (str) – Name of the table to be used. Corresponds to session_id.
image_name (str) – Name of the image of which the objects should be retrieved.
- Returns
objects_of_image – List of dicts containing the following keys: id, type, bbox_dims. bbox_dims contains the relative coordinates of the top left and bottom right corners of the bounding box.
- Return type
list
-
get_unique_images
(*args, **kwargs) This method retrieves a list of unique images in the current session.
- Parameters
cursor (psycopg2.cursor) – Cursor to be used for SQL execution. Provided by add_connection decorator.
schema_name (str) – Name of the schema to be used. Corresponds to arm_id.
table_name (str) – Name of the table to be used. Corresponds to session_id.
- Returns
unique_images – List of strings containing unique image names.
- Return type
list
-
insert_results
(*args, **kwargs) This method inserts the result from the object recognition to the database.
- Parameters
cursor (psycopg2.cursor) – Cursor to be used for SQL execution. Provided by add_connection decorator.
schema_name (str) – Name of the schema to be used. Corresponds to arm_id.
table_name (str) – Name of the table to be used. Corresponds to session_id.
results (list) – List of dicts containing the following keys: image_name, image_width, image_height, class, x1, y1, x2, y2.
-
-
utils.postgres.
add_connection
(func) Decorator function to retrieve a connection from the connection pool, pass it to the decorated function, then put it back to the pool.
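The borrow, use and return pattern of such a decorator can be sketched as follows. This is not the module's actual code: add_connection_sketch takes the pool as an explicit argument, and FakePool and FakeConnection are hypothetical stand-ins that mimic the getconn and putconn methods of a psycopg2 connection pool so the sketch runs without a database. Committing after the call is likewise an assumption.

```python
import functools


class FakeConnection:
    """Hypothetical stand-in for a database connection."""

    def __init__(self):
        self.committed = False

    def cursor(self):
        return "cursor"

    def commit(self):
        self.committed = True


class FakePool:
    """Hypothetical stand-in exposing getconn/putconn like a connection pool."""

    def __init__(self):
        self.connection = FakeConnection()
        self.checked_out = 0

    def getconn(self):
        self.checked_out += 1
        return self.connection

    def putconn(self, connection):
        self.checked_out -= 1


def add_connection_sketch(pool):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            connection = pool.getconn()
            try:
                # Hand a cursor to the decorated function as first argument.
                result = func(connection.cursor(), *args, **kwargs)
                connection.commit()
                return result
            finally:
                # Always return the connection, even if func raised.
                pool.putconn(connection)
        return wrapper
    return decorator


pool = FakePool()


@add_connection_sketch(pool)
def get_unique_images(cursor, schema_name, table_name):
    return [cursor, schema_name, table_name]


rows = get_unique_images("arm_1", "session_1")
```

Callers pass only schema_name and table_name. The cursor is injected by the decorator, which matches the parameter descriptions above.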
Vectorizer Module
vectorizer.preprocessor class
The PreProcessor module is responsible for fetching images from the AWS s3 bucket and cropping the recognized objects from the original images.
-
class
vectorizer.preprocessor.
PreProcessor
(base_img_path) Bases:
object
This class provides methods to crop images.
- Parameters
base_img_path (str) – Absolute path where the downloaded images are stored. Inside this folder, an appropriate subfolder will be automatically created for the cropped images (named “cropped”).
-
crop_all_objects
(session_id, image_name, objects) This function crops all recognized items from an original image. It has been separated from crop_object for performance reasons (Image.open() is executed only once per image, not once per object).
- Parameters
session_id (str) – Unique identifier of the current session. It is generated by the Raspberry Pi and passed with the POST request.
image_name (str) – Name of the image on disk to be loaded.
objects (list) – List of dicts containing information about the recognized items. bbox_dims will be used here for cropping.
-
crop_object
(img_folder, img, id, bbox_dims) This method should be called on each item (not containers) recognized on an image. It will crop the image around the provided bounding box coordinates and save the portion of the image inside the bounding box as a separate file. The cropped image will be placed in a folder corresponding to the name of the original image. The name of the new (cropped) image will contain id.
- Parameters
img_folder (str) – Absolute path of the folder where the current image’s cropped pictures should be saved.
img (Image) – Already opened image as PIL Image object, returned by Image.open().
id (int) – Object id (unique within an original image) of the recognized object. The cropped image will be saved with a name like the following: item_<id>.ext
bbox_dims (dict) – Bounding box dimensions to be used for cropping. The dict should contain the following keys: x1 and y1 representing the coordinates of the top left, while x2 and y2 representing the coordinates of the bottom right corner of the bounding box. All values are floats between 0 and 1, representing relative distances.
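Since bbox_dims holds relative values between 0 and 1, cropping requires converting them to pixels first. The sketch below shows one way to do that. The helper relative_to_pixel_box is hypothetical. The order of the returned tuple (left, upper, right, lower) follows the box argument accepted by PIL’s Image.crop().

```python
def relative_to_pixel_box(bbox_dims, image_width, image_height):
    """Convert relative bbox_dims to a (left, upper, right, lower) pixel box."""
    left = int(round(bbox_dims["x1"] * image_width))
    upper = int(round(bbox_dims["y1"] * image_height))
    right = int(round(bbox_dims["x2"] * image_width))
    lower = int(round(bbox_dims["y2"] * image_height))
    return (left, upper, right, lower)


box = relative_to_pixel_box(
    {"x1": 0.25, "y1": 0.5, "x2": 0.75, "y2": 1.0},
    image_width=640,
    image_height=480,
)
```

With a PIL image at hand, the resulting tuple could then be passed as img.crop(box).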
-
run
(session_id, images) This method coordinates the preprocessing. It loops through the provided list of images and crops all recognized objects from the image.
- Parameters
session_id (str) – Unique identifier of the current session. It is generated by the Raspberry Pi and passed with the POST request.
images (list) – List of dicts containing image_name and objects keys. The objects value contains the bounding boxes to be cropped.
vectorizer.vectorizer class
Vectorizer module is intended to load files from disk and calculate feature vectors from them. These feature vectors are used later for clustering. Vectorizer is built on PyTorch/Torchvision. Any model from torchvision.models can be used for vectorization.
-
class
vectorizer.vectorizer.
ImageFolderWithPaths
(root, transform=None, target_transform=None, loader=<function default_loader>, is_valid_file=None) Bases:
torchvision.datasets.folder.ImageFolder
Extends torchvision.datasets.ImageFolder to add image paths to the dataset. Implements an override for the __getitem__ method, which is called when the dataloader accesses an item from the dataset.
-
class
vectorizer.vectorizer.
Vectorizer
(base_img_path, model_name, input_dimensions, output_layer='avgpool', output_length=512, stats=None, batch_size=1024, num_workers=4) Bases:
object
This class initiates the chosen neural network, loads the specified images, and computes feature vectors from them. The computation is executed in batches to increase efficiency.
- Parameters
base_img_path (str) – Absolute path where the downloaded images should be stored. Inside this folder, appropriate subfolders will be automatically created for the original images (named “original”) and the cropped images (named “cropped”).
model_name (str) – Name of the neural network architecture to be used for vectorization. Has to be one of the available models of torchvision.models.
input_dimensions (tuple) – Dimensions of the input tensor required by the network architecture. The processed images will be resized to these dimensions. Networks usually require square-shaped images.
output_layer (str, optional) – The layer of interest’s name in the neural net. The specified layer’s outputs are the vectors, which will be copied from the network and returned as results. To get a model summary with names, simply print the model, like:
print(model)
output_length (int, optional) – The length of the output vector.
stats (dict, optional) – Dict with 2 keys: mean and std. The corresponding values are both lists of 3 elements representing the 3 color channels of images. Specify here the means and standard deviations of the training set on which the used model was trained. Defaults to ImageNet stats.
batch_size (int, optional) – Specifies how many images a batch should contain. Use a higher number, preferably a power of 2, for faster processing. Keep in mind that one batch has to fit in memory at once.
num_workers (int, optional) – Used by PyTorch’s DataLoader to determine how many subprocesses to use for data loading. Use 0 to load everything on the main process, use a higher number for parallel loading.
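The stats argument can be pictured with the following sketch, which shows the expected dict shape using the commonly published ImageNet channel means and standard deviations, and applies them to a single pixel. The helper normalize_pixel is hypothetical and only illustrates what per-channel normalization does. It is not part of the Vectorizer class.

```python
# Commonly used ImageNet statistics, one value per RGB channel.
IMAGENET_STATS = {
    "mean": [0.485, 0.456, 0.406],
    "std": [0.229, 0.224, 0.225],
}


def normalize_pixel(rgb, stats=IMAGENET_STATS):
    """Normalize one RGB pixel whose channel values are scaled to [0, 1]."""
    return [
        (value - mean) / std
        for value, mean, std in zip(rgb, stats["mean"], stats["std"])
    ]


centered = normalize_pixel([0.485, 0.456, 0.406])
```

A pixel equal to the channel means maps to zero in every channel, which is the point of subtracting the training set mean.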
-
compute_vectors
() This function runs the inference on the loaded pictures using the previously selected model.
- Returns
filenames (list) – List of filenames of cropped images. Also contains the name of the original image and corresponds to the location of the image in the ‘cropped’ folder.
vectors (list) – List of resulting vectors.
-
load_data
(data_path) Loads images from disk.
This function first creates a dataset using ImageFolderWithPaths, which is an extension of PyTorch’s ImageFolder, then it creates a dataloader.
- Parameters
data_path (str) – Specifies the location of the images to be loaded. Given the way ImageFolder works, the images have to be in another folder inside the specified folder. The name of that folder would be the label for training, but in case of inference, it’s irrelevant.
- Returns
found_images – Boolean value representing if any images were found to be vectorized.
- Return type
bool
-
run
(session_id, unique_images, objects, n_containers) This method coordinates the process of vectorization. First, it runs the preprocessor to crop the bounding boxes from the original images, then runs the vectorizer on the cropped images and finally clusters the resulting vectors.
- Parameters
session_id (str) – Datetime based unique identifier of the current session. It is generated by the Raspberry Pi and passed with the POST request.
objects (list) – List of dicts containing information describing the recognized objects. image_name and bbox_dims are needed here for cropping the items.
n_containers (int) – Number of recognized containers on the current image. Will be used in K-Means as the number of clusters to be created.
- Returns
pairings – List of dicts containing filename and cluster keys. The filename includes the original image’s name and the recognized object’s id, the cluster is the index of the cluster to which the particular object belongs.
- Return type
list
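A consumer of the returned pairings might group the filenames by cluster index, as in the sketch below. The filename and cluster keys come from the description above. The helper group_by_cluster and the sample filenames are made up for illustration.

```python
from collections import defaultdict


def group_by_cluster(pairings):
    """Map each cluster index to the filenames assigned to it."""
    clusters = defaultdict(list)
    for pairing in pairings:
        clusters[pairing["cluster"]].append(pairing["filename"])
    return dict(clusters)


grouped = group_by_cluster([
    {"filename": "1500/item_1.jpg", "cluster": 0},
    {"filename": "1500/item_2.jpg", "cluster": 1},
    {"filename": "1620/item_1.jpg", "cluster": 0},
])
```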