SorterBot Cloud Documentation

Locator Module

locator.detectron class

This class uses Facebook Research’s Detectron2 platform to localize and classify objects on the provided image.

https://github.com/facebookresearch/detectron2

class locator.detectron.Detectron(base_img_path, model_config, threshold=0.7)

Bases: object

This class sets up Detectron2 and provides a method to predict objects on a provided image.

Parameters
  • base_img_path (str) – Root directory for saved images. Inside the provided folder, the image to be processed should be saved inside the original directory.

  • model_config (str) – Location of the Detectron config (.yaml) file inside the detectron2 repository’s config folder. The value should contain any subfolders and the extension as well.

  • threshold (float) – Object detection threshold for Detectron2.

predict(session_id, image_name, img)

This method predicts the locations of bounding boxes on the provided image.

Parameters
  • session_id (str) – Datetime based unique identifier of the current session. It is generated by the Raspberry Pi and passed with the POST request.

  • image_name (str) – Name of the image saved in the images/original folder.

  • img (np.array) – Image to be processed as Numpy array.

Returns

results – List of dicts containing the results of the object detection as absolute pixel values as well as the original image name, dimensions and the predicted class.

Return type

list
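
The exact structure of each result dict is easiest to see in code. The sketch below is a hypothetical reconstruction: the field names follow the keys listed under insert_results (image_name, image_width, image_height, class, x1..y2), and boxes_to_results itself is an illustrative helper, not part of the module.

```python
# Illustrative helper (not part of the module): build the list-of-dicts
# result format described above from raw pixel boxes and class labels.
def boxes_to_results(image_name, width, height, boxes, classes):
    results = []
    for (x1, y1, x2, y2), cls in zip(boxes, classes):
        results.append({
            "image_name": image_name,
            "image_width": width,
            "image_height": height,
            "class": cls,
            "x1": x1, "y1": y1, "x2": x2, "y2": y2,
        })
    return results

results = boxes_to_results("1000.jpg", 640, 480, [(10, 20, 50, 60)], ["item"])
```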

Main Module

Main module responsible for orchestrating image processing.

class main.Main(base_img_path)

Bases: object

Main class for controlling image processing. When instantiated, it loads all necessary environment variables and config files and instantiates all needed modules.

Parameters

base_img_path (str) – Location where the downloaded images should be stored.

process_image(arm_id, session_id, image_name, img_bytes)

This method runs object recognition on the passed image and saves the result to the database.

Parameters
  • arm_id (str) – Unique identifier of the robot arm.

  • session_id (str) – Datetime based unique identifier of the current session. It is generated by the Raspberry Pi and passed with the POST request.

  • image_name (str) – Name of the image to be processed. The image has to be uploaded to the s3 bucket. Value is passed with the POST request.

  • img_bytes (bytes) – Image to be processed as raw bytes.

Returns

success – Boolean indicating if processing was successful.

Return type

bool

save_and_upload_image(img_path, s3_path, img_bytes, log_args)

Takes an image as bytes, writes it to disk, then uploads it to s3. Since this functionality is not essential for generating the commands, executing it on a separate thread can yield some speed-up.

Parameters
  • img_path (str) – Path of the image where it should be saved to disk.

  • s3_path (str) – Path of the image where it should be uploaded to s3.

  • img_bytes (bytes) – The image as bytes.

  • log_args (dict) – Arguments to correctly place log entry on the control panel.

Returns

success – Boolean representing if saving to disk and uploading to s3 was successful.

Return type

bool
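
The threading idea mentioned above can be sketched as follows. save_and_upload and the upload callable are stand-ins (the real method talks to s3); the point is only that the write-and-upload work can run on a worker thread while the main thread continues building commands.

```python
import os
import tempfile
import threading

def save_and_upload(img_path, img_bytes, upload):
    """Write the image to disk, then hand the path to an uploader callable.
    `upload` is a stand-in for the real s3 upload."""
    with open(img_path, "wb") as f:
        f.write(img_bytes)
    upload(img_path)
    return True

uploaded = []
tmp = os.path.join(tempfile.mkdtemp(), "img.jpg")
# Running on a worker thread keeps the main thread free to build commands.
t = threading.Thread(target=save_and_upload, args=(tmp, b"\xff\xd8", uploaded.append))
t.start()
t.join()
```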

stitch_images(arm_id, session_id, stitch_type, images_with_objects)

Stitches together overlapping images to provide an overview on the UI about the area of interest before and after the arm’s operation.

Parameters
  • arm_id (str) – Unique identifier of the robot arm.

  • session_id (str) – Datetime based unique identifier of the current session. It is generated by the Raspberry Pi and passed with the POST request.

  • stitch_type (str) – Type of stitch which will be prepended to the file name. Possible values: original and after.

  • images_with_objects (list of dicts) – List of dicts, each dictionary containing ‘image_name’ and ‘objects’. There is one entry for each unique image in a session.

vectorize_session_images(arm_constants, session_id, should_stitch=True)

This method is to be executed after the last image of a session is processed. It gets a list of unique images in the current session, retrieves all the objects that belong to each image and runs the vectorizer on them.

Parameters
  • arm_constants (dict) – Constants specific to the arm, defined in the arm’s config file.

  • session_id (str) – Unique identifier of the session.

  • should_stitch (bool) – Boolean value that enables or disables stitching a panorama image.

Returns

  • pairings (list) – List of dicts containing pairs of objects and clusters.

  • session_id (str) – Datetime based unique identifier of the current session. It is generated by the Raspberry Pi and passed with the POST request.

  • stitching_process (process) – Process used for stitching. It should be joined after the commands are sent back to the Raspberry Pi; this way the Pi does not wait for it unnecessarily and the process does not become a zombie.
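
A minimal sketch of the join-after-send pattern described for stitching_process, using Python’s multiprocessing; the stitch worker here is a trivial stand-in for the real stitching code.

```python
import multiprocessing as mp

def stitch_worker(queue):
    # Stand-in for the real stitching work.
    queue.put("stitched")

queue = mp.Queue()
proc = mp.Process(target=stitch_worker, args=(queue,))
proc.start()
# ...commands would be sent back to the Raspberry Pi here, without waiting...
result = queue.get()   # blocks until the worker produces its output
proc.join()            # reap the process so it does not become a zombie
```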

Server Module

WebSockets server that listens to messages from the Raspberry Pis and calls the appropriate functions.

class server.WebSockets

Bases: object

async listen(websocket, path)

Function that listens to new WebSocket messages. It can handle bytes and JSON messages. Supported message types: recv_img_proc, recv_img_after, get_commands_of_session, stitch_after_image.

Parameters
  • websocket (WebSocket) – WebSocket instance provided by websockets.serve

  • path (str) – Unused.
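
One way to handle the listed message types is a dispatch table. The sketch below is an assumption about the message envelope (a JSON object with type and payload keys) and the handler names, not the server’s actual implementation.

```python
import json

# Hypothetical dispatch table mirroring the message types listed above.
def make_dispatcher(handlers):
    def dispatch(raw):
        msg = json.loads(raw)
        handler = handlers.get(msg["type"])
        if handler is None:
            raise ValueError(f"unsupported message type: {msg['type']}")
        return handler(msg.get("payload"))
    return dispatch

dispatch = make_dispatcher({
    "recv_img_proc": lambda p: ("process", p),
    "recv_img_after": lambda p: ("after", p),
    "get_commands_of_session": lambda p: ("commands", p),
    "stitch_after_image": lambda p: ("stitch", p),
})
result = dispatch(json.dumps({"type": "get_commands_of_session", "payload": "sess-1"}))
```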

Utils Module

utils.S3 class

This class provides methods to interact with the AWS s3 storage bucket.

class utils.S3.S3(base_img_path, logger_instance)

Bases: object

This class sets up the s3 client (boto3).

Parameters
  • base_img_path (str) – Absolute path where the downloaded images should be stored. Inside this folder, appropriate subfolders will be automatically created for the original images (named “original”) and the cropped images (named “cropped”).

  • logger_instance (logger) – Logger instance passed from main.

download_image(arm_id, session_id, image_name)

This method downloads images from s3. To avoid unnecessary downloads, images are only downloaded if they are missing or corrupted.

Parameters
  • arm_id (str) – Unique identifier of the arm.

  • session_id (str) – Datetime based unique identifier of the current session.

  • image_name (str) – Name of the image in the s3 bucket to be downloaded.

Returns

img_path – Absolute path of the downloaded image.

Return type

str
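
The skip-if-cached behaviour can be sketched as below. The exact corruption check is an assumption (here: a missing or empty file); the real implementation may verify the image more thoroughly.

```python
import os
import tempfile

def needs_download(img_path):
    """Cache-check sketch (an assumption, not the exact implementation):
    re-download when the file is missing or empty/truncated on disk."""
    return not os.path.isfile(img_path) or os.path.getsize(img_path) == 0

img_path = os.path.join(tempfile.mkdtemp(), "1000.jpg")
missing = needs_download(img_path)        # True: not on disk yet
with open(img_path, "wb") as f:
    f.write(b"\xff\xd8\xff")              # pretend JPEG bytes
cached = needs_download(img_path)         # False: present and non-empty
```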

upload_file(bucket_name, file_path, s3_path)

Uploads a file to s3.

Parameters
  • bucket_name (str) – Name of the bucket to upload to.

  • file_path (str) – Path of the image to be uploaded.

  • s3_path (str) – Path (including filename) where the file should be saved in s3.

utils.coord_conversion class

Functions to convert bounding box coordinates from relative to absolute.

utils.coord_conversion.filter_duplicates(objects, threshold=150)

Filters out the bounding boxes that belong to the same object, but showed up on a different image.

Parameters
  • objects (list) – List of dicts, containing the absolute coordinates of the objects.

  • threshold (int) – Distance within which objects are considered the same.

Returns

filtered_objs – List of the absolute coordinates of the unique objects.

Return type

list
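
A simplified version of the de-duplication logic, assuming each object carries its centre as x and y keys (the real dicts hold full bounding-box coordinates): keep an object only if its centre is farther than threshold from every object kept so far.

```python
import math

def filter_duplicates(objects, threshold=150):
    """Keep only one object per physical item: drop any object whose
    centre is within `threshold` of an already-kept object.
    Coordinate keys (x, y) are simplified for this sketch."""
    filtered = []
    for obj in objects:
        if all(math.dist((obj["x"], obj["y"]), (kept["x"], kept["y"])) > threshold
               for kept in filtered):
            filtered.append(obj)
    return filtered

objs = [{"x": 0, "y": 0}, {"x": 30, "y": 40}, {"x": 500, "y": 0}]
unique = filter_duplicates(objs)   # the second object is within 50 of the first
```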

utils.coord_conversion.object_to_polar(arm_constants, image_name, obj)

Converts bounding box coordinates relative to the image frame to absolute polar coordinates, relative to the robotic arm.

Parameters
  • arm_constants (dict) – Dictionary containing the arm’s constants that are saved in the arm’s config file and sent with the request.

  • image_name (str) – Name of the image, which also corresponds to the robot arm’s rotation, expressed in pulse width.

  • obj (dict) – Dictionary containing the relative coordinates of the bounding box.

Returns

abs_coords – Dictionary containing the computed absolute coordinates, the rotation as pulse width, the object ID and the bounding box dimensions.

Return type

dict
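
A toy sketch of the image-frame-to-polar idea. The real conversion depends on the arm’s constants (camera geometry, pulse-width calibration), so the px_to_mm and pw_to_rad factors below are made-up placeholders, not values from the config.

```python
import math

def bbox_center_to_polar(bbox_dims, pulse_width, px_to_mm=100.0, pw_to_rad=0.001):
    """Toy conversion: take the bounding-box centre in relative image
    coordinates, scale it to a radial distance, and combine it with the
    arm rotation (pulse width). All scale factors are placeholders."""
    cx = (bbox_dims["x1"] + bbox_dims["x2"]) / 2   # bbox centre, relative coords
    cy = (bbox_dims["y1"] + bbox_dims["y2"]) / 2
    r = math.hypot(cx, cy) * px_to_mm              # radial distance
    theta = pulse_width * pw_to_rad + math.atan2(cy, cx)
    return {"r": r, "theta": theta, "pulse_width": pulse_width}

coords = bbox_center_to_polar({"x1": 0.2, "y1": 0.0, "x2": 0.4, "y2": 0.0}, 1500)
```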

utils.logger class

A Python logger to format the logs and send them to the Control Panel using an HTTPHandler.

utils.postgres class

Utility class to provide methods to interact with the PostgreSQL database.

class utils.postgres.Postgres

Bases: object

Class providing methods to interact with the PostgreSQL database. Uses connection pooling to avoid opening and closing connections every time a request comes in. It uses a single database, which is created at service startup if it does not already exist. Each arm’s data is saved to a separate schema, while each session gets its own table.

create_table(*args, **kwargs)

This method creates a new table with a given name in the database if it does not exist yet. A separate table should be created for each session.

Parameters
  • cursor (psycopg2.cursor) – Cursor to be used for SQL execution. Provided by add_connection decorator.

  • schema_name (str) – Name of the schema to be created. Corresponds to arm_id.

  • table_name (str) – Name of the table to be created. Corresponds to session_id.

Returns

table_created – True if the table was created, False if it already existed.

Return type

bool

get_objects_of_image(*args, **kwargs)

This method retrieves the recognized objects from the database belonging to the provided image.

Parameters
  • cursor (psycopg2.cursor) – Cursor to be used for SQL execution. Provided by add_connection decorator.

  • schema_name (str) – Name of the schema to be used. Corresponds to arm_id.

  • table_name (str) – Name of the table to be used. Corresponds to session_id.

  • image_name (str) – Name of the image of which the objects should be retrieved.

Returns

objects_of_image – List of dicts containing the following keys: id, type, bbox_dims. bbox_dims contains the relative coordinates of the top left and bottom right corners of the bounding box.

Return type

list

get_unique_images(*args, **kwargs)

This method retrieves a list of unique images in the current session.

Parameters
  • cursor (psycopg2.cursor) – Cursor to be used for SQL execution. Provided by add_connection decorator.

  • schema_name (str) – Name of the schema to be used. Corresponds to arm_id.

  • table_name (str) – Name of the table to be used. Corresponds to session_id.

Returns

unique_images – List of strings containing unique image names.

Return type

list

insert_results(*args, **kwargs)

This method inserts the results from the object recognition into the database.

Parameters
  • cursor (psycopg2.cursor) – Cursor to be used for SQL execution. Provided by add_connection decorator.

  • schema_name (str) – Name of the schema to be used. Corresponds to arm_id.

  • table_name (str) – Name of the table to be used. Corresponds to session_id.

  • results (list) – List of dicts containing the following keys: image_name, image_width, image_height, class, x1, y1, x2, y2.

utils.postgres.add_connection(func)

Decorator function to retrieve a connection from the connection pool, pass it to the decorated function, then put it back to the pool.
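
The decorator pattern can be sketched as below, with a plain queue standing in for psycopg2’s connection pool; the real decorator passes a cursor rather than the raw connection, and the names here are illustrative.

```python
import functools
from queue import Queue

def add_connection(pool):
    """Sketch of the pattern described above: borrow a connection from
    the pool, inject it into the wrapped function, and always return it
    to the pool, even if the function raises."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            conn = pool.get()
            try:
                return func(conn, *args, **kwargs)
            finally:
                pool.put(conn)
        return wrapper
    return decorator

pool = Queue()
pool.put("conn-1")   # stand-in for a real database connection

@add_connection(pool)
def get_unique_images(conn, table_name):
    return (conn, table_name)

result = get_unique_images("session_42")
```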

Vectorizer Module

vectorizer.preprocessor class

The PreProcessor module is responsible for fetching images from the AWS s3 bucket and cropping the recognized objects from the original images.

class vectorizer.preprocessor.PreProcessor(base_img_path)

Bases: object

This class provides methods to crop images.

Parameters

base_img_path (str) – Absolute path where the downloaded images are stored. Inside this folder, an appropriate subfolder will be automatically created for the cropped images (named “cropped”).

crop_all_objects(session_id, image_name, objects)

This function crops all recognized items from an original image. It has been separated from crop_object for performance reasons (Image.open() is executed only once per image, not once per object).

Parameters
  • session_id (str) – Unique identifier of the current session. It is generated by the Raspberry Pi and passed with the POST request.

  • image_name (str) – Name of the image on disk to be loaded.

  • objects (list) – List of dicts containing information about the recognized items. bbox_dims will be used here for cropping.

crop_object(img_folder, img, id, bbox_dims)

This method should be called on each item (but not containers) recognized on an image. It crops the image around the provided bounding box coordinates and saves the portion of the image inside the bounding box as a separate file. The cropped image is placed in a folder corresponding to the name of the original image, and the name of the new (cropped) image contains the object’s id.

Parameters
  • img_folder (str) – Absolute path of the folder where the current image’s cropped pictures should be saved.

  • img (Image) – Already opened image as PIL Image object, returned by Image.open().

  • id (int) – Object id (unique within an original image) of the recognized object. The cropped image will be saved with a name like the following: item_<id>.ext

  • bbox_dims (dict) – Bounding box dimensions to be used for cropping. The dict should contain the following keys: x1 and y1 representing the coordinates of the top left, while x2 and y2 representing the coordinates of the bottom right corner of the bounding box. All values are floats between 0 and 1, representing relative distances.
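
Since bbox_dims holds relative floats, cropping requires converting them to absolute pixels first. Below is a sketch of that conversion (relative_to_pixel_box is an illustrative helper, not the module’s API), producing the (left, upper, right, lower) tuple that PIL’s Image.crop() expects.

```python
def relative_to_pixel_box(bbox_dims, width, height):
    """Convert relative bbox_dims (floats in [0, 1], keys x1/y1/x2/y2 as
    described above) into the absolute (left, upper, right, lower)
    pixel box expected by PIL's Image.crop()."""
    return (
        int(bbox_dims["x1"] * width),
        int(bbox_dims["y1"] * height),
        int(bbox_dims["x2"] * width),
        int(bbox_dims["y2"] * height),
    )

box = relative_to_pixel_box({"x1": 0.25, "y1": 0.5, "x2": 0.75, "y2": 1.0}, 640, 480)
```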

run(session_id, images)

This method coordinates the preprocessing. It loops through the provided list of images and crops all recognized objects from each image.

Parameters
  • session_id (str) – Unique identifier of the current session. It is generated by the Raspberry Pi and passed with the POST request.

  • images (list) – List of dicts containing image_name and objects keys. The objects value contains the bounding boxes to be cropped.

vectorizer.vectorizer class

Vectorizer module is intended to load files from disk and calculate feature vectors from them. These feature vectors are used later for clustering. Vectorizer is built on PyTorch/Torchvision. Any model from torchvision.models can be used for vectorization.

class vectorizer.vectorizer.ImageFolderWithPaths(root, transform=None, target_transform=None, loader=<function default_loader>, is_valid_file=None)

Bases: torchvision.datasets.folder.ImageFolder

Extends torchvision.datasets.ImageFolder to add image paths to the dataset. Implements an override for the __getitem__ method, which is called when the dataloader accesses an item from the dataset.

class vectorizer.vectorizer.Vectorizer(base_img_path, model_name, input_dimensions, output_layer='avgpool', output_length=512, stats=None, batch_size=1024, num_workers=4)

Bases: object

This class initiates the chosen neural network, loads the specified images, and computes feature vectors from them. The computation is executed in batches to increase efficiency.

Parameters
  • base_img_path (str) – Absolute path where the downloaded images should be stored. Inside this folder, appropriate subfolders will be automatically created for the original images (named “original”) and the cropped images (named “cropped”).

  • model_name (str) – Name of the neural network architecture to be used for vectorization. Has to be one of the available models of torchvision.models.

  • input_dimensions (tuple) – Dimensions of the input tensor required by the network architecture. The processed images will be resized to these dimensions. Networks usually require square-shaped images.

  • output_layer (str, optional) – The layer of interest’s name in the neural net. The specified layer’s outputs are the vectors, which will be copied from the network and returned as results. To get a model summary with names, simply print the model, like: print(model)

  • output_length (int, optional) – The length of the output vector.

  • stats (dict, optional) – Dict with 2 keys: mean and std. The corresponding values are both lists of 3 elements representing the 3 color channels of images. Specify here the means and standard deviations of the training set on which the used model was trained. Defaults to ImageNet stats.

  • batch_size (int, optional) – Specifies how many images a batch should contain. Use a higher number, preferably a power of 2 for faster processing. Keep in mind that one batch has to fit in memory at once.

  • num_workers (int, optional) – Used by PyTorch’s DataLoader to determine how many subprocesses to use for data loading. Use 0 to load everything on the main process, or a higher number for parallel loading.

compute_vectors()

This function runs the inference on the loaded pictures using the previously selected model.

Returns

  • filenames (list) – List of filenames of cropped images. Also contains the name of the original image and corresponds to the location of the image in the ‘cropped’ folder.

  • vectors (list) – List of resulting vectors.

load_data(data_path)

Loads images from disk.

This function first creates a dataset using ImageFolderWithPaths, which is an extension of PyTorch’s ImageFolder, then it creates a dataloader.

Parameters

data_path (str) – Specifies the location of the images to be loaded. Given the way ImageFolder works, the images have to be in another folder inside the specified folder. The name of that folder would be the label for training, but in the case of inference it is irrelevant.

Returns

found_images – Boolean value representing if any images were found to be vectorized.

Return type

bool

run(session_id, unique_images, objects, n_containers)

This method coordinates the process of vectorization. First, it runs the preprocessor to crop the bounding boxes from the original images, then runs the vectorizer on the cropped images, and finally clusters the resulting vectors.

Parameters
  • session_id (str) – Datetime based unique identifier of the current session. It is generated by the Raspberry Pi and passed with the POST request.

  • unique_images (list) – List of unique image names in the current session.

  • objects (list) – List of dicts containing information describing the recognized objects. image_name and bbox_dims are needed here for cropping the items.

  • n_containers (int) – Number of recognized containers on the current image. Will be used in K-Means as the number of clusters to be created.

Returns

pairings – List of dicts containing filename and cluster keys. The filename includes the original image’s name and the recognized object’s id, the cluster is the index of the cluster to which the particular object belongs.

Return type

list
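
The clustering step can be illustrated with a toy nearest-centroid assignment; the real pipeline runs full K-Means with n_containers clusters, and assign_clusters/make_pairings below are illustrative helpers only.

```python
def assign_clusters(vectors, centroids):
    """Toy stand-in for the K-Means step: assign each vector to its
    nearest centroid (squared Euclidean distance). In the real pipeline,
    n_containers determines the number of centroids."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(centroids)), key=lambda i: sq_dist(v, centroids[i]))
            for v in vectors]

def make_pairings(filenames, vectors, centroids):
    """Build the pairings format described above: one dict per cropped
    image with filename and cluster keys."""
    clusters = assign_clusters(vectors, centroids)
    return [{"filename": f, "cluster": c} for f, c in zip(filenames, clusters)]

pairings = make_pairings(
    ["img_1/item_0.jpg", "img_1/item_1.jpg"],
    [(0.0, 0.1), (5.0, 5.2)],
    [(0.0, 0.0), (5.0, 5.0)],   # two "containers" → two cluster centres
)
```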
