Pipelines
Data processing means running a set of operations that transform data towards a desired outcome. In Timeworx.io, data processing is designed as a pipeline with multiple stages through which data flows. Each stage defines how the data is to be transformed, performs that task and passes the outcome on to the next stage.
Data Types
The first step in defining a data processing pipeline is to define the structure of the data. Timeworx.io supports the following data types:
Text (text/plain, text/html, text/css, text/javascript, text/markdown)
Text data includes anything in written form. This could be blog posts, tweets, product reviews, or any other type of written content. Text analysis methods could include sentiment analysis, topic modelling, keyword extraction, named entity recognition, etc.
Image (image/jpeg, image/png, image/svg+xml, image/gif)
Image data involves any type of graphics or images. This could range from user-uploaded photos to satellite imagery. Image analysis methods could include object detection, image classification, segmentation, facial recognition, etc.
Video (video/mpeg, video/mp4, video/webm)
Video data involves moving visual images. It could be anything from short video clips to full-length movies. Video analysis methods could include scene recognition, action recognition, object tracking, video summarization, etc.
Audio (audio/midi, audio/mpeg, audio/webm, audio/ogg, audio/wav)
Audio data includes any type of sound, music, or speech content. Audio analysis methods could include speech recognition, music classification, audio fingerprinting, emotion recognition from speech, etc.
Tabular (text/csv, application/vnd.ms-excel, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
Tabular data is structured data that's organised in columns and rows, similar to what you'd see in a spreadsheet or a database table. Tabular data analysis methods could include statistical analysis, regression analysis, time series analysis, machine learning models, etc.
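As a rough illustration, the sketch below shows how input files could be validated against the MIME types listed above before entering a pipeline. The mapping and helper function are purely illustrative and are not part of the Timeworx.io API.

```python
# Illustrative only: a minimal mapping of the data types above to their MIME
# types, used to validate a file before it is pushed into a pipeline.
import mimetypes

SUPPORTED_MIME_TYPES = {
    "text": {"text/plain", "text/html", "text/css", "text/javascript", "text/markdown"},
    "image": {"image/jpeg", "image/png", "image/svg+xml", "image/gif"},
    "video": {"video/mpeg", "video/mp4", "video/webm"},
    "audio": {"audio/midi", "audio/mpeg", "audio/webm", "audio/ogg", "audio/wav"},
    "tabular": {
        "text/csv",
        "application/vnd.ms-excel",
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    },
}

def is_supported(path: str, data_type: str) -> bool:
    """Guess the MIME type of a file and check it against a data type."""
    mime, _ = mimetypes.guess_type(path)
    return mime in SUPPORTED_MIME_TYPES.get(data_type, set())

print(is_supported("receipt.png", "image"))   # True
print(is_supported("report.csv", "image"))    # False
```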
Data Processing Types
The platform supports a wide variety of processing tasks that can be applied to the above data types. Based on customer requirements, this list will continue to grow over time.
Classification
Classification, as the name suggests, is the process of assigning classes to objects, actions or concepts in some given piece of data. Processing the data implies detecting, recognizing, understanding and grouping objects into “sub-populations”.
Classification can take many shapes and sizes, and can be applied to a very large number of data types, for instance:
Labelling the emotion in a text, image, audio or video as “Positive”, “Neutral” or “Negative”
Identifying which animal made a specific sound just by listening to it: “Cat” vs “Dog”
Tagging the intensity of traffic from live footage as “High”, “Medium” or “Low”
Recognising an object in an image or a video with a simple binary classification of “Yes” and “No”
Applicability:
Self-driving cars can choose to navigate to an alternative route when the live camera feed detects unforeseen traffic conditions up ahead
Automated checkout counters can separate your cleaning supplies from your vegetables to avoid contamination of your food inside of the shopping bags
Email assistants can flag urgent messages that require immediate attention based on the intensity of the sentiments detected in their content.
Supported data types: text, image, audio, video
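To make the idea concrete, the following sketch trains a toy text classifier that assigns the "Positive", "Neutral" or "Negative" labels mentioned above. The tiny training set is invented for illustration; in practice the labels would come from Agents processing Tasks.

```python
# A toy text classifier in the spirit of the "Positive / Neutral / Negative"
# labelling example above. The dataset is invented purely for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "I love this product, it works great",
    "Absolutely fantastic experience",
    "It arrived on time, nothing special",
    "The package was delivered as described",
    "Terrible quality, broke after one day",
    "Worst purchase I have ever made",
]
labels = ["Positive", "Positive", "Neutral", "Neutral", "Negative", "Negative"]

# TF-IDF features followed by a simple linear classifier
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["The delivery was fine", "This is awful"]))
```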
Image & Video Processing
Through image & video processing, we are able to transform visual media into a digital form from which we can extract valuable information.
There are quite a number of data processing tasks that can be applied to images and videos:
Segmentation plays a huge role in computer vision - it is the process through which an image or a video frame is partitioned into one or more image segments (or regions, or objects). This simplifies the analysis of the image, since each pixel is now associated with a concept, and machines can focus on reasoning based on higher level constructs, or abstractions, instead of needing to manage every pixel in particular.
Image and video manipulation, such as rotation, noise reduction, colour enhancement and cropping, can improve image quality and allow more meaningful information to be extracted from visual media.
Object and activity detection and recognition can be further enhanced by providing bounding boxes to highlight the presence of classified objects.
Applicability:
Analysing satellite images can be used for forest monitoring or urban planning
Inspecting medical imagery can assist doctors in identifying tumours or abnormalities early on
Monitoring crops can help farmers detect crop diseases before they wreak havoc through the fields
Self-driving cars can identify and drive around obstacles, determine if human lives are at risk and attempt evasive manoeuvres
Supported data types: image, video
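As an illustration of the manipulation steps mentioned above (rotation, noise reduction, cropping), the snippet below uses Pillow on a placeholder image file; it is only a sketch, not the platform's processing code.

```python
# Minimal image manipulation sketch with Pillow. File names are placeholders.
from PIL import Image, ImageFilter

img = Image.open("frame.jpg")

rotated = img.rotate(12, expand=True)                          # straighten a tilted frame
denoised = rotated.filter(ImageFilter.MedianFilter(size=3))    # simple noise reduction
cropped = denoised.crop((100, 50, 600, 400))                   # keep only the region of interest

cropped.save("frame_clean.jpg")
```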
Natural Language Processing
Communication plays a pivotal role in human interaction, and the capability of extracting information from conversations and other data in any format yields immense potential for engaging experiences.
Natural Language Processing (NLP) tasks can be applied to various data types:
Transcribing information from text, image, audio or video content
Summarising and extracting key information from text, audio or video content
Speech detection and voice recognition in audio or video content
Applicability:
Companies can translate their videos into the native languages of all of their target audiences
Educational platforms can summarise their content to provide students with the key highlights of a given course subject
Organisations can digitise analogue documents (e.g., handwritten or printed forms) into digital formats
Supported data types: text, image, audio, video, tabular
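As a simple illustration of the summarisation task, the sketch below scores sentences by word frequency and keeps the highest-scoring ones; it is a deliberately naive heuristic, not the approach used by the platform or its Agents.

```python
# A deliberately simple extractive summariser: sentences are scored by the
# frequency of their words across the text, and the top ones are kept.
import re
from collections import Counter

def summarise(text: str, max_sentences: int = 2) -> str:
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"\w+", text.lower()))
    scored = sorted(
        sentences,
        key=lambda s: sum(freq[w] for w in re.findall(r"\w+", s.lower())),
        reverse=True,
    )
    top = set(scored[:max_sentences])
    # preserve the original sentence order in the summary
    return " ".join(s for s in sentences if s in top)

print(summarise(
    "The course covers data pipelines. Pipelines have multiple stages. "
    "Each stage transforms data. Lunch is at noon."
))
```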
Sentiment Analysis
Sentiment analysis, a branch of Natural Language Processing, at its core, involves the computational study of opinions, sentiments, attitudes, and emotions expressed in data. It's about discerning whether a piece of writing is positive, negative, or neutral. In more advanced cases, it can go as far as identifying specific emotions such as happiness, anger, or disappointment.
Applicability:
Governments can tune in to citizens' concerns and needs, shaping policies and services to match the expectations of the population
Marketers can understand how customers perceive their products or brands and build strategies that adapt to brand perception and customer feedback
Businesses can employ sentiment analysis to gauge the effectiveness of marketing campaigns, conduct market research, spot emerging trends and competitive insights, and automate customer support
Supported data types: text, image, audio, video
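For illustration only, the snippet below uses NLTK's off-the-shelf VADER analyser to map a piece of text to the Positive / Neutral / Negative labels discussed above; it is one common baseline, not the platform's sentiment model.

```python
# One common off-the-shelf baseline for text sentiment analysis, shown purely
# for illustration: NLTK's VADER analyser returns polarity scores that can be
# mapped to Positive / Neutral / Negative labels.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

scores = sia.polarity_scores("The support team was incredibly helpful!")
if scores["compound"] > 0.05:
    label = "Positive"
elif scores["compound"] < -0.05:
    label = "Negative"
else:
    label = "Neutral"
print(scores, label)
```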
Ranking
Ranking is a fundamental technique in the field of information retrieval through which data items are sorted by relevance, from most important to least, based on specified criteria.
Applicability:
Generative AI models are fine-tuned by understanding how users rank the relevance of the data generated by different models
Tourism employs ranking for rating attractions, restaurants, hotels and points of interest based on relevance, pricing, popularity and quality
Social Media uses ranking for serving relevant content to users based on previous history of interaction and relationships with other users
Supported data types: text, image, audio, video
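A minimal sketch of ranking, echoing the tourism example above: points of interest are sorted by a weighted relevance score. The fields and weights are illustrative assumptions rather than a prescribed scoring model.

```python
# Rank points of interest by a weighted relevance score (illustrative only).
attractions = [
    {"name": "Old Town Walking Tour", "popularity": 0.9, "quality": 0.8, "price": 0.2},
    {"name": "Modern Art Museum",     "popularity": 0.6, "quality": 0.9, "price": 0.5},
    {"name": "Harbour Boat Ride",     "popularity": 0.8, "quality": 0.6, "price": 0.7},
]

def relevance(item: dict) -> float:
    # higher popularity and quality increase relevance, higher price lowers it
    return 0.5 * item["popularity"] + 0.4 * item["quality"] - 0.1 * item["price"]

for item in sorted(attractions, key=relevance, reverse=True):
    print(f"{item['name']}: {relevance(item):.2f}")
```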
Building Blocks
The platform allows businesses to define their data processing requirements through a set of intuitive and highly customizable building blocks.
Ingestion Block
Every pipeline must start with an Ingestion Block, which is responsible for connecting the customer's data to the pipeline. An Ingestion Block must specify the data type and the source of the data:
Static: the customer is assigned a remote directory in which they can upload and store data files matching a corresponding MIME type for populating the pipeline.
Dynamic: the customer is able to connect a REST API for pushing data into the pipeline.
The platform also accounts for use cases in which customers do not have data to begin with. As such, customers can configure Ingestion Blocks that require Agents to provide the data for populating the pipeline, for example:
Agents are capable of producing images and/or video directly from the phone’s camera feed accessed from the mobile application, based on a given prompt.
Agents are capable of producing audio data directly from the phone’s microphone accessed from the mobile application, based on a given prompt.
Agents are capable of producing textual data based on a given prompt.
Based on the selected data type, customers can configure metadata to specify additional requirements, e.g., the maximum length of a video or audio stream, deactivating the flash when capturing images, etc.
Custom Ingestion Blocks are akin to Agent Blocks, since they require remuneration for Agents that provide the data, similar to data processing tasks.
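As a sketch of the dynamic source described above, a customer-side script might push a data item into a pipeline over REST along the following lines. The endpoint URL, form fields and authentication header are assumptions made for illustration, not the documented API.

```python
# Illustrative sketch of pushing an image into a pipeline through a dynamic
# (REST) ingestion source. The URL, fields and auth header are hypothetical.
import requests

ENDPOINT = "https://api.example.com/pipelines/receipt-pipeline/ingest"  # hypothetical URL

with open("receipt.jpg", "rb") as f:
    response = requests.post(
        ENDPOINT,
        files={"file": ("receipt.jpg", f, "image/jpeg")},
        data={"data_type": "image", "source": "courier-app"},   # illustrative form fields
        headers={"Authorization": "Bearer <api-token>"},         # placeholder token
        timeout=30,
    )

response.raise_for_status()
print("Ingested:", response.status_code)
```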
Agent Block
Agent Blocks define the data processing task that needs to be performed by an Agent (human or AI) on the input data. Since this implies that the data is processed by Agents, each Agent Block defines a default replication factor which specifies how many copies of a given Task will be distributed to Agents.
Each Agent Block must specify a Consensus algorithm to be applied when establishing the ground truth in the decentralised data processing protocol, upon the Agents completing all replicas of a given Task.
The platform provides built-in Consensus algorithms which can be configured by the customer. Some examples include:
Statistic: the consensus value is chosen based on the median and standard deviation of all values.
Most frequent: the consensus value is the value that has been provided by the largest number of Agents.
Union: the consensus value is composed of all of the values provided by Agents.
Intersection: the consensus value is composed of only the values that all Agents have provided in common.
Additionally, customers can provide custom Consensus algorithms written in Python. The platform generates code stubs that can be implemented and deployed by the customer through a basic coding interface exposed by the UI.
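For instance, custom implementations of the "Most frequent" and "Statistic" strategies might look like the sketch below. The function signatures are only a guess at what the generated stubs could look like.

```python
# Sketches of the "Most frequent" and "Statistic" consensus strategies.
# The signatures are assumptions about the generated stubs, not the real ones.
import statistics
from collections import Counter

def most_frequent_consensus(values: list) -> object:
    """Pick the value submitted by the largest number of Agents."""
    return Counter(values).most_common(1)[0][0]

def statistic_consensus(values: list[float], max_deviations: float = 1.0) -> float:
    """Keep values within max_deviations standard deviations of the median,
    then return their mean as the consensus value."""
    median = statistics.median(values)
    stdev = statistics.stdev(values) if len(values) > 1 else 0.0
    inliers = [v for v in values if abs(v - median) <= max_deviations * stdev] or values
    return statistics.mean(inliers)

print(most_frequent_consensus(["12.50", "12.50", "12.80"]))    # "12.50"
print(statistic_consensus([89.5, 90.0, 90.5, 140.0]))          # outlier 140.0 is discarded
```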
Transformation Block
In certain cases, a customer can decide to run a mathematical or algorithmic transformation on the data, which does not require an Agent’s intervention.
Examples include:
Rotation of an image at an angle specified as the input data
Filtering the input data based on specific criteria
Choosing the most relevant result from ranking data
Additionally, customers can provide custom transformation algorithms written in Python. The platform generates code stubs that can be implemented and deployed by the customer through a basic coding interface exposed by the UI.
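As an illustration, custom transformations covering two of the examples above (filtering by a criterion and choosing the most relevant ranked result) might look like the following sketch; the signatures are assumptions rather than the actual generated stubs.

```python
# Sketches of custom transformations: filtering input items by a criterion,
# and picking the most relevant result from ranking data. Signatures are
# illustrative assumptions, not the actual generated stubs.
def filter_items(items: list[dict], min_confidence: float = 0.8) -> list[dict]:
    """Drop items whose confidence score is below the threshold."""
    return [item for item in items if item.get("confidence", 0.0) >= min_confidence]

def top_ranked(items: list[dict]) -> dict:
    """Choose the most relevant result from ranked data."""
    return max(items, key=lambda item: item.get("rank_score", 0.0))

results = [
    {"label": "receipt", "confidence": 0.95, "rank_score": 0.9},
    {"label": "invoice", "confidence": 0.40, "rank_score": 0.3},
]
print(filter_items(results))
print(top_ranked(results))
```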
Connecting the dots
Customers define data processing pipelines by choosing the appropriate building blocks and linking them together in the sequence in which data should flow from one block to the next.
Let’s illustrate how pipelines are defined through an example: A logistics company needs accountability for all receipts that are passed between couriers. They currently rely on the couriers to transcribe and pass this information back. Due to human error (usually accidental), their records do not match with the receipts reported from the field.
The use case target is to build a machine learning model that automatically extracts the Total Value and Total VAT Value from a picture of a receipt. Furthermore, the customer does not have an initial dataset of receipt images.
The pipeline requires the following building blocks:
A custom Ingestion Block for generating images of receipts:
Max replication factor: 2000
Padding: 10% of the screen width, 10% of the screen height
An Agent Block for rotating the images such that the text is upright
Consensus: mean angle
An Agent Block for creating a bounding box around the receipt
Consensus: mean polygon
An Agent Block for identifying the “Total amount” label on a receipt
Consensus: mean brush vector
An Agent Block for identifying the Total amount value on a receipt
Consensus: mean brush vector
An Agent Block for identifying the “Total VAT” label on a receipt
Consensus: mean brush vector
An Agent Block for identifying the Total VAT amount value on a receipt
Consensus: mean brush vector
An Agent Block for transcribing the Total amount value from a receipt
Consensus: most frequent value
An Agent Block for transcribing the Total VAT amount value from a receipt
Consensus: most frequent value
Figure 7 depicts how the above building blocks have been linked to form a data processing pipeline. When two building blocks are linked, the output of the first block becomes the input of the second. Aside from sequential pipelines, parallelism can be achieved by feeding the output of a block as input to multiple other blocks.
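To give a feel for how such a definition might be expressed, the sketch below lists the receipt pipeline as an ordered sequence of blocks in which each block feeds the next. The structure and field names are illustrative and do not represent the platform's actual configuration format.

```python
# Illustrative, simplified representation of the receipt pipeline as an
# ordered list of blocks. Field names are assumptions, not the real schema.
receipt_pipeline = [
    {"block": "ingestion", "kind": "custom", "data_type": "image",
     "max_replication_factor": 2000, "padding": "10% width, 10% height"},
    {"block": "agent", "task": "rotate image upright", "consensus": "mean angle"},
    {"block": "agent", "task": "bounding box around receipt", "consensus": "mean polygon"},
    {"block": "agent", "task": "identify 'Total amount' label", "consensus": "mean brush vector"},
    {"block": "agent", "task": "identify Total amount value", "consensus": "mean brush vector"},
    {"block": "agent", "task": "identify 'Total VAT' label", "consensus": "mean brush vector"},
    {"block": "agent", "task": "identify Total VAT amount value", "consensus": "mean brush vector"},
    {"block": "agent", "task": "transcribe Total amount value", "consensus": "most frequent value"},
    {"block": "agent", "task": "transcribe Total VAT amount value", "consensus": "most frequent value"},
]

# Each block's output becomes the input of the block that follows it.
for previous, current in zip(receipt_pipeline, receipt_pipeline[1:]):
    print(f"{previous['task'] if 'task' in previous else previous['block']} -> "
          f"{current['task']}")
```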
By executing this data processing pipeline, the customer achieved the following outcomes:
Ingested 2000 quality images of receipts used for training the ML models in the next steps
Created a fine-tuned ML model for rotating the receipt such that the writing is upright (increasing the quality of extracting data)
Created a fine-tuned ML model for determining the boundary of receipts in an image
Created a fine-tuned ML model for identifying the Total Value and Total VAT value from an image of a receipt
Created a fine-tuned ML model for extracting the Total Value and Total VAT value from a picture of a receipt
This not only shows that Timeworx.io can be used to achieve the end goal of building a machine learning model that automatically extracts the Total Value and Total VAT Value from a picture of a receipt, but also that intermediate datasets processed by each building block can be, in turn, used to train specialised machine learning models for specific operations.