AI in Smart Buildings #4 — Scenarios

Oct 16, 2022 | Jagannath Rajagopal | 10 min read


During ideation, we made decisions on the methods needed for our initial build in the list of experiments run for prototyping. To build the architecture, we go method by method in our design. Here’s what you had at the end of the last step —


Let’s start. The goal in Scenarios is to assign techniques to each step and build a spec that your team of Data Scientists, ML Engineers and developers can prototype.

We aim to be illustrative and pedagogical in showcasing how Hero Methods may be applied to complex problems in AI. Actual problems may involve fewer or more steps based on the scenario.

- - -

1. Feature Extraction

Method choice for initial build

Morphometric: “Extract features (sub-areas of images corresponding to eyes, mouth, nose, for instance) by individual based on a morphometric algorithm and then generate a numerical transformation of these and store them into a content-indexed database. For a candidate picture, the process of querying the database is to extract features, to perform the same transformation for the sub-areas, and to find the closest record in the DB to these values.”

Technique selection

  1. Use SIFT (Scale Invariant Feature Transform) to identify key points in faces.
  2. Obtain morphometric vectors of faces: These vectors represent the locations of several fiducial facial landmark points (mostly defining the facial contour and certain facial components). The calculation of these vectors can be done as a linear regression problem (e.g., ridge, ElasticNet, etc.) to adjust a face model with the representative landmarks (contour, eyes, mouth, nose, …).
  3. Perform content-based indexing for key features: Techniques for indexing utilize similarity or distance as a measure, and are based on transforming the feature space into a lower-dimensional space (using PCA, hierarchical clustering or t-SNE) and then using this as the indexing criterion.
  4. Store the content-indexed features in a Biometrics Database: There are existing technologies that support CBIR (Content-based image retrieval) or QBE (query by example) functionalities, e.g., IBM QBIC, Elastic Vision, or VIR Image Engine (Virage).

Alternatives

  1. Equivalent to SIFT, there are other feature extraction techniques such as SURF or BRIEF. Alternatively, one can use local appearance-based methods such as LBP, HOG or LPQ.
  2. For content-based indexing, alternative techniques include those that are based on Haar features (wavelet transforms).


- - -

2. Building Model

Method choice for initial build

“The most basic information includes spaces (rooms/hallways) and their connectivity (doors and staircases). Therefore, the simplest option would be a graph model in which the nodes are spaces and edges represent how they are connected. This option can be extended by enriching the information for each node and edge with additional attributes. This can be categorical (such as type/function of the space — restroom, corridor, food court, …) or numerical (volume/surface area, dimensions and sizes, how long it takes to move across the room or cross a connection etc).”

Technique selection

Here is the list we want to initially try

  1. Build an ETL integration to extract information about spaces/connectivity out of the Building Information Modeling (BIM) model and into a graph database.
  2. Build an ETL integration to extract the above-mentioned categorical and numerical information about the building and map them to attributes of nodes/edges in the graph database.
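
As a rough sketch of the target data model, assuming a plain in-memory structure rather than a real graph database, the code below stores spaces as nodes and connections as edges, each with a few attributes. The class names, attribute names and sample values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Space:
    """A node: one room, hallway or other space in the building."""
    name: str
    function: str      # categorical attribute, e.g. "restroom"
    area_m2: float     # numerical attribute


@dataclass
class BuildingGraph:
    spaces: dict = field(default_factory=dict)
    links: dict = field(default_factory=dict)

    def add_space(self, space):
        self.spaces[space.name] = space
        self.links.setdefault(space.name, {})

    def connect(self, a, b, kind, crossing_s):
        """Add an undirected edge with its own attributes."""
        if a not in self.spaces or b not in self.spaces:
            raise KeyError("both spaces must be added before connecting")
        attributes = {"kind": kind, "crossing_s": crossing_s}
        self.links[a][b] = attributes
        self.links[b][a] = attributes

    def neighbours(self, name):
        return sorted(self.links[name])


# Made-up sample data standing in for what the ETL step would load.
mall = BuildingGraph()
mall.add_space(Space("atrium", "corridor", 400.0))
mall.add_space(Space("food_court", "food court", 250.0))
mall.add_space(Space("restroom_1", "restroom", 30.0))
mall.connect("atrium", "food_court", kind="door", crossing_s=20)
mall.connect("atrium", "restroom_1", kind="door", crossing_s=10)
```

In a real build, the ETL integration would fill the same kind of structure from the BIM model instead of from hand-written calls.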

Alternatives

  1. If the BIM Model has an API for querying and extracting its information, you may choose to have the integration use that instead. If no API exists, building one is non-trivial and should only be considered if there exists the need to expose the data within a BIM system to third parties or other systems within the org.
  2. There is also academic work like BIMQL, which exposes data in a BIM system to a user in an interactive, programmatic, SQL-like environment.



- - -

3. Regulation Extraction

Method choice for initial build

“A solution may need a combination of both, having an NLP model extract structures in a regulation and then an expert post-process these results to speed up the task of creating the ontology.”

Technique selection

Here is the list we want to initially try.

  1. Define concepts (classes and their relationships) in building regulations, as the initial step in creating the regulation ontology. This should include types of rules and regulations; types may be by part of building [roof vs floors vs structure], or by function [HVAC vs security vs emergency procedures], etc. There are three parts here:
     a) Use NLP techniques to extract summary-level terms: chapter titles, headings, table of contents.
     b) Use these techniques to identify key concepts and those indicating regulatory constraints [sentences such as “corridors connecting open areas must be at least 5 meters wide”].
     c) Identify hierarchical structures (e.g. a building has floors, the floors have spaces, …) and their taxonomies [distinction between public areas and restricted areas, …].
  2. Model individual rules to be loaded into the ontology. There are four steps here:
     a) Apply fundamental NLP techniques (lemmatization, stemming, part-of-speech (POS) tagging, sentence parsing) to building regulations.
     b) Model sentence/paragraph structures to be used as input in creating ontologies.
     c) Extract rules and map them to concept/type of rule.
     d) This process of creating the ontology shall be led by an expert in knowledge representation, with assistance from NLP components in the definition of the knowledge base (KB).
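
To illustrate what "extract rules and map them" could produce, here is a deliberately simple sketch that uses a hand-written regular expression as a stand-in for the NLP components. It handles only sentences shaped like the corridor example above; the pattern and the output fields are assumptions made for illustration, and a real pipeline would rely on parsing and on the expert-led ontology.

```python
import re

# Hypothetical pattern: "<subject> must be at least|at most <number> <unit> <attribute>"
RULE_PATTERN = re.compile(
    r"(?P<subject>.+?)\s+must be\s+(?P<bound>at least|at most)\s+"
    r"(?P<value>\d+(?:\.\d+)?)\s+(?P<unit>\w+)\s+(?P<attribute>\w+)",
    re.IGNORECASE,
)


def extract_constraint(sentence):
    """Return a structured constraint, or None if the pattern does not match."""
    match = RULE_PATTERN.search(sentence)
    if match is None:
        return None
    return {
        "subject": match["subject"].strip().lower(),
        "attribute": match["attribute"].lower(),
        "operator": ">=" if match["bound"].lower() == "at least" else "<=",
        "value": float(match["value"]),
        "unit": match["unit"].lower(),
    }


rule = extract_constraint(
    "Corridors connecting open areas must be at least 5 meters wide"
)
not_a_rule = extract_constraint("The lobby is large")
```

The structured result is the kind of record an expert could review and attach to a concept in the ontology.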

Alternatives

Use neural networks (deep learning) for lemmatization, stemming, POS tagging, parsing, etc. Adapt pre-built language models as well as deep net architectures for this purpose.


- - -

4. Person Detector

Method choice for initial build

Use Computer Vision (CV) algorithms to detect people in video. “This option is sometimes very limiting if you want to identify the nature of the object moving in the scene (e.g., is it a person, a dog, a baby stroller). This can be done by using some specific algorithms that detect simple components (mostly shapes).”

Technique selection

Here is the list of CV techniques we want to initially try, by what they do:

  1. Detect changes in image/frame
  2. Foreground/background segmentation
  3. Object tracking/Haar filter.
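
The first step, detecting changes between frames, can be sketched without any CV library, assuming frames arrive as small grids of grayscale values. The threshold and the toy frames below are made up; a real prototype would use a library such as OpenCV on actual video.

```python
def changed_pixels(previous, current, threshold=25):
    """Binary mask: 1 where brightness changed by more than `threshold`."""
    return [
        [1 if abs(cur - prev) > threshold else 0
         for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


def bounding_box(mask):
    """Smallest (top, left, bottom, right) box around changed pixels, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


# Toy 5x6 frames: a dark background, then a bright blob appears.
background = [[10] * 6 for _ in range(5)]
frame = [row[:] for row in background]
for r in range(1, 4):
    for c in range(2, 4):
        frame[r][c] = 200

mask = changed_pixels(background, frame)
box = bounding_box(mask)
```

The box marks where something moved; deciding whether it is a person is left to the later steps in the list.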

Alternatives

Alternatives range from basic object tracking techniques such as Centroid Tracking to more advanced computer vision techniques such as Boosting Trackers (AdaBoost + Haar cascades), Discriminative Correlation Filters, Kernelized Correlation Filters and Median Flow Trackers. There are also deep learning alternatives such as the GOTURN tracker.


- - -

5. Pose Identifier

Method choice for initial build

“One approach is the identification of joints: detecting knees, elbows, hips and other major joints, and drawing a basic skeleton structure (a type of 12-to-20-segment wireframe representation) of the person detected. Sub-steps include (a) detecting joints, (b) estimating skeleton, and (c) characterizing pose.”

Technique selection

Here is the list we want to initially try

  1. Detecting joints: Spectral Graph Wavelet Transform (SGWT) + Hidden Markov Model (HMM)
  2. Estimating skeleton: Multi-level Histogram of Oriented Gradients (HOG) + Support Vector Machine (SVM)
  3. Characterizing pose: any model with memory, e.g., an LSTM
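
Once joints and a skeleton are available, one simple feature you might feed to a model with memory is the angle at a joint, frame by frame. The sketch below is only an illustration of that idea, assuming 2D joint coordinates; the joint positions are made up, and the choice of joint angles as features is an assumption rather than something prescribed above.

```python
import math


def joint_angle(a, joint, b):
    """Angle in degrees at `joint` between the segments joint->a and joint->b."""
    v1 = (a[0] - joint[0], a[1] - joint[1])
    v2 = (b[0] - joint[0], b[1] - joint[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("a segment has zero length")
    cosine = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding error
    return math.degrees(math.acos(cosine))


# Hypothetical (hip, knee, ankle) positions over two frames.
frames = [
    ((0.0, 0.0), (0.0, 1.0), (0.0, 2.0)),  # straight leg
    ((0.0, 0.0), (0.0, 1.0), (1.0, 1.0)),  # knee bent at a right angle
]
knee_angles = [joint_angle(hip, knee, ankle) for hip, knee, ankle in frames]
```

A sequence like `knee_angles`, extended to all joints, is the sort of time series an LSTM could consume.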

Alternatives

  1. To estimate skeletons, an alternative is one that defines a model of the distances between the joints (based on the length of the bones/limbs) and solves the problem as a quadratic optimization fit.
  2. To characterize a pose, one can use CNN-based approaches such as a Temporal Convolutional Network (TCN) as an alternative to the LSTM.


- - -

6. Crowd Model

Method choice for initial build

“The simpler option is a discrete-event simulation (mostly a queue system) that identifies the number of people per node of the graph (spaces in the building). There is a given task the people perform in each node (like walking, shopping, looking at the items displayed in stores, going to the restroom, etc.) and the probability or plan to transit to the next node.” + “A solver may be used to chart pathways through the building. Depending on the size and complexity of the building, the number of theoretical pathways can be very large. For a simulation, a solver may be used to determine the relevant options out of the existing ones. Also, a solver may be used to react to situations within a mall, like a large crowd, a closed corridor, or a broken elevator.” + “If we choose the Discrete Event Simulation way, we will need to model Behaviour-Design-Intent as pre-processing for Step 7.”

Technique selection

Here is the list we want to initially try
  1. Discrete Event Simulation using an engine such as SimPy or AnyLogic
  2. Pathfinding (A*)
  3. Preprocessing for an Agent-based model — Behaviour-Design-Intent (BDI) analysis
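
Here is a minimal sketch of the pathfinding piece, assuming the building graph carries walking distances on its edges and a rough (x, y) position per space. It is a textbook A* with straight-line distance as the heuristic; the floor plan, the coordinates and the distances are made up for illustration.

```python
import heapq
import math


def a_star(graph, coords, start, goal):
    """Return (path, cost) for the cheapest route, or (None, inf) if none."""
    def heuristic(node):
        return math.dist(coords[node], coords[goal])

    frontier = [(heuristic(start), 0.0, start, [start])]
    best_cost = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, cost
        if cost > best_cost.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, step in graph[node].items():
            new_cost = cost + step
            if new_cost < best_cost.get(neighbour, math.inf):
                best_cost[neighbour] = new_cost
                priority = new_cost + heuristic(neighbour)
                heapq.heappush(
                    frontier, (priority, new_cost, neighbour, path + [neighbour])
                )
    return None, math.inf


def build_graph(edges):
    """Undirected adjacency dict from (a, b, distance_m) triples."""
    graph = {}
    for a, b, distance in edges:
        graph.setdefault(a, {})[b] = distance
        graph.setdefault(b, {})[a] = distance
    return graph


# Hypothetical floor plan: positions in metres and walking distances.
coords = {
    "entrance": (0, 0), "atrium": (10, 0), "corridor": (10, 10),
    "shop": (20, 0), "food_court": (20, 10),
}
edges = [
    ("entrance", "atrium", 10), ("atrium", "corridor", 10),
    ("corridor", "food_court", 10), ("atrium", "shop", 10),
    ("shop", "food_court", 12),
]
normal_path, normal_cost = a_star(build_graph(edges), coords, "entrance", "food_court")

# React to a closed corridor by dropping its edge and re-planning.
open_edges = [e for e in edges if e[:2] != ("corridor", "food_court")]
detour_path, detour_cost = a_star(build_graph(open_edges), coords, "entrance", "food_court")
```

Re-running the solver on a modified graph is one simple way to model the solver reacting to a closed corridor or a broken elevator.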

Alternatives

  1. To simply model the occupation of areas, the problem can be solved as a queue system with a Markovian model. This approach limits the control mechanism that governs the agents and it only models density of people per area. In this case, the underlying behaviour model is a random walk.
  2. A more complex definition of agents requires assigning goals and different strategies to pursue them, including at least a rule-based model or a controller based on behaviour trees. For discrete-event models, use software libraries like SimPy or languages like Simula. For agent-based models, counterparts include AgentPy, AgentBase etc. In either situation, since atomic operations are trivial, effort so far has focussed on developing software libraries, languages and apps for this simulation, as opposed to complex mathematical models such as neural networks.
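
The Markovian alternative can be sketched in a few lines, assuming a fixed table of transition probabilities between spaces. Each step moves the expected headcount of every space according to that table, which is the random-walk behaviour described above. The spaces and probabilities are made-up values.

```python
def step(occupancy, transitions):
    """Advance expected headcounts by one time step of the Markov chain."""
    following = {space: 0.0 for space in transitions}
    for space, count in occupancy.items():
        for target, probability in transitions[space].items():
            following[target] += count * probability
    return following


# Hypothetical transition probabilities; each row sums to 1.
transitions = {
    "atrium": {"atrium": 0.5, "shop": 0.25, "food_court": 0.25},
    "shop": {"shop": 0.5, "atrium": 0.5},
    "food_court": {"food_court": 0.75, "atrium": 0.25},
}
occupancy = {"atrium": 100.0, "shop": 0.0, "food_court": 0.0}

after_one = step(occupancy, transitions)
after_two = step(after_one, transitions)
```

Note that this only tracks density per area; individual agents, goals and plans are exactly what this simpler model leaves out.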


- - -

7. Behaviour Model

Method choice for initial build

“Crowd behaviour can be modelled as an extension of agent-planning. The easiest approach is to have a basic perception model and a simple internal state representation that allows the agents to switch between plans depending on some fixed rules (e.g. I’m tired so I am going to stop shopping, find a bench, and eat something or leave the mall). A more advanced model would consider the interpretation of the environment. There are computational cognitive models that implement perception and elicitation processes. If the simulation deals with, for instance, emergency simulations such as a fire or people evacuating the building in a hurry, these models better capture these situations. Regardless of simulation approach, a learning model is needed to characterize the type of crowd behaviour”
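
The "easiest approach" in the passage above can be sketched as an agent with a small internal state and fixed rules for switching plans. Everything specific here (the state variables, thresholds and plan names) is an assumption chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Shopper:
    fatigue: int = 0
    hunger: int = 0
    plan: str = "shop"


def choose_plan(agent, alarm):
    """Fixed rules, checked in priority order."""
    if alarm:
        return "evacuate"
    if agent.fatigue >= 8:
        return "leave"
    if agent.hunger >= 5:
        return "eat"
    if agent.fatigue >= 5:
        return "rest"
    return "shop"


def tick(agent, alarm=False):
    """Pick a plan from perception and state, then update the state."""
    agent.plan = choose_plan(agent, alarm)
    if agent.plan == "shop":
        agent.fatigue += 1
        agent.hunger += 1
    elif agent.plan == "rest":
        agent.fatigue = max(0, agent.fatigue - 3)
        agent.hunger += 1
    elif agent.plan == "eat":
        agent.hunger = 0
        agent.fatigue = max(0, agent.fatigue - 1)
    return agent.plan


shopper = Shopper()
plans = [tick(shopper) for _ in range(8)]
alarm_plan = tick(shopper, alarm=True)
```

The more advanced cognitive models mentioned above would replace `choose_plan` with something that interprets the environment rather than checking fixed thresholds.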

Technique selection

  1. Perception and Elicitation — Emotional Elicitation Process (EEP) in Computational Cognitive Psychology
  2. Classifying behaviour — A simple learner like an SVM or a Probabilistic Graphical Model (PGM).

Alternatives

An alternative is to use simpler intention models such as WASABI or GAMA (Agent Based Modelling — AMT Techniques for Agent-Based Simulation).


- - -

8. Surveillance

Method choice for initial build

“The problem may be formulated so that learning is still a part of it.
  1. Input: Raw video sequence from the surveillance cameras & crowd model outcomes
  2. Task: Segment faces in video, correct for perspective and occlusion (and so on), de-scale and extract facial feature vectors, and match to morphometric vectors from the biometrics database. Machine learning models, especially deep neural networks, are famous for not requiring hand-crafting of features and filters in image recognition, so the first parts (segmenting, correcting and extracting) could potentially be set up as a data-driven task. The matching process would then utilize a simple distance measure to determine which morphometric vector the extracted facial feature vector is closest to.
  3. Output: Name, Trespasser Yes/No, Degree of Certainty”

Technique selection

Here is the list we want to initially try
  1. Convolutional Neural Network or any state-of-the-art Deep Learning Model for Image Recognition. This would be trained on a large database of raw video of people with scale-free morphometric vectors of faces as labels. The model will need to recognize all faces in video involving crowds, and extract scale-free morphometric vectors.
  2. Simple distance measure to evaluate and identify best possible reference vectors for a given face in a video feed. Uses extracted morphometric vectors from the prior step.
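
The matching step can be sketched as a nearest-neighbour lookup with a cut-off, assuming the network has already produced a scale-free vector for a face. The Euclidean distance, the threshold and the way certainty is derived from distance are all illustrative assumptions, as are the names and vectors in the sample database.

```python
import math


def identify(face_vector, database, max_distance=0.5):
    """Match a face vector to the closest reference vector.

    Returns the three outputs named in the problem formulation. The
    certainty formula 1 / (1 + distance) is a placeholder heuristic.
    """
    if not database:
        raise ValueError("biometrics database is empty")
    name, distance = min(
        ((person, math.dist(face_vector, reference))
         for person, reference in database.items()),
        key=lambda pair: pair[1],
    )
    trespasser = distance > max_distance
    return {
        "name": None if trespasser else name,
        "trespasser": trespasser,
        "certainty": 1.0 / (1.0 + distance),
    }


# Made-up reference vectors standing in for the biometrics database.
known_faces = {"alice": [1.0, 0.5, 1.2], "bob": [1.0, 0.9, 1.6]}

match = identify([1.0, 0.5, 1.2], known_faces)
stranger = identify([3.0, 3.0, 3.0], known_faces)
```

In practice the threshold would have to be tuned on real data, since it trades false alarms against missed trespassers.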

Alternatives

A possibility is to try the same approach that we used for Step 1, but we should take into account that the quality of the image will limit the accuracy.

- - -

9. Preventive Security Engine

Method choice for initial build

“The core of the system can be a rule-based engine. These are either implemented in logic programming languages (e.g., Prolog or Lisp) or use an intermediate programming format to describe rules. This engine shall be able to propose a sequence of actions to deal with a particular situation based on the current state of the system (occupation of the building). Sometimes, these engines use extended implementations based on partial order planning. This extends the basic rule-based systems to deal with a problem by proposing a sequence of actions (several of which can be performed in alternative orders) to find the objective state of the system”.

Technique selection

Here is the list we want to initially try
  1. A Partial Order Planner or a total-order planner such as STRIPS.
  2. Using an intermediate format to define the rules for the planner (e.g., Planning Domain Definition Language (PDDL) or Action description language (ADL))
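
To show what "propose a sequence of actions" means in practice, here is a small sketch of a STRIPS-style planner: each action has preconditions, facts it adds and facts it deletes, and a breadth-first search looks for a sequence that reaches the goal. This is a plain forward state-space search, not a partial-order planner, and the security domain (facts and actions) is invented for illustration. A real build would express the domain in PDDL and use an existing planner.

```python
from collections import deque, namedtuple

Action = namedtuple("Action", "name preconditions add_effects delete_effects")


def plan(initial, goal, actions):
    """Breadth-first search over states; returns a list of action names or None."""
    start = frozenset(initial)
    goal = frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:
            return steps
        for action in actions:
            if action.preconditions <= state:
                following = (state - action.delete_effects) | action.add_effects
                if following not in seen:
                    seen.add(following)
                    queue.append((following, steps + [action.name]))
    return None


# Hypothetical domain: a large crowd in the atrium while corridor B is closed.
actions = [
    Action("open_corridor_b",
           frozenset({"corridor_b_closed"}),
           frozenset({"corridor_b_open"}),
           frozenset({"corridor_b_closed"})),
    Action("redirect_crowd",
           frozenset({"crowd_in_atrium", "corridor_b_open"}),
           frozenset({"crowd_dispersed"}),
           frozenset({"crowd_in_atrium"})),
    Action("dispatch_guard",
           frozenset(),
           frozenset({"guard_at_atrium"}),
           frozenset()),
]
steps = plan(
    initial={"crowd_in_atrium", "corridor_b_closed"},
    goal={"crowd_dispersed", "guard_at_atrium"},
    actions=actions,
)
```

Here `dispatch_guard` could be performed at any point in the sequence, which is the kind of ordering freedom a partial-order planner would represent explicitly.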

Alternatives

  1. An implementation of a Rules-based Engine using Logic Programming.
  2. It is possible to consider a planner that uses probabilistic premises, but in a context like security or safety this would not be justified or even desirable.


- - -

In the next article, we’ll discuss prototyping & assembly.

At Kado, combining methods into an architecture to solve complex problems is what we do. Here’s the why — cuz this is compelling!