What is SignifAI?
SignifAI is a cloud-based machine intelligence platform that helps DevOps and site reliability teams guarantee more uptime.
How does SignifAI work?
SignifAI uses AI and machine learning to correlate log, metric and event monitoring data, plus an organization’s own operational expertise to deliver the following benefits:
What is SAM?
SAM stands for Site Reliability Engineer (SRE) Augmented Member. Think of SAM as a virtual team member who is always “on call”, remembers the cause and resolution of all incidents, and learns from your team’s interactions with SignifAI. SAM helps reduce alert noise, correlate data, take automatic decisions on your behalf, deliver predictive insights and suggest solutions.
What monitoring, deployment and collaboration tools does SignifAI integrate with?
You can find a complete list of over sixty supported integrations here.
What is a “sensor?”
A “sensor” is how SignifAI integrates with various monitoring, notification and collaboration tools. There are three ways in which an organization can integrate their existing tools with SignifAI:
What is SignifAI’s “Web Collector?”
A Web Collector sensor integrates SignifAI with your tool over a webhook. In order to do so, you will need to have the right permissions, as well as the ability to choose which type of events, incidents, or data points you wish to send to the SignifAI platform. Specific instructions for each sensor are on their respective pages under the “Sensors” tab in the SignifAI dashboard.
What is SignifAI’s “Active Inspector?”
SignifAI’s Active Inspector™ collects information from multiple platforms in a secure way using a platform-specific API. With this method of collection, you are not required to configure collected metric/event types or limits; SignifAI automatically collects the most relevant data points, events, and metrics with the highest value for analysis and actionable information.
What are SignifAI’s “Agent Sensors?”
SignifAI’s collection philosophy is API-driven. We believe it is much easier, and more standard, to integrate with systems over remote APIs. However, sometimes this is not possible, and we still want to support those systems and applications. For these cases, we developed a set of collection and processing capabilities using an agent based on the SNAP Telemetry Framework.
What is SignifAI’s “Control Center?”
The Control Center is the primary interface you will use to monitor, troubleshoot, and analyze SignifAI’s “Issues, Insights and Answers™” feed.
What are Insights and Answers™ inside of SignifAI?
Along with prioritized issues appearing in the Control Center feed, SignifAI automatically populates it with Insights and Answers™ to issues happening in real-time and predictively on daily, weekly and monthly schedules. The insights that SignifAI generates are unique to your environment, not generic algorithms applied to a static set of data. Accompanying every insight are diagnostics and recommendations that inform potential solutions to the issues that have been detected.
Insights are also SAM’s way of helping you improve your environment and become predictive and proactive.
What are SignifAI “Decisions?”
In SignifAI, “decisions” are human-driven actions that configure SAM for your own specific needs. By configuring SAM with your own expert logic, you can create rules that are deterministic and tied to your own environment and conditions. The Decisions section gives you full control over the correlation logic, priorities, thresholds and other algorithms.
What is “alert noise?”
Sometimes referred to as “alert fatigue,” alert noise is something almost all organizations deal with when they employ a variety of tools to monitor thousands of metrics and events. Ask any SRE who has ever been on call how frustrating it is to deal with alerts that weren’t worth acknowledging (false positives) and critical alerts that got overlooked (true positives) because they got lost in the “noise.” Anomaly detection, when applied to alerts, can help aggregate them into “incidents” or higher-order alerts so you end up with less alert volume. And the alerts you do get will likely be ones you need to react to.
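To make the idea of aggregating alerts into “incidents” concrete, here is a minimal sketch. It uses plain time-window grouping per service rather than anomaly detection, and it is not SignifAI’s actual correlation logic. The `Alert` fields, the `group_into_incidents` name and the five-minute window are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Alert:
    timestamp: float  # seconds since some reference point (hypothetical field)
    service: str      # name of the system that raised the alert (hypothetical field)
    message: str


def group_into_incidents(alerts: List[Alert], window_seconds: float = 300.0) -> List[List[Alert]]:
    """Group alerts from the same service that arrive close together in time."""
    incidents: List[List[Alert]] = []
    latest_incident: Dict[str, int] = {}  # service -> index of its most recent incident

    for alert in sorted(alerts, key=lambda a: a.timestamp):
        idx = latest_incident.get(alert.service)
        if idx is not None and alert.timestamp - incidents[idx][-1].timestamp <= window_seconds:
            # Close enough to the previous alert for this service: same incident.
            incidents[idx].append(alert)
        else:
            # Otherwise start a new incident for this service.
            incidents.append([alert])
            latest_incident[alert.service] = len(incidents) - 1
    return incidents
```

With this sketch, four raw alerts can collapse into three incidents, and a burst of a hundred related alerts collapses into one.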
What are “predictive analytics?”
The goal of “predictive analytics” is to analyze historical and recent data in order to identify patterns that can inform predictions about behaviors, outcomes, events or performance in the future.
What is “anomaly detection?”
Anomaly detection, sometimes referred to as “outlier detection,” is the process by which machines attempt to identify outliers that deviate from a “normal” or expected pattern of behavior.
What is “unsupervised anomaly detection?”
In this method of anomaly detection you are dealing with unlabeled data. You are basically asking the machine to decide what doesn’t look “normal” by finding patterns or clusters within the data. For example: We might give the machine a dataset of half a million points that measured the CPU utilization of a system every minute for a year. The machine will point out which data points might be outliers and which look “normal,” but it’ll be up to a human to make the final judgement as to whether or not they are actually outliers. For example, maybe high CPU utilization was expected at certain times because of seasonal load and that particular pattern should be classified as “normal.”
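As a rough illustration of working with unlabeled data, the sketch below flags CPU utilization readings that fall far outside the bulk of the data, using the interquartile-range rule (Tukey’s fences). This is one simple, generic technique chosen for the example; the function name and the sample readings are assumptions, and a human would still need to judge whether each flagged point is a real outlier.

```python
import statistics


def iqr_outlier_indices(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical per-minute CPU utilization readings (percent).
cpu = [20, 22, 21, 23, 22, 21, 95, 20, 22, 21]
suspects = iqr_outlier_indices(cpu)  # candidate outliers for a human to review
```

No labels are supplied anywhere: the rule only looks at how the values cluster.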
At SignifAI, we combine different approaches. First, we use multiple outlier algorithms, which help us detect a variety of anomaly conditions: DBSCAN, Hampel filter, Holt-Winters and ARIMA (X, SA), to name a few. These are used mostly on metric data, but they also help us smooth hard streams before pushing them into the next anomaly tier. The other methods SignifAI uses are mostly based on Support Vector Machine (SVM) type algorithms, as well as Bayesian and probabilistic modeling.
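Of the algorithms named above, the Hampel filter is the easiest to show in a few lines. The sketch below is a textbook version, not SignifAI’s implementation: for each point it takes the median of a sliding window, measures spread with the median absolute deviation (MAD), and replaces any point that sits too far from the window median. The window size and the three-sigma threshold are assumed defaults.

```python
import statistics


def hampel_filter(values, half_window=3, n_sigmas=3.0):
    """Return (cleaned_values, outlier_indices) using a sliding-window Hampel filter."""
    scale = 1.4826  # makes the MAD comparable to a standard deviation for Gaussian data
    cleaned = list(values)
    outlier_indices = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        med = statistics.median(window)
        mad = statistics.median([abs(v - med) for v in window])
        if abs(values[i] - med) > n_sigmas * scale * mad:
            cleaned[i] = med  # smooth the spike by substituting the window median
            outlier_indices.append(i)
    return cleaned, outlier_indices
```

Because outliers are replaced rather than dropped, the output keeps the same length and timing as the input, which is what makes this kind of filter useful for smoothing a stream before further analysis. Note that a perfectly flat window has a MAD of zero, so any deviation in it is flagged; a production version would need to handle that case.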
What is “supervised anomaly detection?”
In this type of anomaly detection you train the machine to spot anomalies by feeding it two sets of data. The first set of data tells the machine what sort of behavior is “good.” The second data set tells the machine what sort of behavior should be considered “bad.” If we revisit the previous example, with supervised anomaly detection, the machine has clear instructions on how to determine what an outlier is. For example, the labeled data set might include the acceptable utilization percentages at any given minute during the year to account for seasonal load.
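A minimal sketch of the two-data-set idea, assuming a nearest-centroid classifier on a single feature (CPU utilization). The classifier choice, the function names and the sample values are all assumptions for illustration; real labeled data would also carry context such as the time of day.

```python
import statistics


def train(good_values, bad_values):
    """Learn one centroid (mean) per label from the two labeled data sets."""
    return {"good": statistics.mean(good_values), "bad": statistics.mean(bad_values)}


def classify(value, centroids):
    """Assign the label whose centroid is closest to the new value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


# Hypothetical labeled CPU utilization readings (percent).
centroids = train(good_values=[20, 25, 30, 35], bad_values=[90, 95, 99])
```

Here `classify(40, centroids)` returns `"good"` and `classify(88, centroids)` returns `"bad"`, because the labels told the model where each kind of behavior sits.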
What is semi-supervised anomaly detection?
In this final method for detecting anomalies, a model of what should be considered “normal” is generated from a dataset and then evaluated against another dataset to see what the likely outliers would be. It’s considered “semi-supervised” because the dataset that is used to supervise the comparison can be thought of as an assumption of what might constitute “bad,” but still requires some degree of human validation.
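The sketch below shows one simple reading of this approach: build a model of “normal” from a data set believed to contain only normal behavior (here just a mean and standard deviation), then score a second data set against it. The Gaussian-style model, the three-sigma cut-off and the sample values are assumptions; the flagged values are only likely outliers and still need human validation.

```python
import statistics


def fit_normal_model(normal_values):
    """Summarize known-normal data as (mean, sample standard deviation)."""
    return statistics.mean(normal_values), statistics.stdev(normal_values)


def likely_outliers(values, model, n_sigmas=3.0):
    """Return the values that sit more than n_sigmas deviations from the normal mean."""
    mean, std = model
    return [v for v in values if abs(v - mean) > n_sigmas * std]


# Hypothetical readings: the first list is assumed to be normal behavior only.
model = fit_normal_model([20, 22, 21, 23, 22, 21, 20, 22])
flagged = likely_outliers([21, 24, 60, 19], model)
```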
What is “artificial intelligence?”
Artificial intelligence is the name given to programs that have been written to solve problems (often very difficult ones) which humans can already solve. The goal of many researchers and programmers in this field is to create programs that can arrive at a problem’s solution autonomously, often without supervision and using methods or logic that might differ from what a human might employ.
What is the difference between “labeled” and “unlabeled data?”
Data is what trains machines to detect and make judgements about what constitutes an anomaly. Here’s a simple example to illustrate the difference:
An unlabeled inventory of assets might just tell us if a system is a “physical server”, “VM”, or “container.” A labeled data set would include more useful information like “location,” “OS,” “CPU,” “RAM,” “package version,” “build number,” etc. From a machine learning perspective, the more labels or tags a piece of data has, the more likely it will be able to produce an accurate insight about unlabeled data it is asked to evaluate. Put another way, the denser and more relevant the data set, the better the training and resulting correlations.
What is the difference between “training”, “validation” and “test” data sets?
“Training,” “validation” and “test” data sets are used by machine learning algorithms and programs as the basis for their learning. The data sets themselves can be characterized in the following ways:
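As a rough sketch of how the three sets are usually produced in practice: one pool of records is shuffled and cut into three disjoint parts. By common convention the training set is used to fit the model, the validation set to tune it, and the test set is held back for a final check. The 70/15/15 proportions, the fixed seed and the function name below are assumptions, not values taken from SignifAI.

```python
import random


def split_dataset(records, train_frac=0.7, validation_frac=0.15, seed=0):
    """Shuffle records reproducibly and split them into train/validation/test lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable

    n_train = round(len(shuffled) * train_frac)
    n_validation = round(len(shuffled) * validation_frac)

    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]  # whatever remains
    return train, validation, test
```

Keeping the three parts disjoint matters: a model evaluated on records it was trained on will look more accurate than it really is.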
What is the difference between “weak” and “strong” AI?
As the name implies, weak AI or “narrow AI,” is focused on solving very narrow problems or use cases. Examples of weak AI include robots on a manufacturing floor or “virtual assistants” like Amazon’s Alexa/Echo or Apple’s Siri, which use voice recognition to retrieve the results of searches or perform basic tasks such as playing music, voicing a calendar reminder or telling you what the weather is in San Francisco. In a nutshell, if the AI cannot learn to perform a task it was not originally programmed to carry out, it is most definitely weak AI.
On the other hand, strong AI can be characterized as AI that has the ability to reason, solve problems, make judgements, strategize, learn new things, interface with humans in a natural way and other traits most commonly thought of as quintessentially “human.”
What is “machine learning?”
Machine learning is the practical application of AI in the form of a set of algorithms or programs. The “learning” aspect relies on training data and time: the more relevant data you feed into the program, the longer it can evaluate it, and the more sophisticated the algorithms it employs, the more the machine can “learn.” An example of machine learning could be a program that is constantly being fed stock market data, making predictions based on algorithms, evaluating those predictions against real world outcomes and then adjusting its data processing in an effort to get closer to resembling an accurate prediction about the future performance of the stock market.
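The predict, compare, adjust loop described above can be sketched with a tiny linear model trained by stochastic gradient descent. This is a generic illustration on made-up numbers, not a stock market predictor; the learning rate, epoch count and function name are assumptions.

```python
def train_online(samples, learning_rate=0.05, epochs=500):
    """Fit y ~ weight * x + bias by repeatedly predicting, measuring error and adjusting."""
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        for x, actual in samples:
            predicted = weight * x + bias          # 1. make a prediction
            error = predicted - actual             # 2. compare with the real outcome
            weight -= learning_rate * error * x    # 3. adjust parameters to shrink the error
            bias -= learning_rate * error
    return weight, bias


# Hypothetical observations generated by the rule y = 2x + 1.
samples = [(x, 2 * x + 1) for x in [0, 1, 2, 3, 4]]
weight, bias = train_online(samples)
```

After enough passes over the data, `weight` and `bias` settle close to 2 and 1: the program has “learned” the rule from examples alone.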
What is “natural language processing?”
In the context of machine learning, “natural language processing” (NLP) is the effort to remove as much friction as possible from the interactions between machines and people. More specifically, NLP attempts to enable machines to make sense of human language so that the interfacing required between computers and people can be improved.
What is “natural language generation?”
“Natural language generation” (NLG) refers to the algorithms programmers use so that machines can produce language (written or spoken) that a human cannot readily identify as machine-generated. Examples of NLG might include automated voice prompts that adapt to verbal cues and chatbots that, when you interact with them, feel like you are talking to an actual customer service representative.
What are “neural networks?”
Artificial neural networks are architected so that they resemble the way brain neurons are connected to each other and process information. Hence the name, “neural network.” Deep learning makes heavy use of neural network design. A more specialized version of a neural network is a “recurrent neural network.” In this type of network, the outputs of the network are fed back into itself, allowing it to use the learning it has achieved so far to more efficiently sort subsequent data.
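The feedback loop in a recurrent network can be sketched with a single scalar cell: at each step the new state depends on the current input and on the state produced at the previous step. The weights below are fixed, hypothetical values (a real network would learn them), and the function name is an assumption.

```python
import math


def run_recurrent_cell(inputs, w_input=0.5, w_hidden=0.8, bias=0.0):
    """Run a one-unit recurrent cell over a sequence and return the state after each step."""
    hidden = 0.0  # the cell starts with no memory
    states = []
    for x in inputs:
        # The previous output (hidden) is fed back in alongside the new input.
        hidden = math.tanh(w_input * x + w_hidden * hidden + bias)
        states.append(hidden)
    return states


states = run_recurrent_cell([1.0, 0.0, 0.0])
```

Even though the second and third inputs are zero, the corresponding states stay positive and fade gradually: the cell is still “remembering” the first input through its fed-back state.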
What is “machine intelligence?”
Machine intelligence is a unifying term for artificial intelligence and machine learning. Machine intelligence offers very real and tangible benefits to DevOps teams by combining statistical algorithms, classification, regression, Bayesian statistical modeling and other machine learning techniques with the power of a true AI model such as an expert system. It is this combination of a solid AI engine, which allows reinforcement from the SRE, with machine learning algorithms and multiple mathematical approaches that is so powerful and relevant to SREs. Simply applying machine learning algorithms to monitoring data and calling that AI is not accurate.
What is “deep learning?”
Deep learning is a very specific genre of AI that relies on neural network design. The “neurons” or more specifically the nodes in these networks are layered in a way that they provide exponential processing power and learning speed over monolithic AI systems. “Neural” is used to describe this architecture because it closely resembles how the brain processes information.
What are the differences between “reinforcement,” “supervised” and “unsupervised learning?”
There are typically three ways in which machines “learn”. These include: