DeepBind Based Analysis

Introduction to DeepBind

DeepBind is a deep learning method for predicting the sequence specificities of DNA- and RNA-binding proteins. It learns from noisy experimental data which DNA and RNA sequences a protein binds. Given a new sequence, it then calculates how likely the protein is to bind it. Given a sequence mutation, it can also assess whether binding changes. This matters because mutations, insertions, or deletions in protein-binding sites can alter gene expression patterns and cause disease.

Artificial Intelligence in DeepBind

Deep learning is a branch of machine learning that builds multi-layer neural networks, loosely inspired by the human brain, to interpret data such as images, sound, and text. The convolutional neural network (CNN) is currently one of the most widely used deep learning architectures. Its feature extractor is composed of convolutional layers and pooling layers, and it is especially popular in computer vision. DeepBind is based on deep CNNs, which offer a scalable, flexible, and unified computational approach to pattern discovery.

Analysis Process

Fig 1. Details of the inner workings of DeepBind and its training procedure (Alipanahi, B, et al. 2015).

Analysis Principle

For a sequence s, DeepBind computes the binding score f(s) in four stages:

  • Convolution: scans a set of motif detectors across the sequence.
  • Rectification: isolates positions with a good pattern match by shifting the response of detector Mk by bk and clamping all negative values to zero.
  • Pooling: computes the maximum and average of each motif detector's rectified response across the sequence.
  • Neural network: feeds these values into a nonlinear neural network with weights W, which combines the responses to produce a score.
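
The four stages can be sketched in NumPy as follows. This is a minimal illustration, not the published implementation. The function names (one_hot, conv, rect, pool, net, binding_score), the (m, 4) orientation of each motif detector, the single rectified hidden layer, and the absence of sequence padding are all assumptions made for brevity. The weights in the demo are hand-picked rather than trained.

```python
import numpy as np

ALPHABET = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (len(seq), 4) array; unknown bases get 0.25 each."""
    x = np.full((len(seq), 4), 0.25)
    for i, base in enumerate(seq.upper()):
        if base in ALPHABET:
            x[i] = 0.0
            x[i, ALPHABET.index(base)] = 1.0
    return x

def conv(x, M):
    """Stage 1: slide every motif detector along the sequence.

    x has shape (n, 4); M has shape (d, m, 4) for d detectors of length m.
    Returns responses of shape (n - m + 1, d). Requires n >= m.
    """
    d, m, _ = M.shape
    n_pos = x.shape[0] - m + 1
    windows = np.stack([x[i:i + m] for i in range(n_pos)])  # (n_pos, m, 4)
    return np.einsum("nmj,kmj->nk", windows, M)

def rect(X, b):
    """Stage 2: shift detector k's response by b[k], clamp negatives to zero."""
    return np.maximum(X - b, 0.0)

def pool(Y):
    """Stage 3: maximum and average of each detector's rectified response."""
    return np.concatenate([Y.max(axis=0), Y.mean(axis=0)])

def net(z, W_hidden, w_out):
    """Stage 4: small nonlinear network (one rectified hidden layer, assumed)."""
    h = np.maximum(W_hidden @ z, 0.0)
    return float(w_out @ h)

def binding_score(seq, M, b, W_hidden, w_out):
    """f(s): compose the four stages in order."""
    return net(pool(rect(conv(one_hot(seq), M), b)), W_hidden, w_out)

# Demo with one hand-built detector that matches "ACG" (illustrative weights).
M = one_hot("ACG")[None]   # shape (1, 3, 4)
b = np.array([2.0])        # a window must match more than 2 of 3 bases to pass
W_hidden = np.eye(2)       # pooled vector has 2 entries: [max, mean]
w_out = np.ones(2)
print(binding_score("TTACGTT", M, b, W_hidden, w_out))
```

In a trained model, the detectors, thresholds, and network weights would all be learned from data. Pooling both the maximum and the average lets the final network see the strongest single match as well as the accumulated evidence along the sequence.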

Advantages of DeepBind

Learning models of sequence specificity from modern high-throughput sequencing technologies poses several challenges, and DeepBind addresses most of them.

  • It can learn from millions of sequences obtained from high-throughput experiments through a parallel implementation on a graphics processing unit (GPU).
  • It can be applied to both microarray data and high-throughput sequencing data.
  • It generalizes well across technologies, even without correcting for technology-specific biases.
  • It can tolerate moderate noise and mislabeled training data.
  • It can train predictive models fully automatically, removing the need for careful and time-consuming manual tuning.
  • A trained model can be applied and visualized in ways familiar to users of position weight matrices (PWMs).

Application Fields

Because DeepBind is based on a deep convolutional neural network, it can discover new patterns even when the position of a pattern within a sequence is unknown, a task for which traditional neural networks require an exorbitant amount of training data. DeepBind can be applied to the following analyses:

  • Analyzing DNA sequence specificities.
  • Analyzing RNA sequence specificities.
  • Identifying binding sequences both in vitro and in vivo.
  • Identifying and visualizing deleterious genomic variants.
  • Ascertaining splicing patterns.
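
Identifying deleterious variants comes down to comparing the model's score for a reference sequence with its score for each mutated sequence. The sketch below illustrates the idea under stated assumptions: toy_score is a hypothetical stand-in for a trained model, mutation_map is an illustrative name, and the plain score difference used here may differ from the weighting used in published DeepBind mutation maps.

```python
import numpy as np

ALPHABET = "ACGT"

def toy_score(seq, motif="TGACGT"):
    """Hypothetical scorer: best fraction of `motif` matched at any offset.

    Stands in for a trained binding model. Expects an uppercase sequence
    at least as long as the motif.
    """
    m = len(motif)
    return max(
        sum(a == b for a, b in zip(seq[i:i + m], motif)) / m
        for i in range(len(seq) - m + 1)
    )

def mutation_map(seq, score_fn):
    """Score change for every single-base substitution.

    Returns a (4, len(seq)) array: row j, column i holds
    score(seq with position i replaced by ALPHABET[j]) - score(seq).
    Entries for the reference base stay at zero.
    """
    base = score_fn(seq)
    delta = np.zeros((4, len(seq)))
    for i in range(len(seq)):
        for j, nt in enumerate(ALPHABET):
            if nt != seq[i]:
                mutant = seq[:i] + nt + seq[i + 1:]
                delta[j, i] = score_fn(mutant) - base
    return delta

# Demo: the motif sits at positions 2..7 of this sequence.
delta = mutation_map("AATGACGTAA", toy_score)
print(np.round(delta, 2))
```

Strongly negative entries mark substitutions predicted to weaken binding, while entries near zero mark positions the scorer does not depend on. Plotting the array as a heat map gives a simple visualization of which variants are likely to be damaging.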

CD ComputaBio provides sequence specificity analysis based on the DeepBind method. DeepBind adapts deep convolutional neural networks (CNNs) to the task of predicting sequence specificities and was found to compete favorably with the state of the art. Depending on customer needs, we can also perform sequence specificity analysis with other software or predictive models, including some of the most cutting-edge analysis models. Protheragen provides one-stop data analysis services: you only need to upload raw sequencing data, and we will analyze it with the DeepBind method and generate a complete analysis report for you. If you have any questions about DeepBind analysis, please feel free to contact us for details. We will provide you with satisfactory data analysis services.

References

  • Alipanahi, B, et al. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning [J]. Nature Biotechnology. 2015. 33: 831–838.
  • Zeng, H, et al. Convolutional neural network architectures for predicting DNA–protein binding [J]. Bioinformatics. 2016. 32(12): i121–i127.
