Artificial neural networks are at the heart of modern deep learning algorithms. We describe how to embed and train a general neural network in a quantum annealer without introducing any classical element in training. To implement the network on a state-of-the-art quantum annealer, we develop three crucial ingredients: a binary encoding of the free parameters of the network; a polynomial approximation of the activation function; and a reduction of higher-order binary polynomials to quadratic ones. Together, these ideas allow the loss function to be encoded as an Ising model Hamiltonian. The quantum annealer then trains the network by finding the ground state of this Hamiltonian. We implement this for an elementary network and illustrate the advantages of quantum training: it consistently finds the global minimum of the loss function, and the training converges in a single annealing step, which leads to short training times while maintaining high classification performance. After training the network on a quantum annealer, one can use the resulting weights in a classical network of identical design for inference. Our approach opens an avenue for the quantum training of general machine learning models.
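Two of the three ingredients named above, binary encoding of a parameter and reduction of a higher-order binary term to a quadratic one, can be illustrated with a minimal Python sketch. All function names, the weight range `[-1, 1]`, and the penalty strength are assumptions made for illustration and are not taken from the paper. The reduction shown is the standard Rosenberg penalty, which may differ from the reduction the authors actually use.

```python
from itertools import product

def decode_weight(bits, lo=-1.0, hi=1.0):
    """Map a tuple of binary variables to a real weight in [lo, hi].

    Hypothetical fixed-point encoding: the bits form an unsigned integer
    that is rescaled linearly onto the interval [lo, hi].
    """
    n = len(bits)
    integer = sum(b << k for k, b in enumerate(bits))
    return lo + (hi - lo) * integer / (2**n - 1)

def rosenberg_penalty(x, y, z):
    """Return 0 when z == x*y and a value >= 1 otherwise (x, y, z in {0, 1})."""
    return x * y - 2 * x * z - 2 * y * z + 3 * z

def cubic_term(x, y, w):
    """A third-order binary monomial, which an Ising/QUBO model cannot hold directly."""
    return x * y * w

def quadratized_term(x, y, w, z, strength=2):
    """Quadratic stand-in for x*y*w: the auxiliary bit z replaces the product x*y.

    The penalty strength must exceed the magnitude of the coefficient of the
    replaced term (here 1), so that violating z == x*y never lowers the energy.
    """
    return z * w + strength * rosenberg_penalty(x, y, z)

def minimum_over_auxiliary(x, y, w, strength=2):
    """Energy seen by a minimizer that is free to choose the auxiliary bit."""
    return min(quadratized_term(x, y, w, z, strength) for z in (0, 1))

# Check that minimizing over the auxiliary bit reproduces the cubic term exactly.
for x, y, w in product((0, 1), repeat=3):
    assert minimum_over_auxiliary(x, y, w) == cubic_term(x, y, w)
```

In this sketch every quadratized term costs one extra binary variable, so the number of qubits needed grows with both the encoding precision and the polynomial degree of the loss.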