Abstract: Advances in differentiable numerical integrators have enabled the use of gradient descent techniques to learn ordinary differential equations (ODEs). In machine learning, differentiable solvers are central to Neural ODEs (NODEs), a class of deep learning models with continuous depth rather than discrete layers. However, these integrators can be unsatisfactorily slow and inaccurate when learning systems of ODEs from long sequences, or when the solutions of the system vary on widely different timescales across dimensions. In this paper, we propose an alternative approach to learning ODEs from data: we represent the underlying ODE as a vector field that is related to a base vector field by a differentiable bijection, modelled by an invertible neural network. By restricting the base ODE to one that is amenable to integration, we can drastically speed up integration and improve its robustness. We demonstrate the efficacy of our method in training and evaluating continuous neural network models, as well as in learning benchmark ODE systems. We observe improvements of up to two orders of magnitude when integrating learned ODEs with GPU computation.
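
The idea of relating a target ODE to an easily integrated base ODE through a bijection can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the base ODE is a diagonal linear system with hypothetical rates `RATES`, and the elementwise `sinh` map stands in for a trained invertible neural network. With these choices, a solution in data coordinates is obtained by mapping the initial state to base coordinates, applying the closed-form base solution, and mapping back, with no numerical time stepping.

```python
import math

# Assumed base ODE (illustrative): dz_i/dt = a_i * z_i, a diagonal linear system
# whose solution is known in closed form: z_i(t) = z_i(0) * exp(a_i * t).
RATES = (-1.0, -0.01)  # hypothetical rates with widely different timescales


def forward(z):
    """Bijection F from base to data coordinates (stand-in for an invertible network)."""
    return [math.sinh(zi) for zi in z]


def inverse(x):
    """Inverse bijection F^{-1} from data to base coordinates."""
    return [math.asinh(xi) for xi in x]


def solve_base(z0, t):
    """Closed-form solution of the base ODE at time t."""
    return [zi * math.exp(a * t) for zi, a in zip(z0, RATES)]


def solve(x0, t):
    """Solution in data coordinates: x(t) = F(base_flow_t(F^{-1}(x0)))."""
    return forward(solve_base(inverse(x0), t))


def vector_field(x):
    """Induced vector field dx/dt = J_F(z) * (a * z), with z = F^{-1}(x).

    For the elementwise sinh map the Jacobian is diagonal with entries cosh(z_i).
    """
    z = inverse(x)
    return [math.cosh(zi) * a * zi for zi, a in zip(z, RATES)]
```

Calling `solve([2.0, -0.5], 3.0)` evaluates the state at time 3 directly, at a cost that does not depend on the length of the time interval. In a learned setting, the map and the base dynamics would be trainable rather than fixed as they are in this sketch.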