We study the inverse problem of reconstructing spectral functions from Euclidean correlation functions via machine learning. We propose a novel neural network, SVAE, which is based on the variational autoencoder (VAE) and can be naturally applied to the inverse problem. The prominent feature of the SVAE is that a Shannon-Jaynes entropy term having the ground truth values of spectral functions as prior information is included in the loss function to be minimized. We train the network with general spectral functions produced from a Gaussian mixture model. As a test, we use correlators generated from four different types of physically motivated spectral functions made of one resonance peak, a continuum term and perturbative spectral function obtained using non-relativistic QCD. From the mock data test we find that the SVAE in most cases is comparable to the maximum entropy method (MEM) in the quality of reconstructing spectral functions and even outperforms the MEM in the case where the spectral function has sharp peaks with insufficient number of data points in the correlator. By applying to temporal correlation functions of charmonium in the pseudoscalar channel obtained in the quenched lattice QCD at 0.75 $T_c$ on $128^3\times96$ lattices and $1.5$ $T_c$ on $128^3\times48$ lattices, we find that the resonance peak of $\eta_c$ extracted from both the SVAE and MEM has a substantial dependence on the number of points in the temporal direction ($N_\tau$) adopted in the lattice simulation and $N_\tau$ larger than 48 is needed to resolve the fate of $\eta_c$ at 1.5 $T_c$.