%% ****** Start of file apstemplate.tex ****** %
%%
%%
%% This file is part of the APS files in the REVTeX 4 distribution.
%% Version 4.1r of REVTeX, August 2010
%%
%%
%% Copyright (c) 2001, 2009, 2010 The American Physical Society.
%%
%% See the REVTeX 4 README file for restrictions and more information.
%%
\documentclass[aps, prl, twocolumn, groupedaddress, 10pt]{revtex4-1}
\usepackage{graphicx}
\usepackage{color}
\usepackage{amsfonts, amsmath, amsthm, amscd}
%\usepackage{cite}
%\usepackage{eucal}
%\usepackage{a4}
%\usepackage{booktabs}
%\usepackage{microtype}
%\usepackage{hyperref}
%\usepackage{times}
%\usepackage{showkeys}
\def\tcr#1{\textcolor{red}{#1}}
\def\epp{\: .}
\def\epc{\: ,}
\def\lam{\lambda}
\def\blam{{\boldsymbol{\lambda}}}
\def\brho{{\boldsymbol{\rho}}}
\def\limth{\lim\nolimits_\text{th}}
\def\Drho{\mathcal{D}[\rho]}
\def\cO{\mathcal{O}}
\newcommand{\ftk}[2]{\text{FT}\big[#1\big](#2)}
\newcommand{\ft}[1]{\text{FT}\big[#1\big]}
\newcommand{\iftx}[2]{\text{FT}^{-1}\big[#1\big](#2)}
\newcommand{\ift}[1]{\text{FT}^{-1}\big[#1\big]}
\newcommand*{\red}{\textcolor{red}}
\begin{document}
\clearpage
\setcounter{equation}{0}
\renewcommand{\theequation}{S\arabic{equation}}
\setcounter{page}{1}
\onecolumngrid
\begin{center}
{\Large Supplementary Material for EPAPS\\
Quenching the Anisotropic Heisenberg Chain: Exact Solution \\ and Generalized Gibbs Ensemble}
\end{center}
\setcounter{page}{1}
\section*{Thermodynamic limit of the overlaps}
In order to apply the quench action logic we need to compute the thermodynamic limit of the overlap coefficients and in particular their leading extensive part
\begin{equation}
S[\boldsymbol{\rho}]= \limth S_{\boldsymbol{\lambda}} = - \limth \ln \frac{\langle \Psi_0 | \{\pm\lambda_j\}_{j=1}^{M/2} \rangle }{\|\{\pm\lambda_j\}_{j=1}^{M/2}\|} \epp
\end{equation}
The procedure is to consider the overlap coefficient for a generic finite size Bethe state $| \{ \lambda_j\}_{j=1}^M \rangle$ that in the thermodynamic limit, $N \to \infty$ with $ M/N=1/2$ fixed, flows to a set of distributions $| \{ \lambda_j\}_{j=1}^M \rangle \to | \boldsymbol{\rho} \rangle $. Equivalently, in the thermodynamic limit the eigenvalue of a smooth diagonal observable $\mathcal{A}$ can be recast into a sum of integrals weighted by the distributions $\boldsymbol{\rho} = \{ \rho_n\}_{n=1}^\infty$:
\begin{equation}\label{eqn:therm_observable}
\mathcal{A} | \{ \lambda_j\}_{j=1}^M \rangle =\Big[\sum_{j=1}^M A_j \Big]| \{ \lambda_j\}_{j=1}^M \rangle \to \Big[N \sum_{n=1}^\infty \int_{- \pi/2}^{\pi/2} d\lambda \:\: \rho_n(\lambda) \tilde{A}_n(\lambda) \Big] | \boldsymbol{\rho} \rangle \epp
\end{equation}
By assumption the extensive part of the overlap coefficient $S[\boldsymbol{\rho}]$ is smooth and does not depend on finite size differences within the set of Bethe states that scale to the same densities $\brho$. The number of Bethe states for each set of distributions $\boldsymbol{\rho}$ is given by the extensive Yang-Yang entropy \cite{KorepinBOOK}: $e^{S_{YY}[\boldsymbol{\rho}]}$. We are then free to select a representative finite size Bethe state out of all states that scale to the same $\brho$. We consider the state $| \{ \lambda_j\}_{j=1}^M \rangle $ consisting of $2n_s$ strings such that $2n_s = \sum_{n=1}^\infty M_n$, where $M_n$ is the number of $n$-strings in the state and we choose all $M_n$ to be even. Note that this is one possible choice for a representative state. Different choices regarding the eveness of the fillings $\{M_n \}_{n=1}^\infty$ lead to different expressions for the exact overlap formula \eqref{eqn:Overlap_Exact}, but are believed \cite{2013_Caux_PRL_110,FollowUp} to have the same extensive smooth part $S[\boldsymbol{\rho}]$.
For any finite size $N$, Bethe states are organized in deviated strings. We use the following notation to label the rapidities of such states:
\begin{equation}
\lambda_j \to \lam^{n,a}_\alpha = \lambda^{n}_\alpha + \frac{i\eta}{2}(n+1 -2 a) + i\delta^{n, a}_\alpha \epc
\end{equation}
where $a= 1, \ldots n$ and $\alpha= 1, \ldots , M_n$. The string deviations $\delta^{n, a}_\alpha$ are vanishing in the thermodynamic limit. Though the string hypothesis is not systematically verified around the ground state of the zero-magnetized spin chain \cite{1982_Woynarovich_JPA_15,1983_Babelon_NPB_220_1}, it is effectively verified away from the ground state, for example at finite temperatures \cite{1983_Tsvelik_AP_32}, and by extension in the circumstances discussed here.
The finite size overlap formula between this class of representative states and the N\'eel state is given in Ref.~\cite{2014_Brockmann_JPA},
\begin{equation}\label{eqn:Overlap_Exact}
\frac{\langle \Psi_0 | \{\pm\lambda_j\}_{j=1}^{M/2} \rangle }{\|\{\pm\lambda_j\}_{j=1}^{M/2}\|}= \gamma \times \sqrt{ \frac{\det_{M/2}(G_{jk}^{+})}{\det_{M/2}(G_{jk}^{-})}} \qquad \text{with} \quad \gamma = \prod_{j=1}^{M/2}\frac{\sqrt{\tan(\lambda_j+i\eta/2) \tan(\lambda_j-i\eta/2)}}{2\sin(2\lambda_j)} \epp
\end{equation}
The prefactor $\gamma$ has to leading order no explicit system size dependence from the string deviations $\delta \to 0$, but is exponentially vanishing when the particle number $M$ is sent to infinity due to the product over all rapidities.
We focus on the ratio of the two determinants, where the matrices are given by
\begin{equation}
G^\pm_{(n,\alpha,a),(m,\beta,b)} = \delta_{(n,\alpha,a),(m,\beta,b)}\left(NK_{\eta/2}(\lambda^{n,a}_\alpha)-\sum_{(\ell,\gamma,c)}K_\eta^+(\lambda^{n,a}_\alpha, \lambda^{\ell,c}_\gamma)\right) + K_\eta^\pm(\lambda^{n,a}_\alpha ,\lambda^{m,b}_\beta) \epp
\end{equation}
Here, $K_\eta^\pm(\lambda,\mu)=K_\eta(\lambda-\mu) \pm K_\eta(\lambda+\mu)$ and $K_\eta(\lambda)=\frac{\sinh(2\eta)}{\sin(\lambda+i\eta)\sin(\lambda-i\eta)}$.
Divergences in system size as $1/\delta$ occur in each string block $(n=m,\alpha= \beta)$ when $b= a+1$ in the term $K_\eta(\lambda^{n,a}_\alpha - \lambda^{n,a+1}_\alpha) \sim \frac{i}{( \delta^{n, a+1}_\alpha - \delta^{n, a}_\alpha )}$. On the other hand, for our representative state the terms $\pm K_\eta(\lambda+\mu)$ in $G^\pm$ are never divergent since all string centers are strictly positive. We conclude that the divergences in $1/\delta$ in $\det_{M/2}(G^+)$ cancel exactly the divergences in $\det_{M/2}(G^-)$, as they occur in exactly the same form. The same cancellation applies to divergences appearing in $K_\eta(\lambda- \mu)$ when two rapidities from different strings get close in the thermodynamic limit $\mu \to \lambda \pm i \eta + g(N)$ with $\limth g(N) = 0$. We are then able to take the thermodynamic limit $\limth$ for the overlap coefficients analogously to Ref.~\cite{2014_DeNardis_PRA_89}. Being non-exponential in system size, the ratio of the two determinants can then be neglected in the contribution to the overlaps that is leading in system size. The thermodynamic overlap coefficients then read as
\begin{equation}
S[\brho] = \limth S_{\boldsymbol{\lambda}} = - \frac{N}{2} \sum_{n=1}^\infty \int_{0}^{\pi/2} \mathrm{d}\lam \, \rho_n(\lam) \ln W_n(\lam) \epc
\end{equation}
where
\begin{equation}
W_n (\lam) = \frac{1}{2^{n+1} \sin^{2}2\lam } \frac{ \cosh n \eta - \cos 2\lam }{\cosh n \eta + \cos 2\lam } \prod_{j=1}^{\frac{n-1}{2}} \left( \frac{ \cosh (2j-1) \eta - \cos 2\lam }{(\cosh (2j-1) \eta + \cos 2\lam) (\cosh 4\eta j - \cos 4\lam )} \right)^{2} \epc
\end{equation}
if $n$ odd, and
\begin{equation}
W_n (\lam) = \frac{ \tan^{2}\lam}{2^{n} } \frac{ \cosh n \eta - \cos 2\lam }{\cosh n \eta + \cos 2\lam } \frac{1}{\prod_{j=1}^{\frac{n}{2}} \left(\cosh 2(2j-1)\eta - \cos 4\lam \right)^{2} } \prod_{j=1}^{\frac{n-2}{2}} \left( \frac{\cosh 2j\eta - \cos 2\lam }{\cosh 2j\eta + \cos 2\lam }\right)^{2} \epc
\end{equation}
if $n$ even.
\section*{GTBA equations for the N\'eel-to-XXZ quench}
In this section we focus on the derivation of the saddle point state, specified by the set of distribution $\brho^\text{sp}$ obtained by varying the quench action $S_{QA}[ \brho] = 2 S[\brho] -\tfrac{1}{2} S_{YY}[\brho]$ with respect to each root density $\rho_n(\lam).$ Variation of the Yang-Yang entropy is well-known \cite{KorepinBOOK}. It should be noted that in front of the Yang-Yang entropy there is an extra factor $\tfrac12$. The reason is that only parity-invariant Bethe states contribute and therefore the number of microstates in the ensemble $\brho$ is the square root of the usual number of microstates. Furthermore, only states in the magnetization sector $M=N/2$ have non-zero overlap with the initial N\'eel state. In order to vary with respect to all $\rho_n(\lam)$ independently, we need to add a Lagrange multiplier term to the quench action:
\begin{equation}
-h\,N \left( \sum_{m=1}^{\infty} m \, \int_{-\pi/2}^{\pi/2} \mathrm{d}\lam \, \rho_{m}(\lam) - \frac{1}{2} \right) \epc
\end{equation}
where $h$ is the Lagrange multiplier. Variation with respect to $\rho_n(\lam)$ then leads to the saddle point conditions
\begin{equation} \label{eq:TBA_XXZ}
\ln \eta_{n}(\lam) = - 2\,h\,n - \ln W_{n}(\lam) + \sum_{m=1}^{\infty} a_{nm} \ast \ln \left( 1 + \eta_{m}^{-1} \right) (\lam) \epc
\end{equation}
where $n\geq1$, $\eta_n(\lam) = \rho_{n,h}(\lam)/\rho_n(\lam)$ and the convolution $\ast$ is defined by
\begin{equation}\label{eq:convolution}
(f \ast g)(\lambda) = \int_{-\pi/2}^{\pi/2} d\mu \: f(\lambda - \mu) g(\mu)
\end{equation}
The kernels are defined by $a_{nm}(\lam) = (1-\delta_{nm})a_{|n-m|}(\lam) + 2 a_{|n-m| + 2}(\lam) + \ldots + 2 a_{n+m-2}(\lam) + a_{n+m}(\lam)$, where
\begin{equation}\label{eq:an}
a_{n}(\lam) = \frac{1}{\pi} \frac{\sinh(n\eta)}{\cosh(n\eta) - \cos(2\lam)} \epp
\end{equation}
The functions $-2hn-\ln(W_n)$ are called driving terms. For each fixed value of $h$ the set \eqref{eq:TBA_XXZ} of Generalized Thermodynamic Bethe Ansatz (GTBA) equations has a solution in terms of the functions $\eta_n$. Substituting this solution into thermodynamic Bethe equations leads to the saddle point distributions $\brho^\text{sp}$. The parameter $h$ is fixed by the magnetization condition $M/N=1/2$ of the initial N\'eel state,
\begin{equation}
\sum_{m=1}^{\infty} m \, \int_{-\pi/2}^{\pi/2} \mathrm{d}\lam \, \rho_{m}^\text{sp}(\lam) = \frac{1}{2} \epp
\end{equation}
As for the TBA equations at finite temperature \cite{TakahashiBOOK}, one can recast the GTBA Eqs.~\eqref{eq:TBA_XXZ} into a factorized form where there is no infinite sum over string lengths. We will now derive this result. We use the Fourier transform conventions
\begin{align}\label{eq:FourierTransform}
\hat{f}_k &= \ftk{f}{k} = \int_{-\pi/2}^{\pi/2}e^{2ik\lam}f(\lam)\,d\lam\epc\quad k\in\mathbb{Z}\epc \\
f(\lam) &= \iftx{\hat{f}}{\lam} = \frac{1}{\pi}\sum_{k\in\mathbb{Z}} e^{-2ik\lam}\hat{f}_k\epp
\end{align}
The Fourier transforms of the kernels are $\hat{a}_{n,k} =e^{-|k|n\eta}$, and using the convolution theorem this implies $a_m \ast a_n = a_{m+n}$. A set of identities for the kernels can then be derived \cite{TakahashiBOOK}
\begin{subequations} \label{eq:kernelidentities}
\begin{equation}
(a_0+a_2)\ast a_{nm} = a_1\ast (a_{n-1,m}+a_{n+1,m}) + (\delta_{n-1,m}+\delta_{n+1,m})\,a_1\epc \qquad n\geq 2,\, m \geq 1 \epc
\end{equation}
and
\begin{equation} \label{eq:2ndkernelidentity}
(a_0+a_2)\ast a_{1,m} = a_1 \ast a_{2,m} + a_1 \, \delta_{2,m} \epc \qquad m\geq 1 \epc
\end{equation}
\end{subequations}
where we used the convention $a_0(\lam) = \delta(\lam)$. Convolving the GTBA Eqs.~\eqref{eq:TBA_XXZ} with $(a_0 + a_2)$, the infinite sum can be removed, and we find that
\begin{equation}\label{eq:gTBA_fact_prev}
(a_0+a_2)\ast\ln(\eta_n) = (a_0+a_2)\ast g_n - a_1\ast(g_{n-1}+g_{n+1}) + a_1 \ast \big[\ln(1+\eta_{n-1})+\ln(1+\eta_{n+1})\big] \epp
\end{equation}
Here, the driving terms of the original GTBA equations are rewritten in a more convenient form with $g_n(\lam) = - \ln W_n(\lam) - 2n \ln 4$, where
\begin{subequations}\label{eq:gTBA}
\begin{align}\label{eq:gTBAa}
g_n &= \sum_{l=0}^{n-1} \ln\left[\frac{s_{n-1-2l}c_{n-1-2l}s_{-n+1+2l}c_{-n+1+2l}}{t_{n-2l}t_{-n+2l}}\right]\epc \\
t_n &= \frac{s_n}{c_n}\epc\quad s_n(\lam) = \sin\left(\lam+\frac{i\eta n}{2}\right)\epc\quad c_n(\lam) = \cos\left(\lam+\frac{i\eta n}{2}\right)\epp
\end{align}
\end{subequations}
Defining $g_0 (\lam) = 0$ and $\eta_0 (\lam) = 0$, Eq.~\eqref{eq:gTBA_fact_prev} holds for $n\geq1$. Let us rewrite the new driving terms $\tilde{d}_n = (a_0+a_2)\ast g_n - a_1\ast(g_{n-1}+g_{n+1})$ of Eq.~\eqref{eq:gTBA_fact_prev}. We first rewrite $g_n$ such that only positive indices are present:
\begin{align}
g_n = 2\delta_{n\,\text{mod}\,2,1}\ln\left[s_0^{(2)}\right] + 4 \sum_{l=1}^{\lfloor n/2 \rfloor}\ln\left[s_{n+1-2l}^{(2)}\right] + 2 \sum_{l=1}^{n-1}\ln\left[\frac{c_{l}^{(2)}}{s_{l}^{(2)}}\right] + \ln\left[\frac{c_{0}^{(2)}}{s_{0}^{(2)}}\right]
+ \ln\left[\frac{c_{n}^{(2)}}{s_{n}^{(2)}}\right] \epc
\end{align}
where $s_l^{(2)} = s_ls_{-l}$, $c_l^{(2)} = c_lc_{-l}$ or, explicitly,
\begin{subequations}
\begin{align}
s_l^{(2)}(\lam) &= \sin\left(\lam+\frac{i\eta l}{2}\right)\sin\left(\lam-\frac{i\eta l}{2}\right) = \sin^2\left(\lam\right)+\sinh^2\left(\frac{\eta l}{2}\right)\epc\\
c_l^{(2)}(\lam) &= \cos\left(\lam+\frac{i\eta l}{2}\right)\cos\left(\lam-\frac{i\eta l}{2}\right) = \cos^2\left(\lam\right)+\sinh^2\left(\frac{\eta l}{2}\right)\epp
\end{align}
\end{subequations}
Now we use that for $\tilde{a}_\alpha(\lam) = \frac{1}{2\pi}\frac{\sinh(2\alpha)}{\sin^2(\lam)+\sinh^2(\alpha)}$ and $f_\beta(\lam) = \ln\left[\sin^2(\lam)+\sinh^2(\beta)\right]$ the following relation holds ($\alpha,\beta >0$):
\begin{equation}
\tilde{a}_{\alpha}\ast f_\beta = f_{\alpha+\beta}-2\alpha\epp
\end{equation}
This implies the identities
\begin{equation}
a_m\ast \ln\left[\frac{c_l^{(2)}}{s_l^{(2)}}\right] = \ln\left[\frac{c_{l+m}^{(2)}}{s_{l+m}^{(2)}}\right] \epc \quad a_l\ast \ln\left[\frac{s_{0}^{(2)}}{s_{2}^{(2)}}\right] = \ln\left[\frac{s_{l}^{(2)}}{s_{l+2}^{(2)}}\right] \quad \text{and}\quad a_l\ast \ln\left[\frac{c_{0}^{(2)}}{c_{2}^{(2)}}\right] = \ln\left[\frac{c_{l}^{(2)}}{c_{l+2}^{(2)}}\right] \epp
\end{equation}
From this we can calculate $\tilde{d}_{2n}$ and $\tilde{d}_{2n-1}$ for all $n\geq 1$ explicitly:
\begin{equation}
\tilde{d}_{2n} = \ln\left[\frac{c_0^{(2)}}{c_2^{(2)}}\right] - \ln\left[\frac{s_0^{(2)}}{s_2^{(2)}}\right] \epc\qquad
\tilde{d}_{2n-1} =\ln\left[\frac{c_0^{(2)}}{c_2^{(2)}}\right] + \ln\left[\frac{s_0^{(2)}}{s_2^{(2)}}\right] \epp
\end{equation}
The GTBA equations can be written compactly as
\begin{subequations}
\begin{equation}\label{eq:TBA_XXZ_fact0}
(a_0+a_2)\ast\ln(\eta_n) = \tilde{d}_n + a_1 \ast \big[\ln(1+\eta_{n-1})+\ln(1+\eta_{n+1})\big]\epc
\end{equation}
where $n\geq1$, the $\lam$-dependence is left implicit and by convention $\eta_0(\lam)= 0$ and $a_0(\lam)=\delta(\lam)$. The driving terms are given by
\begin{equation}
\tilde{d}_{n}(\lam) = \ln\left[\frac{\cos^2(\lam)}{\cos^2(\lam)+\sinh^2(\eta)}\right] - (-1)^n \ln\left[\frac{\sin^2(\lam)}{\sin^2(\lam)+\sinh^2(\eta)}\right]\epp
\end{equation}
\end{subequations}
Using the convolution theorem once again, one can invert the operation of $(a_0+a_2)\ast$ and bring it to the right hand side of Eq.~\eqref{eq:TBA_XXZ_fact0}. The Fourier transforms of the driving terms are
\begin{equation}
\hat{\tilde{d}}_{n,k} = 2\pi \frac{(1-e^{-2\eta|k|})}{|k|}\left(\frac{(-1)^n-(-1)^k}{2}\right) \epp
\end{equation}
Defining
\begin{align}
\hat{d}_{n,k} &:= \frac{\hat{\tilde{d}}_{n,k} }{\hat{a}_{0,k}+\hat{a}_{2,k}} = 2\pi \frac{\tanh(\eta k)}{k}\left(\frac{(-1)^n-(-1)^k}{2}\right)\epc \nonumber \\
\hat{s}_k & := \frac{\hat{a}_{1,k}}{\hat{a}_{0,k}+\hat{a}_{2,k}} = \frac{1}{2\cosh(k\eta)} \epc
\end{align}
the GTBA equations in Fourier space are
\begin{equation}
\ft{\ln(\eta_n)}(k) = \hat{d}_{n,k} + \hat{s}_k\Big( \ft{\ln(1+\eta_{n-1})}(k) + \ft{\ln(1+\eta_{n+1})}(k) \Big) \epp
\end{equation}
In real space they are precisely the GTBA equations of the main text.
\section*{Nearest-neighbor spin-spin correlation}\label{sec:nn-correlator}
In this section we show how to compute the expectation value of the nearest-neighbor spin-spin correlator, $\langle\sigma_j^z \sigma_{j+1}^z\rangle = 1 + \frac{4}{N J} \left\langle \frac{\partial H}{\partial \Delta} \right\rangle$, in the thermodynamic limit on a generic, translationally invariant Bethe eigenstate specified by a set of smooth distributions $\boldsymbol{\rho}$. This is a direct consequence of the Hellmann-Feynman theorem \cite{1939_Feynman}. Independently, an analogous implementation was recently presented in Ref.~\cite{2014_Mestyan}.
A naive application of the Hellmann-Feynman theorem on the saddle point state leads to a wrong result
\begin{equation}
\langle \boldsymbol{\rho}^\text{sp} |\frac{1}{N}\frac{\partial H }{\partial \Delta} | \boldsymbol{\rho}^\text{sp} \rangle \neq \frac{\partial }{\partial \Delta} \langle \boldsymbol{\rho}^\text{sp} | \frac{1}{N} H | \boldsymbol{\rho}^\text{sp} \rangle = -\frac{J}{2} \epp
\end{equation}
This is because the overlaps of two Bethe states with different values of $\Delta$ are exponentially small in system size, which makes the thermodynamic limit and the derivative with respect to $\Delta$ noncommuting,
\begin{equation}
\lim_{\delta \to 0} \limth \langle \blam(\Delta) | \blam(\Delta + \delta) \rangle \neq \limth \lim_{\delta \to 0} \langle \blam(\Delta) | \blam(\Delta + \delta) \rangle \epp
\end{equation}
In order to apply the Hellmann-Feynman theorem we need to take the derivative of the energy eigenvalue of a generic Bethe state at finite size $N$ and then take the thermodynamic limit. Under the string hypothesis the energy eigenvalue $\omega_\blam$ becomes a function only of the string centers $ \lambda_\alpha^n$ of the Bethe state, since the string deviations vanish exponentially in system size,
\begin{equation}\label{eq:energy}
\omega_\blam = - \pi J \sinh (\eta) \sum_{n , \alpha} \: a_{n}(\lambda_{\alpha}^n) \epc
\end{equation}
where $a_n$ is defined in Eq.~\eqref{eq:an}. We can now apply the Hellmann-Feynman theorem to this finite size state by taking the derivative of the energy with respect to $\Delta$:
\begin{equation}\label{eq:corr1}
\langle \blam | \sigma^z_1 \sigma^z_{2} | \blam \rangle = 1+ \frac{4}{NJ} \frac{d\omega_\blam}{d \Delta} =1+ \frac{4}{NJ} \frac{1}{\sinh \eta}\frac{d\omega_\blam}{d \eta} = 1 - \frac{4\pi}{N} \sum_{n , \alpha} \left[ \frac{\cosh \eta}{\sinh \eta} a_{n}(\lambda_{\alpha}^n) + (\partial_{\eta} a_{n})(\lambda_{\alpha}^n) + \partial_{\lambda_{\alpha}^n} a_{n}(\lambda_{\alpha}^n) \frac{d \lambda_{\alpha}^n}{d \eta} \right] \epp
\end{equation}
The quantities $\frac{d \lambda_{\alpha}^n}{d \eta}$ are obtained by deriving the reduced Bethe equations for string centers \cite{1931_Bethe_ZP_71,1971_Takahashi_PTP_46}
\begin{equation}
\theta_{n}(\lambda_{\alpha}^n) - \frac{1}{N} \sum_{m ,\beta} \theta_{nm}(\lambda_{\alpha}^n - \lambda_{\beta}^m ) = \frac{2 \pi}{N} I_{\alpha}^n
\end{equation}
(see \cite{TakahashiBOOK} for the definitions of $\theta_n$ and $\theta_{nm}$) with respect to the interaction parameter $\eta$,
\begin{equation}
N\, \tilde{a}_{n}(\lambda_{\alpha}^n) - \sum_{m ,\beta} \tilde{a}_{nm}(\lambda_{\alpha}^n - \lambda_{\beta}^m ) + \sum_{m ,\beta} G_{(n, \alpha),(m, \beta)} \frac{d \lambda_{\beta}^m}{ d \eta} = 0 \epp
\end{equation}
Here we introduced the reduced Gaudin matrix for string centers \cite{2005_Caux_JSTAT_P09003}
\begin{equation}
G_{(n, \alpha),(m, \beta)} = \delta_{(n, \alpha),(m, \beta)} \Big( N\, a_{n}(\lambda_{\alpha}^n ) - \sum_{k, \gamma} a_{nk}(\lambda_{\alpha}^n - \lambda_{\gamma}^k ) \Big) + a_{nm}(\lambda_{\alpha}^n - \lambda_{\beta}^m )
\end{equation}
and the functions $\tilde{a}_{n} $ and $\tilde{a}_{nm}$,
\begin{equation}
\tilde{a}_{n} (\lambda) = \frac{1}{2 \pi }\partial_\eta \theta_n (\lambda) =- a_n(\lambda) \frac{n \sin 2\lam}{2\sinh n \eta}\epc
\end{equation}
\begin{equation}
\tilde{a}_{nm}(\lambda) = \frac{1}{2 \pi }\partial_\eta \theta_{nm} = (1-\delta_{nm})\tilde{a}_{|n-m|}(\lambda) + 2 \tilde{a}_{|n-m| + 2}(\lambda) + \ldots + 2 \tilde{a}_{n+m-2}(\lambda) + \tilde{a}_{n+m}(\lambda) \epp
\end{equation}
We define a set of the auxiliary functions $\boldsymbol{h}=\{ h_n\}_{n=1}^\infty$ such that
\begin{equation}\label{eq:def_hn}
{h_n(\lambda_\alpha^n)} = \sum_{m,\beta} (G^{-1})_{(n, \alpha),(m, \beta)} \Big( N\, \tilde{a}_{m}(\lambda_{\beta}^m) - \sum_{k ,\gamma} \tilde{a}_{mk}(\lambda_{\beta}^m - \lambda_{\gamma}^k ) \Big) \epp
\end{equation}
In the thermodynamic limit, expression \eqref{eq:corr1} can then be recast as a functional of the distributions $\boldsymbol{\rho}$ and the auxiliary functions $\boldsymbol{h}$,
\begin{equation} \label{eq:rescorr}
\langle \brho | \sigma^z_1 \sigma^z_{2} | \brho \rangle = 1+ \limth \frac{4}{NJ}\frac{d\omega_\blam}{d \Delta} =1 - 4\pi \sum_{n=1}^\infty \int_{-\pi/2}^{\pi/2} d\lambda \, \rho_n(\lambda) \left[ \frac{\cosh \eta}{\sinh \eta} a_{n}(\lambda) + \partial_{\eta} a_{n}(\lambda) - (\partial_{\lambda} a_{n}(\lambda)) {h_n(\lambda)}{ } \right] \epc
\end{equation}
where the auxiliary functions are determined by a set of linear integral equations
\begin{equation}
\rho_{n,t}(\lambda) h_n (\lambda) + \sum_{m=1}^\infty a_{n m} \ast ({\rho_m}{} h_m )\, (\lam)= \tilde{a}_n(\lambda) - \sum_{m=1}^\infty \tilde{a}_{nm}\ast \rho_m \,(\lam) \epp
\end{equation}
As in the case of GTBA Eqs.~\eqref{eq:TBA_XXZ}, they can be reduced to
\begin{equation}
(a_0 + a_2) \ast (\rho_{n,t} h_n) = \Big[ (a_0 + a_2) \ast d_n - a_1 \ast (d_{n-1} + d_{n+1})\Big] + a_1 \ast (h_{n-1} {\rho_{n-1,h}}{} + h_{n+1} {\rho_{n+1,h}}{}) \epc
\end{equation}
with $h_0(\lam) =d_0(\lam) = 0$ and the driving terms given by ($n \geq1$)
\begin{equation}
d_n(\lambda) = \tilde{a}_n(\lambda) - \sum_{m=1}^\infty \tilde{a}_{nm}\ast \rho_m\, (\lam) \epp
\end{equation}
Analogously to the local conserved charges, the correlator \eqref{eq:rescorr} can be expressed solely in terms of $\rho_{1,h}$ and the auxiliary function $h_1$ (see \cite{FollowUp}),
\begin{align}
\langle \brho | \sigma^z_1 \sigma^z_{2} | \brho \rangle = 1+ 4\Big\{ & -\frac{\cosh \eta}{\sinh \eta} \sum_{k \in \mathbb{Z}} \left( \frac{ e^{- |k|\eta } - \hat{\rho}_{1,h}(k) }{2 \cosh k \eta} \right) + \sum_{k\in\mathbb{Z}} |k | \left[\frac{ e^{- |k| \eta} }{2 \cosh k \eta} +\tanh(|k| \eta) \left( \frac{ e^{- |k|\eta } - \hat{\rho}_{1,h}(k) }{2 \cosh k \eta} \right) \right] \nonumber \\
& - \pi \int_{-\pi/2}^{\pi/2} d\lambda \: \rho_{1,h}(\lambda) \, h_1(\lambda) \frac{\partial}{\partial \lambda}
s(\lambda) \Big\} \epc
\end{align}
where the function $s$ is given in the main text and $\hat{\rho}_{1,h}$ is the Fourier transform, defined in \eqref{eq:FourierTransform}, of $\rho_{1,h}$.
\section*{Numerical Linked-Cluster Expansions}
In this section we discuss the numerical linked-cluster expansion (NLCE) results. Such an expansion allows
us to obtain the infinite-time average, or the diagonal ensemble (DE), result for the expectation
value of spin-spin correlations. This approach was introduced in
Ref.~\cite{rigol_14} to study quenches with initial thermal states.
It can be straightforwardly tailored to study quenches with initial ground states
as explained in detail, for the particular quenches considered in this work,
in Ref.~\cite{rigolpre_14}. Here we report the results that are relevant to the
discussion in the main text.
\begin{figure}[!b]
\includegraphics[width=0.63\textwidth]{NLCELargeDbarevsQA}
\vspace{-0.1cm}
\caption{Next-nearest neighbor correlations $\langle\sigma^z_1\sigma^z_3\rangle^\text{DE}_l$
versus $l$ for quenches with $\Delta=1$, 1.1, 1.5, 2, 2.5, and 3. Results are reported
for the last nine orders of the NLCE.}\label{fig:NLCELargeDbarevsQA}
\end{figure}
The NLCE calculations are done in clusters with up to 18 sites, as in
Ref.~\cite{rigol_14}. We denote as $\mathcal{O}^\text{DE}_l$
the result obtained for an observable $\mathcal{O}$ when adding the contribution
of all clusters with up to $l$ sites. We should stress that while the
convergence of the NLCE calculations improves for increasing values of $\Delta$ \cite{rigolpre_14}, for large $\Delta$ they do not
allow us to discriminate between the results of the quench action (QA) and the GGE
calculations, which are very close to each other.
They also do not allow us to discriminate between the QA and GGE results
as $\Delta\rightarrow1$ when the observable considered is
$\sigma^z_1\sigma^z_2$. In that regime, the QA and GGE results
are also too close to each other, becoming identical for $\Delta=1$. As shown
in Fig.~3 of the main text, the largest relative differences between the
QA and GGE predictions are seen for $\langle\sigma^z_1\sigma^z_3\rangle$
when $\Delta\rightarrow1$. This is the observable and regime in which we
focus our NLCE effort.
\begin{figure}[!t]
\includegraphics[width=0.63\textwidth]{NLCEbarevsQA}
\vspace{-0.1cm}
\caption{Next-nearest neighbor correlations $\langle\sigma^z_1\sigma^z_3\rangle^\text{DE}_l$
versus $l$ for quenches with $\Delta=1$, 1.01, 1.1, and 1.2. Results are reported
for the last nine orders of the NLCE. The QA and GGE predictions for $\Delta=1$ are reported
as continuous and dashed horizontal lines,
respectively.}\label{fig:NLCEbarevsQA}
\end{figure}
In Fig.~\ref{fig:NLCELargeDbarevsQA}, we show results for
$\langle\sigma^z_1\sigma^z_3\rangle^\text{DE}_l$ versus $l$ for several values of
$\Delta$ between 1 and 3. A few remarks are in order on those results. They can be
seen to be converging with increasing $l$, the amplitude of the oscillations decreases,
but do not quite converge for the cluster sizes accessible to us. As $\Delta$ decreases
from 3, one can see that the convergence of the NLCE initially worsens, the amplitude of
the oscillations increases for any two contiguous values of $l$, and then improves very
close to $\Delta=1$. Finally, it is
apparent that the results for $\langle\sigma^z_1\sigma^z_3\rangle^\text{DE}$ decrease with
decreasing $\Delta$, as expected from the exact calculations discussed in the main text.
We now turn our attention to the regime in which $\Delta$ is very close to 1. The fact
that the convergence of our NLCE calculations improves in that regime is better seen in
Fig.~\ref{fig:NLCEbarevsQA}, where we report results for $\Delta=1$, 1.01, 1.1, and 1.2.
However, the results of our bare NLCE sums still do not allow us to discriminate between
the QA and GGE results for $\Delta=1$, which are depicted as continuous and dashed
horizontal lines, respectively.
As discussed in NLCE studies of systems in thermal equilibrium
\cite{rigol_bryant_06,rigol_bryant_07}, whenever NLCE series do not converge to a desired
accuracy, one can use resummation techniques to accelerate the convergence of the
series and obtain more accurate results. There are two resummation techniques that
were explained in detail in Ref.~\cite{rigol_bryant_07}, which we have found to
improve convergence in our problem. Those are Wynn's and Brezinski's algorithms.
They can be applied multiple times (in what we call ``cycles'') to a series
for an observable and are expected to improve convergence with each cycle.
However, the application of too many cycles can also lead to numerical
instabilities \cite{rigol_bryant_07}.
\begin{figure}[!b]
\includegraphics[width=0.54\textwidth]{NLCEresumvsQA}
\vspace{-0.1cm}
\caption{Next-nearest neighbor correlations $\langle\sigma^z_1\sigma^z_3\rangle^\text{DE}$
after each cycle of Wynn's (empty symbols) and Brezinski's (filled symbols) algorithms.
Results are reported for $\Delta=1$, 1.001, 1.005, and 1.01, and are compared to the QA
(continuous line) and GGE (dashes line) results for $\Delta=1$.}
\label{fig:NLCEresumvsQA}
\end{figure}
In a nutshell, every cycle of those algorithms leads to a new series with fewer elements
\cite{rigol_bryant_07}. The last element after each cycle is expected to approach the
$l\rightarrow\infty$ result. The reduction in the number of elements after each cycle
limits the number of times each algorithm can be applied to a series. In Wynn's algorithm, each
cycle reduces the number of elements by two, and in Brezinski's algorithm each cycle reduces the
number of elements by three. Since our bare series for $\langle\sigma^z_1\sigma^z_3\rangle^\text{DE}$
has 18 elements (we consider clusters with up to 18 sites), the maximal number of cycles we can
apply for Wynn's algorithm is 8 and for Brezinski's algorithm is 5.
In Fig.~\ref{fig:NLCEresumvsQA} we report the results (the last element) after each cycle of
Wynn's and Brezinski's algorithms. As $\Delta$ approaches 1, we find that the resummations are
stable and lead to very close results. This gives us confidence in the robustness of the
resummations around the Heisenberg point. Figure~\ref{fig:NLCEresumvsQA} shows that,
as $\Delta\rightarrow1$ and as the number of cycles increases, Brezinski's algorithm
appears to converge to a slightly larger
value of $\langle\sigma^z_1\sigma^z_3\rangle^\text{DE}$ than the QA prediction, while Wynn's
resummations seem to converge to a result slightly below the QA prediction. Both are far
from the GGE prediction. Hence, our NLCE results support the correctness of the QA predictions
for the outcome of the relaxation dynamics in these systems.
In the main text, we report the results of Brezinski's resummations after one cycle as representative
of the outcome of the resummation techniques used. This because those results are almost identical to the ones
obtained using Wynn's algorithm after one cycle, and are in between the results obtained after the
maximal number of cycles that could be applied in each algorithm. In the main text, we also report an
interval of confidence that contains all results obtained within Brezinski's and Wynn's resummations,
except for $\Delta=0.015$ for which the last cycle of Brezinski's algorithm
resulted in $\langle\sigma^z_1\sigma^z_3\rangle^\text{DE}=0.1346$, which is out of the interval
of confidence reported. We should add that the fluctuations
in the values of $\Delta$ after resummations increase as $\Delta$ increases for $\Delta<2$. Because
of this, and the fact that the QA and GGE results approach each other with increasing $\Delta$,
we cannot use our NLCE results to discriminate between the QA and GGE predictions as one moves
further away from the Heisenberg point.
\bibliography{XXZ_TBA_supp_mat_biblio}
\end{document}
%% ****** End of file apstemplate.tex ****** %