Interplay between electronic correlation and metal-ligand delocalization in the spectroscopy of transition metal compounds: case study on a series of planar Cu$^{2+}$ complexes

We present a comprehensive theoretical study of the physical phenomena that determine the relative energies of the three of the lowest electronic states of each of the square-planar copper complexes $\cucl$, $\cunh$ and $\cuwater$, and present a detailed analysis of the extent to which truncated configuration interaction (CI) and coupled cluster (CC) theories succeed in predicing the excitation energies. We find that ligand-metal charge transfer (CT) single excitations play a crucial role in the correct determination of the properties of these systems, even though the CT processes first occur at fourth order in perturbation theory, and propose a suitable choice of minimal active space for describing these systems with multi-reference theories. CCSD energy differences agree very well with near full CI values even though the $T_1$ diagnostics are large, which casts doubt on the usefulness of singles-amplitude based multi-reference diagnostics. CISD severely underestimates the excitation energies and the failure is a direct consequence of the size-inconsisency errors in CISD. Finally, we present reference values for the energy differences computed using explicitly correlated CCSD(T) and BCCD(T) theory.


I. INTRODUCTION
Open-shell transition metal complexes, which are ubiquitous in biological and industrial chemistry, represent one of the main challenges for present-day quantum chemistry, where theory seeks to provide prediction and interpretation of key properties such as electronic transition energies, spin-density maps and magnetic anisotropy. Complexes containing Cu 2+ have been studied extensively, both using Density Functional and wavefunction theories, [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18] and have been found to pose a tough test for electronic structure methods. Popular functionals such as B3LYP and BP86 systematically underestimate the spin-density at the Cu atom, provide poor d−d and ligand-to-metal exctiation energies 8,10,11 and misleading predictions of magnetic anisotropy tensors. 8,9,13,19 Although it is possible to design tailored functionals for these systems, with higher percentages of Hartree-Fock exchange, this pragmatic approach has limited transferability and limited predictive power.
On the other hand, studies using wavefunction methods have also only been partially successful. Transtition metal complexes are considered to be strongly correlated systems and Complete Active Space Self Consistent Field (CASSCF) theory is usually applied, with multirefernce perturbation or truncated CI corrections for dynamic correlation. The computed energies are found to be highly sensitive to the choice of active space and the level of coupling between the treatment of static and dynamic correlation, but the number of orbitals involved in the coordination at the transition metal centre prohibits brute force convergence with respect to the size of the active space. Although the relatively high density of low lying electronic states and the large values of T 1 diagnostics observed for transition metal complexes discourages the use of single reference methods, the accuracy of single reference have been studied extensively and the EPR spectra, spin density, g-tensor and electronic excitation energies are well characterised experimentally [20][21][22][23] . The schematic ligand-field diagram ? is displayed in Fig. 1. The low-lying excited states all correspond to doublets where one of the more low lying d orbitals becomes the SOMO. In multi-reference computational studies of two of these systems 8,9,11,14,19 Neese et al.
and Pierloot et al. observed that in order to correctly describe the electronic spectrum and magnetic properties it is necessary to include the ligand donor orbital in the active space, even though this orbital is doubly occupied and has a relatively low orbital eigenvalue. They also found that CASPT2 performs poorly and sophisticated methods such as SORCI 9 or MS-CASPT2 14 are required, which couple the dynamic correlation into the multi-reference treatment. The general importance of ligand donor orbitals was highlighted by Nieuwpoort, Broer and coworkers in their pioneering work on cluster models of transition metal oxides, 1,2,7,24 where they showed that ligand-to-metal charge transfer (LMCT), and associated orbital relaxation, forms a significant component of the wavefunction. Many subsequent studies have confirmed the importance of LMCT in a range of transition metal systems, 3,6,12,25,26 and recent work by one of us 17 and an analysis of the wavefunctions and LMCT in section III. In section IV we discuss the performance of multi-reference methods for these systems, addressing the appropriate minimal active space required to capture the dominant physical processes at play. The performance of single reference methods is discussed in section V, where we demonstrate that non-size-extensivity errors severvely degrade theoretical predictions, even for these small molecules. In section VI, near basis set limit reference transition energies are reported. Our conclusions are summarised in section VII.

II. BENCHMARK NEAR FCI ENERGIES AND WAVE FUNCTIONS
Near FCI wavefunctions for the ground, first and third electronic states of each of the three complexes were computed using the CIPSI method in a 6-31G basis set. level of theory using a 6-31G* basis set.
The CIPSI approach approximates the FCI energy through an adaptively refined selected CI procedure, corrected for discarded determinants through second-order multireference perturbation theory. The CIPSI class of methods build upon selected CI ideas [29][30][31][32][33][34][35] and have been successfully used to converge to FCI correlation energies, one-body properties and nodal surfaces. 33,[36][37][38][39][40][41][42][43] The CIPSI algorithm used in this work uses iteratively enlarged selected CI and Epstein-Nesbet 44,45 multi-reference perturbation theory. The CIPSI energy is where I denotes determinants within the CI reference space R and µ a determinant outside it. To reduce the cost of evaluating the second-order energy correction, the semi-stochastic multi-reference approach of Garniron et al 46 where X denotes the ligand. In the ROHF wavefunction, the spin density is concentrated at the copper atom, whereas in the FCI wavefunction, the LMCT excitations delocalise the spin-density onto the ligands. The spin-densities are listed in Table III.

C. Composition of the excited state wave functions
The composition of the CIPSI wave function for the various excited states presents strong similarities with the ground state wave functions. In all cases, LMCT single excitations

III. PERTURBATION THEORY ANALYSIS OF LMCT
Our benchmark near FCI wavefunctions and energies in the 6-31G basis set reveal that the ground state is more correlated than the excited states in all three complexes, and that the LMCT processes are stronger in the ground state than in the excited states. In this section we analyse in greater depth the role of electron correlation and LMCT in the transition energies from the perspective of single reference perturbation theory. Here and throughout, all the orbitals doubly occupied in the ROHF Slater determinant are referred as i, the ligand donor orbital for each state is called L, the SOMO as S and the virtual orbitals as a. Also, the S z component is assumed to be 1 2 so the unpaired electron has α spin. The LMCT determinant is The prevalence of the LMCT determinants in all ground and some excited states wavefunctions can be considered as quite unusual at least for two reasons. First, all LMCT determinants are more than 12 eV higher in energy than the ROHF determinant, which is clearly not a near degeneracy situation. Second, coefficient of the LMCT determinant at first-order in Møller-Plesset perturbation (MP) theory 47 is which vanishes because of the Brillouin Theorem. The large coefficients of the LMCT determinant in the near FCI wave function come necessarily from their interactions with determinants of higher excitation rank. The first non-vanishing contribution to the LMCT coefficient appears at second order in the MP expansion: The contribution δc D of each double excitation |D to the second-order coefficient can be Among these, the largest |δc D | occur when i is a 3d and a a 4d orbital, where the the interaction elements LMCT|H| S L a i are found to be around 7 eV, which is very large for off-diagonal Hamiltonian matrix elements. The large magnitude can be easily understood: applying the Brillouin-Theorem and neglecting minor exchange contributions, the pertinent matrix elements are where the standard chemical notation is used for the two-electron repulstion integrals. The The fact that the single excitations from |LMCT are important can be interpreted physically as the need to relax the orbitals of the |LMCT determinant. The ROHF orbitals are not optimal for the |LMCT determinant since the former represents the copper atom in its Cu 2+ , wheras the latter represents the copper atom in its Cu + state where the orbitals are more diffuse. This is nothing other than the breathing-orbital effect, well known in the VB framework. 48 These considerations all point to a subtle interplay between electronic correlation and metal-ligand delocalization in the spectroscopy of these transition metal complexes.

IV. MULTI-REFERENCE METHODS
When wavefunction approaches are applied to transition metal systems, multi-reference (MR) methods are usually selected. The results obtained often depend critically on the choice of active space and in this section we examine the influence of the active space on the transition energies and spin densities.

A. CASSCF
A common choice of active space in transition metals is the so-called 'double d-shell', which involves all valence 3d electrons in the 3d and 4d orbital sets. For 3d 9 copper complexes, this results in a CAS(9-10), nine electrons in ten orbitals. The CASSCF transition energies and spin densities are reported in Tables I and III The analysis in the previous section highlights the importance of LMCT single excitations 3d → L, which are missing from the CAS(9-10) active space that contains only 3d → 4d excitations. Adding the ligand donor orbital L to the active space results in a CAS  and the CASSCF transition energies and spin densities are also reported in Tables I and III. The results are substantially improved due to the presence of LMCT single excitations in the active space, but significant deviations from the reference CIPSI values remain.

B. A minimal CI space: FOBOCI
The perturbation analysis of Sec. III suggests that it is possible to define a minimal selected CI, refered to as FOBOCI 18 It is also interesting to note that CAS  performs systematically worse than FOBOCI, even though the CAS(11-11) total energies are ∼0.1 Hartree below the FOBOCI values. The main difference between the FOBOCI and the CAS  wavefunctions is that the former contains | Sa Li determinants with i and a located on the ligands, which are missing from the CAS(11-11) active space. The | Sa Li determinants with i and a are 3d and 4d orbitals allow for the dilatation of the copper orbitals due to the transfer of charge from the ligand to the copper, and the | Sa Li determinant with i and a located on the ligands allows for the corresponding relaxation of the ligand orbitals. Both are required for quantitative agreement with the near FCI results.

C. Perturbation analysis of FOBOCI
Having established the accuracy and reliability of FOBOCI, this greatly simplified wavefunction can be analysed in detail to gain further insight into the relative levels of correlationinduced spin-delocalisation among the low-lying states. We use the Møller-Plesset perturbation series for this purpose, which corresponds to a Taylor expansion of the CI equations in this subspace. At second order, neglecting exchange integrals, and the singles contribution, the FOBOCI energy is The diagramatic representation is displayed in Figure 6. As previously noted, the secondorder energy already shows a differential role between the ground state and the excited states.
The electrostatic interaction between the SOMO S and donor ligand L orbitals dictates the crystal field splitting and is therefore larger in the ground state than in the excited states.
The integrals (SL|ia) are correpondingly larger in the ground state, resulting in a larger correlation energy, which raises the electronic transition energy. Figures 3, 4, 5, 9, 10, 11, 12, 13 and 14 depict the SOMO and ligand donor orbitals.
The first contribution to the energy from the LMCT determinant is obtained at fourth order: the second order LMCT coefficient modifies the coefficients of the double excitations at third order. Neglecting minor exchange contributions, the second order coefficient c The diagramatic representation is displayed in Figure 7. Since the integrals (SL|ia) are larger in the ground state than the excited state, c LMCT is also larger for the ground state. The full fourth-order energy expression is involved even for the FOBOCI space. The part that can be directly compared to the second-order energy is dominant and is given by The diagramatic representation of the approximation to e (4) is displayed in Figure 8. As the energy denominators are always negative, and the numerators always positive, the higherorder effects enhance the second order energy correction through an interaction between LMCT and the double excitations | S L a i . The differential effects at second order are magnified by the LMCT, which explains the importance of both the LMCT and the double excitations | S L a i in the correct prediction of the electronic transition energies. This perturbation perspective can be connected to the VB picture through a decomposition in terms of strongly localised orbitals. This analysis is somewhat involved, but the conclusions are that two physical effects at work: a small dispersive interaction between the ligand lone pairs and the electron in the SOMO; and a comparatively large breathing orbital relaxation induced by the LMCT VB component.

V. SINGLE-REFERENCE METHODS
The success of the FOBOCI shows that very reasonable descriptions of the wavefunctions, electronic transitions and spin densities can be obtained at reduced computational cost through a careful selection of the CI space. The excitation manifold in FOBOCI is a subset of CISD, since it contains all single excitations and a specific class of double excitations, and CISD may be anticipated to be even more accurate. This section is dedicated to the investigation of the performance of single-reference wave function based methods.

A. CISD and CCSD: the size extensivity error
The results of the CISD calculations are reported in Tables I, II and III for the electronic transitions, the amplitudes of the LMCT and the spin density, respectively. Contrary to expectation, CISD performs systematically worse than FOBOCI, underestimating the electronic transitions by at least 5 mH, with a corresponding underestimation of the amplitudes of the LMCT by at least a factor of two. The results of CCSD calculations are also reported.
CCSD is in much better agreement with the near FCI results, with a maximum error of only 1.6 mH for the transition energies. In the discussion below we demonstrate that the failure of CISD is a direct consequence of the lack of size extensivity of the CISD wavefunction and energy.
The CISD and CCSD equations can be directly compared when using the unlinked CCSD formalism. In both CISD and CCSD, discarding the spin polarisation energy from the Brillouin terms, the correlation energy is In the second line we have introduced a decomposition into linked and unlinked contributions with respect to to a particular double excitation | Sa Li . The unlinked correlation energy whereas the corresponding CCSD equation is where T, Q are triple and quadruple excitations and the c Sa Li E U L corr (LiSa) term from the denominator exactly cancels the unlinked parts of I∈T,Q Sa Li |H|I I|CCSD . In the CCSD equations, only the linked correlation energy survives in the demoninator, whereas the total correlation energy remains in the denominator of the CISD equations. For the copper complexes, the total correlation energy is in the order of 10 eV, and the unlinked component accounts for more than 95%. The presence of E U L corr (LiSa) in the CISD equations therefore introduces a spurious 10 eV shift in all energy denominators, which dramatically reduces the coefficients of the double excitations c Sa Li . Consequently, the correlation energy in general, and the differential correlation effects arising from c Sa Li in particular, are systematically underestimated at the CISD level, resulting in poor transition energies. Tables I, II and III also report the results from the CISD(SC) 2 method, [50][51][52] where the CISD equations are modified by removing E U L corr (LiSa) from the denominator. CISD(SC) 2 repairs the errors of CISD and indeed performs comparably to CCSD, indicating that the higher-order linked terms that are missing from CISD(SC) 2 do not play a large role in the energy differences between the excited states. To conclude this analysis, we note that FOBOCI does not contain unlinked terms with respect to | Sa Li . The FOBOCI correlation energy can be expressed as and is therefore always linked with respect to any double excitation | Sa Li present in the FOBOCI. The FOBOCI also therefore does not suffer from size inconsistency errors for the terms that dominante the differential correlation effects among the low-lying electronic states.   Table IV. The necessity for three-body correlation in approaching near FCI quality transition energies is clearly apparant. Unfortunately, it is not possible to comment on the relative merits of BCCD(T) over CCSD(T) based on these numerical tests.

ENERGIES
The success of CCSD(T) and BCCD(T) in reproducing near FCI transition energies in small basis sets, encourages us to use these methods to obtain high-quality reference values near the basis set limit. Table V reports the results of ROHF-UCCSD(T) (F12*) and UBCCD(T) (F12*) calculations 53-55 using the aug-cc-pwCVTZ-DK basis sets. The X2C method 56 was used to account for scalar relativistic effects and an exponent of 1.2 a −1 0 was used in the F12 correlation factor. 57 Table V lists the additive contributions from the CCSD, (T), the CABS singles correction 55? and the F12 correction for frozen core calculations, where an argon core was used for copper, and a helium core for oxygen and nitrogen. The core-valence correlation correction is the difference between the full valence only ROHF-UCCSD(T) (F12*) or UBCCD(T) (F12*) calculation and calculations correlating all electrons except the neon core at the copper.
Concerning the dependence of the excitation energies on basis set, we find that while the values differ substantially from those computed with a 6-31G basis, the CABS singles correction is negligable and the ROHF energies are converged to within 0.2 mH at the aug-cc-pwCVTZ-DK level. The F12 contribution is also less than a mH, suggesting that short-range dynamical correlation is not decisive in the ordering of the excited states and that the CCSD(T) and BCCD(T) values are well converged with respect to one-particle basis set size at the aug-cc-pwCVTZ-DK level. We note that since the F12 correction is based on the cusp condition for the first-order amplitudes, 54,58 it contains contributions of the type f 12 |Li , but misses F12 contributions of the type f 12 |Sa , and can therefore be expected to give slightly too low excitation energies with small basis sets. The magnitude of this effect, however, would appear to be very small, particularly in the Brueckner calculations where the orbital optimisation reduces this bias considerably. We ascribe a basis set incompleteness error bar of 0.5 mH for CCSD(T) (F12*) excitation energies, and 0.2 mH for BCCD(T) (F12*) excitation energies.
Concerning the dependence of the excitation energies on the level of correlation treatment, we find that BCCD systematically predicts lower transition energies than CCSD. This is pattern is reversed when comparing CCSD(T) and BCCD(T). Although the (T) energy is smaller for BCCD than for CCSD, the differential effect of the (T) triples correction on the excitation energies is larger for BCCD than for CCSD. While the inclusion of highorder orbital relaxation effects in BCCD would favour this method, the (T) correction is anticipated to be biased to the ground state in both cases. Without benchmark calculations in a larger basis set, it is difficult to be sure which method is superior. We therefore quote the UBCCD(T) (F12*) as reference values for the transitions and assume that the difference between the ROHF-UCCSD(T) (F12*) and UBCCD(T) (F12*) values is a minimum estimate for the error bar.
The ab initio reference values are not expected to agree perfectly with experimental values, since the former are gas phase data and the latter are obtained from electronic absorption spectroscopy of single crystals containing the gas phase chromophore, which are subject to crystal field effects and geometric relaxation.   and Cu + oxidation states.
A perturbation analysis of the coefficient of these Slater determinants in the ground and excited state wave functions revealed that these determinants arise predominantly due to the so called breathing orbital effect, an orbital relaxation induced by the change in oxidation state at the Cu as a consequence of correlating the electrons. This effect plays a key role in the differential energies between the spin states and must be properly represented in the wavefunction for a correct qualitative description of this class of systems. Using these insights, we propose of a minimal CI space, the FOBOCI, which captures the key physical effects, and we demonstrate numerically that it is able to reproduce quantitatively the energy differences and spin density of these three Cu 2+ complexes, even though it recovers less than 3% of the total correlation energy. The numerical evidence is further supported by a perturbational analysis, up to fourth-order in the energy, which, together with some simple physico-chemical considerations, explains the success the FOBOCI in accurately describing the energy differences.
Having obtained detailed physical and mathematical insight into the theoretical description of these systems, we proceeded to investigate the performance of the commonly applied wave function based methods, both single-and multi-reference. Regarding multi-reference methods, the performance depends strongly on the choice of active space. The minimal active space required for a qualitatively correct description is one containing the ligand orbital involved in the SOMO LMCT together with the double-d shell, for the orbital relaxation.
Regarding single reference methods, we find that the correct description is obtained provided that • The wavefunction contains both the ROHF and SOMO LMCT configurations and all single excitations from each • The wavefunction coefficients are obtained to at least 2nd order in perturbation theory (fourth order in the energy) • The wavefunction coefficients are obtained in a size extensive manner In this respect, our study reveals that CC-based methods are perfectly suited for the study of these Cu 2+ complexes, since the excitation manifold of singles and doubles contains all important configurations, the iteratively optimised amplitudes correspond to high order in perturabtion theory, and the method is size extensive. We find that CCSD(T) performs well despite exhibiting large T 1 and D 1 diagnostics for all wavefunctions. Indeed, BCCD(T) and are large when there are strong orbital relaxation effects and are an indirect indication of multi-reference character at best. In this case the assumption that large T 1 and D 1 values predict the failure of CCSD(T) is incorrect.
Our study also reveals that CISD performs poorly. Our analysis proves that the non-size extensive nature of the CISD equations leads to erroneous supression of correlating excitations, biasing spin states with smaller correlation energies. We expect that our observation that size extensivity errors plague calculations of vertical spectrum of molecular complexes at equilibrium geometry as well as dissociation energies will be generally applicable to all systems, since the errors simply grow with the magnitude of the correlation energy.
Finally, having established the reliability of the CC-based methods for the determination of the energies of the spin states, we performed CCSD(T) and BCCD(T) calculations in a large basis set using explicitly correlated corrections in order to establish reference values for the energy differences of these three Cu 2+ complexes (see Table V). Our near basis set limit core-valence correlated energies with scalar relativistic effects included agree with observed energy differences from single crystal electronic absorption spectroscopy to within 5 mH, which is the same magnitude as the change expected due to placing the gas phase ion in the solid state crystal environment.
This study provides futher confirmation of the importance of LMCT in the determination of the properties of many 3d transition metal containing molecular complexes and highlights once more that metal-ligand delocalisation is very sensitive to the level to which electronic correlation is treated.