A QFT carries a highly reducible representation of the Poincaré group, in which many different irreps occur. (In a free massless spin 1/2 theory you even have all physical irreps present!) Thus seeing different representations in a construction is no cause for concern.

The defining representation of a QFT specifies the representation of the Heisenberg fields figuring in the action. The S-matrix, in contrast, employs asymptotic fields, one corresponding to each bound state. This is because it relates the asymptotic behavior at $t=-\infty$ to that at $t=+\infty$. The asymptotic limit is based on the interaction picture; see Chapter 3 of Vol. III of Thirring's mathematical physics series for a rigorous asymptotic limit and the definition of asymptotic observables in QM.
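
For single-channel potential scattering in QM, the asymptotic limit can be sketched as follows. The symbols $H_0$, $H=H_0+V$, and $\Omega_\pm$ are notation introduced here for illustration, the limit is meant in the strong sense, and the sign conventions for $\pm$ differ between textbooks:

$$\Omega_\pm=\lim_{t\to\mp\infty} e^{iHt}\,e^{-iH_0t}, \qquad S=\Omega_-^\dagger\,\Omega_+ .$$

The factor $e^{iHt}e^{-iH_0t}$ is the interaction-picture evolution operator from time $t$ to time $0$, which is how the interaction picture enters. If $H$ has bound states, they lie outside the range of $\Omega_\pm$, so they must be accounted for separately in the asymptotic description.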

If the two kinds of fields are essentially the same, the Fock space in which the free Heisenberg fields are defined can be identified with the Fock space of the asymptotic particles. This is the situation where perturbation theory is valid and gives a good S-matrix, and this is the situation discussed in the textbooks (usually without warning the reader that it works only under these circumstances).

In QFT, the discrepancy between the two kinds of fields is precisely the notorious infrared problem. It arises because, if the field content differs, the asymptotic Hilbert space of the S-matrix is structurally very different from the Heisenberg Hilbert space. (Note that all separable Hilbert spaces are isomorphic. Thus there exists some unitary isomorphism between the two, mediated by the so-called Moeller operator. But this operator is no longer perturbatively accessible.) Thus a perturbative match is impossible. The conclusion is that one cannot get a meaningful perturbative S-matrix if the bound state field content is different from the Heisenberg field content of a field theory. It is also well known from QM that the Born series diverges if there is a bound state - it must, because the T-matrix develops a pole. Path integral perturbation theory is just a covariant version of the Born series, and hence has the same problem. See also http://www.physicsoverflow.org/808/different-kinds-of-s-matrices?show=927#a927 for a discussion of how the textbook S-matrix treatment may break down.
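
The statement about the Born series can be made explicit in nonrelativistic scattering. The following is only a sketch, in notation introduced here: $H_0$ is the free Hamiltonian, $V$ the interaction, $G_0(E)=(E-H_0)^{-1}$, and $K(E)=G_0(E)V$. The Lippmann-Schwinger equation and its formal iteration, the Born series, read

$$T(E)=V+V\,G_0(E)\,T(E) \quad\Longrightarrow\quad T(E)=V\,\bigl(1-K(E)\bigr)^{-1}=V\sum_{n=0}^{\infty}K(E)^n .$$

A bound state $\psi_b$ with energy $E_b$ outside the spectrum of $H_0$ satisfies

$$(H_0+V)\,\psi_b=E_b\,\psi_b \quad\Longleftrightarrow\quad K(E_b)\,\psi_b=\psi_b ,$$

so $K(E_b)$ has the eigenvalue $1$. Hence $1-K(E)$ is not invertible at $E=E_b$, which is where the pole of $T(E)$ comes from, and the geometric series in $K(E)$ cannot converge there.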

For example, in QED one has an infrared problem since an asymptotic electron, being charged, is dressed with an intrinsic electromagnetic field (which means that, formally - before the IR limit - it is a superposition of a Heisenberg electron and infinitely many virtual photons and virtual electron-positron pairs), while the Heisenberg electron has charge but no associated electromagnetic field.

In QCD (and in the standard model) one has a much more severe infrared problem since asymptotic bound states (mesons, baryons, glueballs - and in the standard model also nuclei, atoms, and molecules) are white, while the Heisenberg fields (quarks and gluons) have color.

The asymptotic states emerge naturally (as in QM) from the interactions by taking an asymptotic limit. Before the limit is taken, they are (formally) superpositions of arbitrarily many Heisenberg particles, but in the limit, the formal contact with the Heisenberg representation gets lost. (This is why in QM, bound states are never discussed in the Heisenberg picture - it is useful only for finite times!)

In a theory where the Heisenberg fields are bosons, it is possible, because of topological subtleties, that an asymptotic field is a fermion field (there are explicitly solvable rigorous examples in 1+1D). In the classical limit, the latter are seen as solitons. This is inverse to the phenomenon in 1+1D called bosonization. The simplest rigorous example is that of a free fermion theory, where the current field satisfies the commutation rules of a free boson field. Conversely, one can reconstruct (by an explicit but messier construction) inside the algebra of operators on a boson Fock space a (soliton) field with the commutation relations of a fermion field, in such a way that its current field equals the original boson field. The result is that in 2D (i.e., 1+1D) there is no theorem relating spin and statistics. In particular, the bosonic sine-Gordon QFT and the fermionic Thirring model are equivalent QFTs, since one can express the operators of each in terms of the other.
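
For orientation, this equivalence is usually quoted in the following form, due to Coleman. The normalizations and signs below follow one common convention and should be checked against whichever reference is used; renormalization constants and the precise matching of the mass terms are omitted, and $\beta$, $\alpha_0$, $g$, $m$ denote the couplings of the two Lagrangians:

$$\mathcal L_{\rm sG}=\tfrac12\,\partial_\mu\phi\,\partial^\mu\phi+\frac{\alpha_0}{\beta^2}\cos(\beta\phi), \qquad \mathcal L_{\rm Th}=\bar\psi\,(i\gamma^\mu\partial_\mu-m)\,\psi-\tfrac{g}{2}\,(\bar\psi\gamma^\mu\psi)(\bar\psi\gamma_\mu\psi),$$

$$\frac{4\pi}{\beta^2}=1+\frac{g}{\pi}, \qquad \bar\psi\gamma^\mu\psi=-\frac{\beta}{2\pi}\,\epsilon^{\mu\nu}\partial_\nu\phi .$$

Under this dictionary the fermion number $\int dx\,\bar\psi\gamma^0\psi$ becomes, up to a convention-dependent sign, the topological charge $\frac{\beta}{2\pi}\,[\phi(+\infty)-\phi(-\infty)]$, which counts solitons minus antisolitons. The value $\beta^2=4\pi$ corresponds to $g=0$, the free fermion case.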

In 4D, similar things can apparently happen (because solitons actually exist in some cases), but only for interacting theories. The spin-statistics theorem - a theorem about representations of free fields - forbids this in the free case (only).

Soliton states are orthogonal to all boson states (including the vacuum), for essentially the same reason that bound states of different energy are orthogonal. When approximated at finite IR and UV cutoff, solitons are bound states of infinitely many bosons and therefore only approximately orthogonal to boson states. But when the IR cutoff is removed they move into a different superselection sector of the theory. (Due to renormalization effects, the Hilbert space structure changes radically in this limit; in fact, in 4D, it seems that no sensible Hilbert space is left, which is what makes the YM millennium problem so hard.) Note that the superpositions of bosons that make up approximate fermions cease to be superpositions only in the infrared limit, where the compactness of space is removed, the approximation becomes exact, and solitons appear. (Solitons cannot exist in a compact space, as they are defined by different boundary conditions at spatial infinity when $t\to \pm\infty$.)

Therefore, it doesn't make sense to regard the solitons as bound states in the final, covariant theory. Indeed, at the level of the S-matrix, one has democracy of particles - all bound states appear in the same way. (This inspired Heisenberg's idea of using the S-matrix as the basic principle for organizing the structure of the particle zoo, which was ill-understood in his time.) What is elementary and what is composite is not determined by the S-matrix but by the action that generates it. If - as in the 1+1D example mentioned - two different actions produce isomorphic QFTs (in the sense that their Borchers classes of local fields are isomorphic) and hence equivalent S-matrices, the elementary particle content is not unique and depends on the action preferred by the user. This is the same kind of user preference as in a classical theory written in different coordinate systems. The physics is independent of the coordinate system, but the resulting interpretation is not. In the same way, a QFT is in an intrinsic sense action-independent. (There are even 1+1D QFTs satisfying the Wightman axioms for which no associated action is known.)

To see the wave functions of a solitonic bound state one cannot use the Fock space formulation but needs to work in the functional Schroedinger picture. There $\psi$ is a functional of the classical field $\phi$, and a semiclassical interpretation in terms of density and phase is possible, which gives some limited intuitive understanding. See http://www.physicsoverflow.org/22012/functionals-of-quantum-states-in-qft?show=22145#a22145
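
As a purely formal sketch, consider a scalar field with potential $U(\phi)$ in $d$ space dimensions, in units with $\hbar=1$, ignoring the regularization and renormalization needed to make the expressions well defined. The symbols $U$, $\rho$, and $S$ are notation introduced here:

$$i\,\partial_t\Psi[\phi,t]=\int d^dx\,\Bigl[-\tfrac12\,\frac{\delta^2}{\delta\phi(x)^2}+\tfrac12\,\bigl(\nabla\phi(x)\bigr)^2+U\bigl(\phi(x)\bigr)\Bigr]\Psi[\phi,t], \qquad \Psi[\phi,t]=\sqrt{\rho[\phi,t]}\;e^{iS[\phi,t]} .$$

Here $\rho[\phi,t]=|\Psi[\phi,t]|^2$ plays the role of a density over classical field configurations and $S[\phi,t]$ that of a phase. Heuristically, one expects $\rho$ for a soliton state to be concentrated near soliton-like configurations rather than near the constant vacuum configuration.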

I cannot answer your query (3) since all Jackiw does is plausible reasoning, picking the arguments that were found to work in practice. Most of QFT is not yet a theory where logic can be applied everywhere. (Actually, it seems that your argument is wrong since a soliton state $|P\rangle$ is not translation invariant. - Will check this.)