Research update — June 2026
Since posting this project, I have continued the literature review and theoretical formulation. The project has narrowed into a more concrete technical question:
Can chain-of-thought reasoning be modelled as a learned local transition over hidden representation states, repeatedly applied across decoding time, and validated through token-level long-chain representation trajectories?
This is a sharpening of the original proposal rather than a pivot. The original project focused on token-level interpretability of long-chain reasoning in transformer models. The current formulation makes the object of study more precise: the aim is to understand the local transition by which a model moves from one representation state to the next as it generates each token in a chain of thought.
The relevant literature already covers many adjacent ingredients: CoT as intermediate computation, CoT as extra serial depth, CoT expressivity, sample-efficiency gains from CoT, test-time compute, latent/non-natural-language reasoning substrates, mechanistic state tracking, and representation trajectories during reasoning. The gap I am now targeting is the synthesis: a predictive or mechanistic account of CoT as a weight-tied, prefix-recursive local transition system over representation states.
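To make the idea of a weight-tied local transition concrete, here is a minimal sketch of one way such a hypothesis could be tested on representation trajectories. It is illustrative only and rests on assumptions that are not part of the project: the transition is taken to be linear and first-order (the next state depends only on the current state, not on the full prefix), the trajectories are synthetic rather than extracted from a model, and all function names are hypothetical.

```python
import numpy as np

def stack_pairs(trajectories):
    """Collect (current state, next state) pairs from every trajectory."""
    X = np.concatenate([traj[:-1] for traj in trajectories], axis=0)
    Y = np.concatenate([traj[1:] for traj in trajectories], axis=0)
    return X, Y

def fit_shared_transition(trajectories):
    """Least-squares fit of a single matrix A with h[t+1] approx h[t] @ A.

    The same A is used at every step, which is the 'weight-tied' assumption.
    """
    X, Y = stack_pairs(trajectories)
    A, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return A

def one_step_r2(A, trajectories):
    """Fraction of next-state variance explained by the fitted transition."""
    X, Y = stack_pairs(trajectories)
    residual = Y - X @ A
    centered = Y - Y.mean(axis=0)
    return 1.0 - (residual ** 2).sum() / (centered ** 2).sum()

def make_synthetic_trajectories(rng, true_A, n_traj, length, noise):
    """Synthetic stand-in for hidden-state trajectories (not real model data)."""
    d = true_A.shape[0]
    trajectories = []
    for _ in range(n_traj):
        h = rng.normal(size=d)
        states = [h]
        for _ in range(length - 1):
            h = h @ true_A + noise * rng.normal(size=d)
            states.append(h)
        trajectories.append(np.stack(states))
    return trajectories

rng = np.random.default_rng(0)
d = 8
# Assumed ground-truth transition: a scaled orthogonal matrix, so states stay bounded.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
true_A = 0.95 * Q

train = make_synthetic_trajectories(rng, true_A, n_traj=20, length=50, noise=0.01)
held_out = make_synthetic_trajectories(rng, true_A, n_traj=5, length=50, noise=0.01)

A_hat = fit_shared_transition(train)
r2 = one_step_r2(A_hat, held_out)
print(f"held-out one-step R^2: {r2:.4f}")
```

On real data, the trajectories would be per-token hidden states taken from a fixed layer of a model during chain-of-thought decoding, and a low held-out score for this simple map would itself be informative: it would suggest that the transition is nonlinear or depends on more of the prefix than the current state.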
This gives the project a clearer near-term timeline. My current working plan is:
1. finish the literature-grounded reformulation by the end of June;
2. spend July deriving and stress-testing the central technical insight;
3. spend August setting up and running experiments, drafting the manuscript, and preparing a publishable result if the direction continues to hold.
The target remains an ICLR-oriented technical result if the work produces sufficiently strong evidence. If the strongest version of the hypothesis fails, the fallback output would be either a narrower methodological paper or a useful negative result that clarifies the limits of token-level representation-trajectory modelling.