Dither the project became Dither AI when it became obvious that the consensus view was incorrect. At the time of our first successful model, the consensus in AI was that “the transformer architecture could not perform time-series operations.” Since our first prototype models, we have seen the proliferation of omni models and AI agents in other fields of AI. We have also seen many quirks of large time-series models firsthand, which led us to a final conclusion:
“Time-Series by Dither AI will begin as a tool to be used by humans and agents but will soon become a modality atop large omni agents.”
Omni models are so named because they can take a variety of modalities (audio, image, video, text, etc.) as input and output a variety of modalities. This is OpenAI’s GPT-4o. This happens natively in the model through a series of tokenization/embedding layers, a central processing structure, and a series of output layers. These architectures imply that all modalities can be compressed into the same embedding space.
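As a rough illustration of that shape, here is a minimal PyTorch sketch of an omni-style architecture. The layer names, sizes, and modality set are placeholders of ours, not any production model:

```python
import torch
import torch.nn as nn

class TinyOmni(nn.Module):
    """Hypothetical omni sketch: one embedding layer per input modality,
    a shared central trunk, and one output head per output modality."""

    def __init__(self, d_model=256, text_vocab=32000,
                 image_patch_dim=768, audio_frame_dim=128):
        super().__init__()
        # One embedding/projection layer per input modality.
        self.text_embed = nn.Embedding(text_vocab, d_model)
        self.image_embed = nn.Linear(image_patch_dim, d_model)
        self.audio_embed = nn.Linear(audio_frame_dim, d_model)
        # The shared "central processing structure".
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.trunk = nn.TransformerEncoder(layer, num_layers=4)
        # One output head per output modality.
        self.text_head = nn.Linear(d_model, text_vocab)
        self.image_head = nn.Linear(d_model, image_patch_dim)

    def forward(self, text_ids, image_patches, audio_frames):
        # Every modality lands in the same d_model-dimensional space,
        # so the trunk attends across all of them at once.
        tokens = torch.cat([
            self.text_embed(text_ids),
            self.image_embed(image_patches),
            self.audio_embed(audio_frames),
        ], dim=1)
        hidden = self.trunk(tokens)
        return self.text_head(hidden), self.image_head(hidden)

model = TinyOmni()
out_text, out_image = model(
    torch.randint(0, 32000, (1, 16)),  # token ids
    torch.randn(1, 64, 768),           # image patches
    torch.randn(1, 100, 128),          # audio frames
)
```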
The first versions of these models were the large vision models (LVMs), which could take both text and images as input. It was obvious then that you could embed both the image and text domains into the same embedding space. To the model, an image of a dog was represented similarly to the word ‘dog’, and it could operate on either to produce an output. At that point, it was clear that we would eventually get large time-series language models (LTLMs).
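For a concrete taste of that shared space, here is a minimal sketch using OpenAI’s open-source CLIP, a two-tower text-image precursor to LVMs (‘dog.jpg’ is a placeholder path):

```python
import clip  # pip install git+https://github.com/openai/CLIP.git
import torch
from PIL import Image

model, preprocess = clip.load("ViT-B/32")
image = preprocess(Image.open("dog.jpg")).unsqueeze(0)  # placeholder image
text = clip.tokenize(["a photo of a dog", "a photo of a car"])

with torch.no_grad():
    img_emb = model.encode_image(image)
    txt_emb = model.encode_text(text)

# Cosine similarity in the shared embedding space: the dog photo should
# land much closer to "a photo of a dog" than to "a photo of a car".
img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
print(img_emb @ txt_emb.T)
```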
If you wanted to jerry-rig a large vision model, you would start with an image-to-text model: convert the image to text and append that text to your prompt for an LLM. This takes two models. The next step is to embed the image using the vision model’s embedding layers; with some fine-tuning, you could feed this embedding directly into the language model. This still requires training two models, but the end result is one model with two embedding layers. Let’s call this the 1.5-model stage. The final iteration is training both the text and vision embeddings during the pretraining stage of the LVM. This is when you truly would need only a single model.
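A minimal sketch of that 1.5-model stage, assuming a frozen vision encoder and a frozen LLM joined by a small trainable projection; every name and size here is illustrative:

```python
import torch
import torch.nn as nn

class VisionAdapter(nn.Module):
    """Projects frozen vision-encoder features into the LLM's token
    embedding space, so image 'tokens' can be prepended to a prompt."""

    def __init__(self, vision_dim=1024, llm_dim=4096):
        super().__init__()
        self.proj = nn.Linear(vision_dim, llm_dim)

    def forward(self, vision_features):  # (batch, n_patches, vision_dim)
        return self.proj(vision_features)  # (batch, n_patches, llm_dim)

# During fine-tuning, only the adapter trains; both towers stay frozen.
adapter = VisionAdapter()
vision_features = torch.randn(1, 256, 1024)  # stand-in encoder output
image_tokens = adapter(vision_features)      # pseudo-token embeddings
text_tokens = torch.randn(1, 32, 4096)       # stand-in prompt embeddings
llm_input = torch.cat([image_tokens, text_tokens], dim=1)
```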
We are currently at the “two-model” stage for LTLMs. It is arguably still the early days of the two-model stage because of the intricacies of transformer-based time-series models.
Agents are like bots: both are designed to do a number of tasks. The key differentiation is the determinism of their actions. Bots are deterministic; agents are stochastic. No matter what, you want bots to function in a predictable manner. Agents, however, might decide on a different action based on something as incidental as the temperature of the GPU performing inference. Agents are also much slower than bots.
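A toy contrast under these definitions (the scoring rule and temperature constant below are stand-ins we made up, not anything from a real trading system):

```python
import math
import random

def bot_decide(price: float) -> str:
    # A bot: identical input always yields the identical action.
    return "buy" if price < 100.0 else "hold"

def agent_decide(price: float, temperature: float = 0.8) -> str:
    # An agent: the action is sampled from a distribution, so repeated
    # runs on the same input can land on different choices.
    actions = ["buy", "hold", "sell"]
    scores = [100.0 - price, 0.0, price - 100.0]
    weights = [math.exp(s / (25.0 * temperature)) for s in scores]
    return random.choices(actions, weights=weights)[0]

print(bot_decide(95.0), bot_decide(95.0))      # always identical
print(agent_decide(95.0), agent_decide(95.0))  # may differ between runs
```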
If you are performing high-speed trading, then you are going to use niche DL models or, more likely, classical statistical techniques with bots. You want to minimize stochasticity in your system. However, this makes it extraordinarily difficult to develop a generalized system that interacts with the environment. Agents carry the promise of generalized operations: building them is still difficult, but they have far more promise for operating in generalized systems.
Currently, most ‘agents’ act as bots. Their only agent-esque utility is the ability to operate in human text domains like a native. Otherwise, they are decision-tree-based bots calling specific functions. There are exceptions to this rule, but agents are nascent.
The ideal agent is like a terminal of truth. It plays a turn-based game: new information comes in from sources and is processed, and tools are chosen in response to the environment’s actions by the model itself rather than by a decision tree. The model should be doing the tool calling on its own. Inference in this case is less “create a response” and closer to “I’ve chosen to respond with …”. Once we are at this point, agents will get very exciting.
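A bare-bones sketch of that turn-based loop; choose_action is a hypothetical stand-in for real model inference, and the tools are toys:

```python
import json

# Toy tools; a real agent would wire these to live data and execution.
def fetch_prices(symbol: str) -> str:
    return json.dumps({"symbol": symbol, "price": 101.3})

def place_order(symbol: str, side: str) -> str:
    return f"simulated {side} order for {symbol}"

TOOLS = {"fetch_prices": fetch_prices, "place_order": place_order}

def choose_action(observation: str) -> dict:
    # Stand-in for model inference: the model itself would emit which
    # tool to call and with what arguments ("I've chosen to respond with ...").
    return {"tool": "fetch_prices", "args": {"symbol": "SOL"}}

def run_turn(observation: str) -> str:
    action = choose_action(observation)  # the model picks the tool
    tool = TOOLS[action["tool"]]         # no decision tree in the loop
    return tool(**action["args"])

print(run_turn("new candle closed on SOL"))
```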
One of the more interesting quirks of transformer-based time-series models (TTSMs) is their proficiency at long-term forecasting over short-term forecasting. We believe this is inherent to the transformer architecture and can be overcome. TTSMs can also beat most classical methods (DL and statistical) at long-term forecasting. This longer time horizon fits larger models well because their inference is slower.
TTSMs should not be competing with short-term trading bots. Instead, they should leverage their advantage and be used for mid- to long-term resource allocation. TTSMs show promise in enabling humans to forecast out to the edge of unpredictability. The key is to determine where the predictions break down and to overcome the challenges of training a large TTSM.
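One way to locate that breakdown point is to score forecasts at each horizon step against a persistence baseline. A minimal NumPy sketch with stand-in data, whose error profile is deliberately chosen to mirror the long-horizon quirk above (a real run would substitute actual model forecasts):

```python
import numpy as np

def error_by_horizon(actuals, forecasts):
    """Mean absolute error at each horizon step; inputs are
    (n_windows, horizon) arrays."""
    return np.abs(actuals - forecasts).mean(axis=0)

rng = np.random.default_rng(0)
n_windows, horizon = 200, 48
last_obs = rng.normal(size=(n_windows, 1))  # last value each window observed
actuals = last_obs + rng.normal(size=(n_windows, horizon)).cumsum(axis=1)

# Stand-in forecasts: flat error at every step, as a placeholder for a TTSM.
forecasts = actuals + rng.normal(scale=2.0, size=actuals.shape)
# Persistence baseline: repeat the last observed value across the horizon.
naive = np.broadcast_to(last_obs, actuals.shape)

model_err = error_by_horizon(actuals, forecasts)
naive_err = error_by_horizon(actuals, naive)

# Steps where the model adds value over persistence; with these stand-ins
# the model loses early and wins late, mirroring the quirk above.
print("model beats naive at steps:", np.nonzero(model_err < naive_err)[0])
```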
One of the limits of humans is their emotional responses. One human strength is that they operate at longer time horizons than most bots and high-frequency traders. TTSMs will eliminate that emotional weakness while functioning on the same time horizon as people. This will provide a stability to resource allocation markets that is currently lacking, and it goes beyond financial markets.
We believe:
Short term: Humans and agents will operate tools built with our TTSM forecasting models.
Midterm: True agents will use our forecasting abilities to perform better than humans in generalized game spaces like crypto and finance.
Long term: Omni models will use TTSM embedding layers to operate natively in mixed modalities.
“Time-Series by Dither AI will begin as a tool to be used by humans and agents but will soon become a modality atop large omni agents.”
Main: Return to the Main Page
Free Tier: Explore Dither Telegram and Access Free Demos
Premium Tier: Verify and Access Premium Demos
@BondedPump: Access Pump Fun Bonding Directly on Crypto Twitter
@Dither_Solana: Follow Dither AI on Twitter
Hypothesis: Check Out All Our Current Hypotheses
History: View Our Full History
Demos: Explore our Demos and API