Roadmap
The project will be developed in phases so that the technology matures gradually and the ecosystem remains stable. The key milestones are:
Phase 1: Research & Prototype Development (Q1 2025)
Develop the Browser Agent prototype, building the Environment Recognition Module to support web DOM parsing and basic UI interaction.
Open-source the LLM-based Control Agent Framework, enabling community participation and making it the first open-source framework for browser control.
Collect and curate the initial dataset, combining open-source data with limited human annotations.
Publish a technical whitepaper and establish the developer community.
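As an illustration of the kind of DOM parsing the Environment Recognition Module is meant to support, the minimal sketch below collects the interactive elements of a page using only Python's standard library. It is not the project's implementation: the names InteractiveElementCollector, collect_interactive_elements, and INTERACTIVE_TAGS, as well as the choice of which tags count as interactive, are assumptions made for this example.

```python
from html.parser import HTMLParser

# Assumption for this sketch: these tags are treated as "interactive".
# The real module may use a different or richer definition.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class InteractiveElementCollector(HTMLParser):
    """Records every interactive tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        if tag in INTERACTIVE_TAGS:
            self.elements.append({"tag": tag, "attrs": dict(attrs)})


def collect_interactive_elements(html):
    """Return the interactive elements of an HTML string in document order."""
    parser = InteractiveElementCollector()
    parser.feed(html)
    parser.close()
    return parser.elements


page = (
    '<form><input name="q" type="text">'
    '<button id="go">Search</button></form>'
    '<a href="/help">Help</a>'
)
elements = collect_interactive_elements(page)
for element in elements:
    print(element["tag"], element["attrs"])
```

A list like this could serve as a compact page description for an LLM-based controller, which would then choose one element to act on. A real browser agent would more likely read the live DOM through a browser automation layer than parse static HTML.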
Phase 2: Decentralized AI & Web3 Integration (Q2 2025)
Introduce decentralized data training and validation, leveraging blockchain and Web3 technologies to enhance incentives for data contributors.
Utilize the Codatta ecosystem to promote data assetization, ensuring transparent access and fair distribution of AI training data.
Phase 3: System Expansion & Performance Optimization (Q3 2025)
Implement multimodal environment perception, integrating text, screenshots, OCR, and other data sources to enhance task comprehension.
Strengthen the Actions Planning Model, optimizing execution pathways to improve task automation success rates and efficiency.
Expand support for multiple browsers and operating systems, enhancing compatibility.
Collaborate with the Codatta platform to establish a high-quality data sourcing and labeling pipeline, refining AI task planning capabilities.
Release the developer SDK and open APIs for third-party integrations.
Phase 4: Computer Agent Development (Q4 2025)
Expand from Browser Agent to Computer Agent, enabling support for desktop and mobile applications, including file management, software operation, and system task automation.
Develop security policies and permission management systems to ensure safe and controlled AI task execution.
Upgrade the LLM-based Control Agent Framework to facilitate broader control over desktop applications beyond the browser environment.
Launch an experimental commercial application, inviting enterprises and developers to test AI-powered task automation.
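One possible shape for the planned permission management is a default-deny policy that the Agent consults before executing each action. The sketch below is hypothetical: the roadmap does not specify a design, and the names PermissionPolicy, Decision, and the action strings are invented for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ASK_USER = "ask_user"
    DENY = "deny"


@dataclass
class PermissionPolicy:
    """Hypothetical policy: explicit allow list, confirmation list, deny by default."""

    allowed: set = field(default_factory=set)
    needs_confirmation: set = field(default_factory=set)

    def check(self, action):
        if action in self.allowed:
            return Decision.ALLOW
        if action in self.needs_confirmation:
            return Decision.ASK_USER
        # Anything not explicitly listed is refused.
        return Decision.DENY


# Example action names are assumptions, not part of any published API.
policy = PermissionPolicy(
    allowed={"read_file", "open_app"},
    needs_confirmation={"delete_file"},
)

for action in ("read_file", "delete_file", "format_disk"):
    print(action, "->", policy.check(action).value)
```

Denying by default means that an action the policy author never anticipated is blocked rather than silently executed, which suits an agent that operates on a user's files and applications.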
Phase 5: Fully Autonomous AI Agent (2026 and Beyond)
Integrate a high-autonomy AI task execution model, enabling complex cross-software and cross-device operations.
Develop self-optimization and learning mechanisms, allowing the Agent to refine its task execution strategies based on user scenarios.
Expand the AI application ecosystem, supporting industries such as finance, healthcare, and customer support.
Introduce a decentralized Agent network, enabling global users to contribute and share AI task execution capabilities, achieving true AI-powered computing.
This roadmap is designed for steady technological evolution while keeping a focus on real-world applications and commercialization, with the ultimate goal of a highly efficient, secure, and scalable Computer Control Agent ecosystem.