How We Train Our Models and Built Peachi OS

At COREA Starstroupe, a non-profit based in Cebu City, Philippines, we are dedicated to advancing open-source AI innovation. This page outlines the training methodologies for our Auralis and Nexora models and the development process for Peachi OS, as detailed in our research paper archive. Our work emphasizes lightweight, efficient, and interpretable AI solutions for resource-constrained environments and conversational intelligence.

Training Auralis: Lightweight and Interpretable NLP

Auralis is a compact natural language processing (NLP) model designed for instruction-following and real-time comprehension in low-resource settings. Our training approach, as documented in our papers, focuses on efficiency and interpretability.

Training Process

Our training pipeline for Auralis involves:

  1. Data Preparation: Curating and cleaning datasets of dialogues and instructions, augmented with synthetic examples.
  2. Pre-Training: Initial training on a general language corpus to establish base linguistic capabilities.
  3. Fine-Tuning: Task-specific tuning for instruction-following and dialogue intent extraction.
  4. Optimization: Applying quantization and pruning to reduce model size without sacrificing performance.
  5. Evaluation: Testing on benchmark tasks like intent recognition and response accuracy, with iterative refinement based on interpretability metrics.
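The optimization step above can be sketched in a few lines. This is a minimal illustration of magnitude pruning followed by symmetric int8 quantization; the weight matrix, shapes, and thresholds here are hypothetical stand-ins, since Auralis's actual parameters and compression settings are not specified in this document.

```python
import numpy as np

# Hypothetical weight matrix standing in for one Auralis layer.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.5, size=(64, 64)).astype(np.float32)

def prune(w, sparsity=0.5):
    """Magnitude pruning: zero out the smallest |w| entries."""
    threshold = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) < threshold, 0.0, w)

def quantize_int8(w):
    """Symmetric linear quantization to int8 with a per-tensor scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

pruned = prune(weights)
q, scale = quantize_int8(pruned)
dequant = q.astype(np.float32) * scale

# Half the entries are zeroed, and the round-trip error is bounded
# by half the quantization step (scale / 2).
print("sparsity:", float(np.mean(pruned == 0.0)))
print("max abs error:", float(np.abs(dequant - pruned).max()))
```

In practice the pruned, quantized weights would be re-evaluated on the benchmark tasks from step 5, and the sparsity and bit-width tightened only as far as accuracy allows.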

These methods ensure Auralis is both powerful and practical for real-world applications, from smart assistants to educational tools.

Training Nexora: Compact NLP for Edge Devices

Nexora, including its NeuroLite variants, is designed for real-time NLP on resource-constrained devices like microcontrollers. Our training methodologies, detailed across multiple papers, prioritize low-latency inference and stability.

Training Process

The Nexora training pipeline includes:

  1. Dataset Distillation: Creating small, high-quality datasets from larger corpora to reduce training overhead.
  2. Pre-Training: Building foundational language understanding on general text data.
  3. Knowledge Distillation: Transferring knowledge from larger models to compact architectures.
  4. Fine-Tuning with RLHF: Aligning the model's outputs with human preferences on target NLP tasks through reinforcement learning from human feedback.
  5. Optimization and Deployment: Applying compression techniques and testing on microcontrollers to ensure low-latency performance.
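The knowledge-distillation step can be illustrated with the standard temperature-softened KL objective. The logits and the four-class intent task below are hypothetical; this document does not specify which teacher models or loss variants Nexora actually uses.

```python
import numpy as np

def softmax(z, t=1.0):
    """Temperature-scaled, numerically stable softmax."""
    z = z / t
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, t=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by t^2 as in common distillation practice."""
    p = softmax(teacher_logits, t)   # soft teacher targets
    q = softmax(student_logits, t)   # student predictions
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return (t ** 2) * kl.mean()

# Hypothetical logits for a 4-class intent task.
teacher = np.array([[2.0, 0.5, -1.0, 0.1]])
student_good = np.array([[1.9, 0.4, -0.9, 0.0]])
student_bad = np.array([[-1.0, 2.0, 0.5, 0.1]])

# A student that mimics the teacher incurs a much lower loss.
print(distillation_loss(student_good, teacher))
print(distillation_loss(student_bad, teacher))
```

Minimizing this loss lets a compact student inherit the teacher's soft decision boundaries, which is what makes microcontroller-scale architectures viable for the tasks listed above.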

This approach allows Nexora to power NLP applications in environments with limited computational resources, such as IoT devices and embedded systems.

Building Peachi OS: A Lightweight AI Operating System

Peachi OS is a lightweight operating system designed for NLP workloads, emphasizing modularity and conversational intelligence. Our development process, as documented in our papers, reflects a commitment to open-source innovation.

Development Process

The creation of Peachi OS involved:

  1. Architecture Design: Defining a microkernel structure with modular components for NLP processing, memory management, and user interaction.
  2. Core Implementation: Building foundational systems, such as token scheduling and intent vector processing, using C and Assembly for performance.
  3. Integration of NLP Models: Embedding models like Auralis and Nexora to handle conversational tasks, with optimized interfaces for real-time dialogue.
  4. Testing and Refinement: Conducting rigorous testing on low-power hardware to ensure stability and efficiency, with iterative improvements based on performance metrics.
  5. Documentation and Release: Publishing detailed documentation and source code to engage the open-source community and support further development.
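The token-scheduling idea from step 2 can be sketched as a round-robin loop: each dialogue stream gets a fixed token budget per turn so no single stream monopolizes the NLP pipeline. Peachi OS implements its scheduler in C and Assembly; this Python sketch only illustrates the concept, and the stream names and budget are invented for the example.

```python
from collections import deque

def schedule(streams, budget=4):
    """Round-robin token scheduling.
    streams: dict mapping stream name -> list of pending tokens.
    Returns the order in which (stream, token) pairs are processed."""
    queue = deque((name, deque(tokens)) for name, tokens in streams.items())
    order = []
    while queue:
        name, pending = queue.popleft()
        # Consume up to `budget` tokens from this stream, then yield.
        for _ in range(min(budget, len(pending))):
            order.append((name, pending.popleft()))
        if pending:  # re-queue streams with work remaining
            queue.append((name, pending))
    return order

order = schedule({
    "voice": ["hello", "world", "how", "are", "you"],
    "chat": ["ping"],
})
print(order)
```

With a budget of 4, the "voice" stream yields after four tokens so "chat" is served before "voice" finishes, which is the fairness property a conversational kernel needs under concurrent dialogues.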

Peachi OS powers conversational AI applications, from chatbots to voice assistants, with a focus on accessibility and adaptability.

Our Commitment to Open-Source AI

At COREA Starstroupe, we believe in transparent, ethical, and accessible AI development. Our training and development processes for Auralis, Nexora, and Peachi OS reflect this mission, prioritizing efficiency, interpretability, and community collaboration. For more technical details, explore our research paper archive, where we share our methodologies and findings openly.

If you have questions or wish to contribute, contact us at contact@coreastarstroupe.org. Join us in advancing AI for the global good!