TLDR: Typedef has launched Project Fenic, an open-source, PySpark-inspired DataFrame framework designed to bring structure and reliability to unstructured data processing for Large Language Models (LLMs) and agentic AI applications. It offers semantic operators, efficient batch inference, and aims to significantly streamline AI workflows.
Typedef has officially introduced Project Fenic, an innovative open-source DataFrame framework poised to transform how Large Language Models (LLMs) and agentic AI applications manage and process data. Described as a ‘dataframe for LLMs’ or ‘dataframes for text data,’ Fenic aims to inject structure and determinism into the often chaotic world of unstructured data, a critical challenge in modern AI development.
Inspired by PySpark, Fenic provides a familiar DataFrame abstraction, enabling developers to apply well-understood data operations to complex AI workloads. Unlike traditional data tools that are retrofitted for LLMs, Fenic’s core query engine has been built from the ground up with inference in mind. This purpose-built design includes features like automatic batch optimization for API calls, integrated retry logic, rate limiting, and comprehensive token counting and cost tracking, ensuring efficiency and reliability in production environments.
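To make the batching, retry, and token-tracking ideas concrete, here is a minimal Python sketch. It is not Fenic's API or implementation: `run_batched`, `FlakyEchoModel`, and the whitespace-based token estimate are hypothetical stand-ins. They only illustrate how an inference-aware engine might group prompts into batches, retry transient failures with backoff, and keep running usage statistics.

```python
import time
from typing import Callable, Dict, List, Tuple


def run_batched(
    prompts: List[str],
    call_model: Callable[[List[str]], List[str]],
    batch_size: int = 4,
    max_retries: int = 3,
    base_delay: float = 0.0,
) -> Tuple[List[str], Dict[str, int]]:
    """Send prompts in batches, retrying failed batches with exponential backoff.

    Hypothetical helper for illustration; not part of any real library.
    """
    results: List[str] = []
    stats = {"calls": 0, "retries": 0, "tokens": 0}

    for start in range(0, len(prompts), batch_size):
        batch = prompts[start:start + batch_size]

        for attempt in range(max_retries + 1):
            try:
                stats["calls"] += 1
                outputs = call_model(batch)
                break
            except RuntimeError:
                if attempt == max_retries:
                    raise  # give up after the final attempt
                stats["retries"] += 1
                time.sleep(base_delay * (2 ** attempt))

        # Crude token estimate (assumption): count whitespace-separated words
        # in both the prompts and the outputs of this batch.
        stats["tokens"] += sum(len(p.split()) for p in batch)
        stats["tokens"] += sum(len(o.split()) for o in outputs)
        results.extend(outputs)

    return results, stats


class FlakyEchoModel:
    """Stub model: fails a fixed number of times, then upper-cases each prompt."""

    def __init__(self, fail_first: int = 1) -> None:
        self.remaining_failures = fail_first

    def __call__(self, batch: List[str]) -> List[str]:
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise RuntimeError("simulated transient error")
        return [p.upper() for p in batch]


if __name__ == "__main__":
    model = FlakyEchoModel(fail_first=1)
    outputs, usage = run_batched(["a b", "c", "d e f"], model, batch_size=2)
    print(outputs, usage)
```

A production engine would use a real tokenizer and provider-specific rate limits. The point of the sketch is only that these concerns sit in the execution layer rather than in each pipeline author's code.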
One of Fenic’s standout capabilities is its ‘semantic intelligence,’ which enhances DataFrame operations with semantic operators. It offers first-class support for various unstructured data types, including markdown and transcripts, alongside efficient batch inference across a wide array of model providers such as OpenAI, Anthropic, Google, and Cohere. This allows for seamless integration of structured and unstructured data pipelines, a significant leap forward in AI data infrastructure.
Industry leaders have lauded Fenic’s potential. Eric Dodds, Head of Product at RudderStack, stated, ‘Typedef transforms our OLAP warehouse into a dynamic product-signal engine by integrating LLM inference and agents. Previously, product managers spent weeks manually processing data for basic queries. Now, they can easily query and dive deep across diverse datasets, leveraging LLM categorizations and summarizations by feature, product group, or customer. This is 100x time savings and a game changer for us.’ Wes McKinney, creator of the Python pandas project, echoed this sentiment, remarking, ‘Typedef’s design of Fenic is a natural evolution of the DataFrame abstraction, bringing the same clarity and composability that made pandas indispensable, now applied to modern AI and unstructured data workloads. It’s exciting to see the DataFrame API extended into the AI era and love what typedef is building.’
Fenic also addresses the need for more predictable and responsive AI agents through its decoupled architecture. By offloading heavy inference tasks from the agent runtime, it optimizes resource utilization with batched LLM calls and establishes a cleaner separation between planning/orchestration and execution. This approach streamlines development and reduces the time and cost of building and deploying semantic extraction pipelines, while also minimizing Errors and Omissions (E&O) risk.
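The separation between batch inference and the agent runtime can be sketched in a few lines of Python. This is an illustrative assumption about the pattern, not Fenic's actual design: `batch_enrich`, `keyword_classifier`, and `Agent` are hypothetical names, and a keyword rule stands in for an LLM call. The heavy work happens once, ahead of time, and the agent only performs cheap lookups against the enriched table.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EnrichedRecord:
    doc_id: str
    category: str


def batch_enrich(
    docs: Dict[str, str],
    classify_batch: Callable[[List[str]], List[str]],
) -> Dict[str, EnrichedRecord]:
    """Offline stage: classify every document in one batched call."""
    ids = list(docs)
    labels = classify_batch([docs[i] for i in ids])
    return {i: EnrichedRecord(i, label) for i, label in zip(ids, labels)}


def keyword_classifier(texts: List[str]) -> List[str]:
    """Stand-in for a batched LLM classification call (assumption)."""
    return ["billing" if "invoice" in t.lower() else "other" for t in texts]


class Agent:
    """Runtime stage: answers questions from the precomputed table only."""

    def __init__(self, table: Dict[str, EnrichedRecord]) -> None:
        self.table = table

    def answer(self, category: str) -> List[str]:
        # No model calls here; just a deterministic lookup.
        return sorted(r.doc_id for r in self.table.values() if r.category == category)


if __name__ == "__main__":
    tickets = {"t1": "Invoice is wrong", "t2": "App crashes", "t3": "Missing invoice"}
    agent = Agent(batch_enrich(tickets, keyword_classifier))
    print(agent.answer("billing"))
```

Because the agent never calls the model directly in this arrangement, its latency and behavior depend only on the lookup, which is the kind of predictability the decoupled design is meant to provide.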
Mike Eastham, Former Founding Engineer & Chief Architect at Tecton, highlighted this aspect: ‘Having built data platforms at scale, I’m blown away by how typedef makes LLM inference feel like a first-class citizen in the data pipeline. It’s the first time I’ve seen unstructured AI workloads treated with the same rigor and simplicity as structured data.’ Chris Riccomini, Creator of SlateDB and Apache Airflow PMC, added, ‘Pretty excited about typedef and their new open source Fenic library. Fenic is dataframes for text data. Combined with typedef, you get AI pipelines as core primitives—like Airflow did for ETL orchestration. It’s built the way good infrastructure should be: composable, opinionated, and open.’
Project Fenic supports Python versions 3.10, 3.11, and 3.12, making it accessible to a broad developer community. Its open-source nature underscores Typedef’s commitment to fostering innovation in the AI ecosystem by providing foundational tools that unify inference, search, and analytics.


