Tianlong Chen, PhD

Co-founder, CTO

tianlong.chen@daicelabs.com

Building Robust AI Systems

Tianlong Chen received his Ph.D. in Electrical and Computer Engineering from the University of Texas at Austin, TX, USA, in 2023. He is currently an Assistant Professor of Computer Science at The University of North Carolina at Chapel Hill. Before that, in 2023-2024, he was a Postdoctoral Researcher at the Massachusetts Institute of Technology (CSAIL@MIT), Harvard University (BMI@Harvard), and the Broad Institute of MIT & Harvard.


His research focuses on building accurate, trustworthy, and efficient machine learning systems. His work spans two key areas: machine learning fundamentals (composite AI architectures, sparsity, robustness, optimization, graph learning, and diffusion models) and interdisciplinary applications in bioengineering. His research has been recognized with multiple honors, including the IBM and Adobe Ph.D. Fellowships, the Graduate Dean's Prestigious Fellowship, the AdvML Rising Star Award, and the Best Paper Award at LoG 2022.


At Daice Labs, Tianlong leads the development of composite AI architectures inspired by cellular systems and nature's resilience principles. His expertise in machine learning fundamentals and interdisciplinary applications uniquely positions him to translate principles of adaptation and resilience into robust AI ecosystems. Under his leadership, Daice Labs is pioneering novel approaches to building adaptive architectures for key strategic domain applications.

Key publications

Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts. Chen§ et al., arXiv, 2024. Summary: Flex-MoE addresses the challenge of missing modalities in multimodal learning by combining a missing modality bank with a sparse Mixture-of-Experts framework, using generalized and specialized routers to handle arbitrary modality combinations in medical applications (§ supervised this work).
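
For a concrete picture of the routing idea, here is a minimal PyTorch sketch (not the Flex-MoE implementation): a learnable bank of embeddings stands in for absent modalities before features are routed through a top-k sparse Mixture-of-Experts layer. Names such as MissingModalityBank and SparseMoELayer are invented for illustration.

    # Illustrative sketch only, not the paper's code.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SparseMoELayer(nn.Module):
        """Top-k gated mixture of simple feed-forward experts."""
        def __init__(self, dim, num_experts=4, k=2):
            super().__init__()
            self.experts = nn.ModuleList(
                [nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
                 for _ in range(num_experts)]
            )
            self.router = nn.Linear(dim, num_experts)
            self.k = k

        def forward(self, x):                      # x: (batch, dim)
            logits = self.router(x)                # (batch, num_experts)
            weights, idx = torch.topk(logits, self.k, dim=-1)
            weights = F.softmax(weights, dim=-1)
            out = torch.zeros_like(x)
            for slot in range(self.k):             # dispatch each input to its k experts
                for e, expert in enumerate(self.experts):
                    mask = idx[:, slot] == e
                    if mask.any():
                        out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
            return out

    class MissingModalityBank(nn.Module):
        """Learnable placeholder embeddings, one per modality (hypothetical helper)."""
        def __init__(self, num_modalities, dim):
            super().__init__()
            self.bank = nn.Parameter(torch.randn(num_modalities, dim) * 0.02)

        def forward(self, feats, present):
            # feats: (batch, num_modalities, dim); present: (batch, num_modalities) bool
            fill = self.bank.unsqueeze(0).expand_as(feats)
            return torch.where(present.unsqueeze(-1), feats, fill)

    # Toy usage: 3 modalities, the third missing for every sample in the batch.
    dim, batch = 16, 8
    bank, moe = MissingModalityBank(3, dim), SparseMoELayer(dim)
    feats = torch.randn(batch, 3, dim)
    present = torch.tensor([[True, True, False]] * batch)
    fused = bank(feats, present).mean(dim=1)       # naive fusion, for the sketch only
    print(moe(fused).shape)                        # torch.Size([8, 16])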


Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild. Chen§ et al., arXiv, 2024. Summary: Model-GLUE presents a holistic guideline for scaling Large Language Models through optimal combination of existing aggregation techniques, achieving significant performance improvements by strategically selecting and integrating models from heterogeneous model zoos (§ supervised this work).


Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems. Chen§ et al., arXiv, 2024. Summary: AgentPrune is a cost-efficient framework for LLM-powered multi-agent systems that prunes redundant communications, reducing token usage by up to 72.8% while maintaining performance and enhancing security, all at just 13% of standard costs (§ supervised this work).
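
The core cost-saving idea can be illustrated with a small, self-contained Python sketch (not the AgentPrune implementation): score the communication channels between agents and keep only the most important ones, which directly reduces the number of messages, and hence tokens, exchanged. The importance scores and token estimate below are invented for illustration.

    # Illustrative sketch only, not the paper's code.
    from dataclasses import dataclass

    @dataclass
    class Edge:
        src: str
        dst: str
        importance: float   # assumed to come from a learned or profiled score

    def prune_communication(edges, keep_ratio=0.5):
        """Keep only the highest-importance fraction of agent-to-agent channels."""
        ranked = sorted(edges, key=lambda e: e.importance, reverse=True)
        return ranked[: max(1, int(len(ranked) * keep_ratio))]

    def estimate_token_savings(edges, kept, tokens_per_message=300):
        """Rough savings estimate under a fixed tokens-per-message assumption."""
        return 1.0 - (len(kept) * tokens_per_message) / (len(edges) * tokens_per_message)

    # Toy example with three agents and hand-picked importance scores.
    edges = [
        Edge("planner", "coder", 0.9),
        Edge("planner", "critic", 0.7),
        Edge("critic", "coder", 0.6),
        Edge("coder", "critic", 0.2),
        Edge("critic", "planner", 0.1),
        Edge("coder", "planner", 0.05),
    ]
    kept = prune_communication(edges, keep_ratio=0.5)
    print([f"{e.src}->{e.dst}" for e in kept])
    print(f"estimated token savings: {estimate_token_savings(edges, kept):.0%}")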


GTBENCH: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations. Chen§ et al., arXiv, 2024. Summary: GTBENCH introduces a comprehensive framework for evaluating LLMs' strategic reasoning through game-theoretic tasks, revealing performance patterns across different game types and models, with commercial LLMs generally outperforming open-source ones except for code-pretrained models like Llama-3-70b-Instruct (§ supervised this work).


MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts. Chen§ et al., arXiv, 2024. Summary: MoE-RBench introduces the first systematic evaluation of MoE model reliability in language models, demonstrating that with proper configuration, MoE architectures can exceed dense networks' reliability across safety, adversarial attacks, and out-of-distribution scenarios (§ supervised this work).


AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts. Chen et al., ICCV, 2023. Summary: AdaMV-MoE presents a novel multi-task learning approach that dynamically adjusts network capacity per task, replacing fixed expert allocation with adaptive determination of the number of activated experts, demonstrating superior performance in complex vision recognition tasks.


DLO: Dynamic Layer Operation for Efficient Vertical Scaling of LLMs. Chen§ et al., arXiv, 2024. Summary: DLO introduces vertical scaling for LLMs through dynamic layer management (expanding, activating, or skipping layers), offering performance comparable to larger models with improved efficiency during fine-tuning and eliminating the need for continual pre-training (§ supervised this work).
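
A minimal PyTorch sketch of the general idea (not the DLO implementation) is shown below: each block carries a small gate that decides how strongly it contributes, so a near-zero gate effectively skips the layer, and duplicated blocks provide the vertical expansion. Class names such as GatedBlock and DynamicDepthStack are invented for illustration.

    # Illustrative sketch only, not the paper's code.
    import torch
    import torch.nn as nn

    class GatedBlock(nn.Module):
        """A feed-forward block with a learned scalar gate controlling its contribution."""
        def __init__(self, dim):
            super().__init__()
            self.ff = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
            self.gate = nn.Linear(dim, 1)   # "run this layer?" score per input

        def forward(self, x):               # x: (batch, seq, dim)
            g = torch.sigmoid(self.gate(x.mean(dim=1, keepdim=True)))  # (batch, 1, 1)
            return x + g * self.ff(x)       # g near 0 effectively skips the block

    class DynamicDepthStack(nn.Module):
        """Vertical scaling by duplicating blocks, with gates deciding activation."""
        def __init__(self, dim, base_layers=4, expansion=2):
            super().__init__()
            self.blocks = nn.ModuleList(
                [GatedBlock(dim) for _ in range(base_layers * expansion)]
            )

        def forward(self, x):
            for block in self.blocks:
                x = block(x)
            return x

    # Toy usage: batch of 2 sequences, length 5, hidden size 16.
    model = DynamicDepthStack(dim=16)
    x = torch.randn(2, 5, 16)
    print(model(x).shape)   # torch.Size([2, 5, 16])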

