Automatic data exploration and visualisation generation.
LIDA is an open-source Python library developed by Microsoft Research for automatically generating data visualizations and infographics from datasets. It uses large language models (LLMs) to interpret the data and the user's intent, automating the workflow from data summarization and goal-oriented chart creation through to the refinement and explanation of visualizations. This lowers the barrier to data exploration: users can move from raw data to charts and insights with little manual coding or design effort.
Key features: LIDA operates through a multi-stage pipeline. It starts with data summarization, producing a textual and statistical summary of the dataset. It then supports goal-driven visualization generation, letting users request charts with natural language queries (e.g., 'show sales trend by month'). The system can generate, refine, and optimize visualization code (e.g., in Matplotlib, Seaborn, or Altair) and produce textual explanations and alt-text for accessibility. It also includes capabilities for creating infographics and for keeping multiple related charts consistent.
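To make the summarization stage more concrete, here is a minimal sketch of the idea using only the Python standard library. It is not LIDA's implementation: the function names `summarize_csv` and `build_prompt`, the summary layout, and the sample data are all hypothetical, chosen only to show how a compact statistical summary could be combined with a natural language request before being sent to an LLM.

```python
import csv
import io
import statistics


def summarize_csv(text: str) -> dict:
    """Build a small per-column summary of CSV text (illustrative, not LIDA's code)."""
    rows = list(csv.DictReader(io.StringIO(text)))
    summary = {"row_count": len(rows), "columns": {}}
    if not rows:
        return summary
    for name in rows[0].keys():
        values = [row[name] for row in rows if row[name] not in ("", None)]
        try:
            numbers = [float(v) for v in values]
        except ValueError:
            numbers = None  # at least one value is not numeric
        if numbers:
            summary["columns"][name] = {
                "type": "number",
                "min": min(numbers),
                "max": max(numbers),
                "mean": statistics.fmean(numbers),
            }
        else:
            distinct = sorted(set(values))
            summary["columns"][name] = {
                "type": "category",
                "unique": len(distinct),
                "samples": distinct[:3],
            }
    return summary


def build_prompt(summary: dict, question: str) -> str:
    """Combine the summary with a user request into one prompt string (hypothetical format)."""
    lines = [f"Dataset with {summary['row_count']} rows."]
    for name, info in summary["columns"].items():
        lines.append(f"- {name}: {info}")
    lines.append(f"Request: {question}")
    return "\n".join(lines)


# Made-up sample data, used only to exercise the sketch.
SAMPLE = "month,region,sales\nJan,North,120\nFeb,North,150\nMar,South,90\n"

summary = summarize_csv(SAMPLE)
prompt = build_prompt(summary, "show sales trend by month")
print(prompt)
```

The point of the sketch is that the model receives a short description of the columns rather than the raw rows, which keeps the prompt small regardless of dataset size.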
What sets LIDA apart is its model-agnostic architecture, which can integrate with various LLM backends like OpenAI GPT, Azure OpenAI, or local models via Hugging Face, providing flexibility in deployment and cost. It is not a standalone application but a developer toolkit designed to be embedded into data science platforms, notebooks, or applications. Its open-source nature and focus on explainability, including the generation of visualization rationale and accessibility text, differentiate it from simpler chart auto-suggestion tools by offering a comprehensive, reasoning-based approach to visual data storytelling.
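A model-agnostic design of this kind can be sketched as a small interface that every backend satisfies. The example below is an assumption-laden illustration, not LIDA's actual classes: `TextGenerator`, `EchoBackend`, `UppercaseBackend`, and `ChartPipeline` are hypothetical names, and the backends are offline stand-ins that return canned strings instead of calling a real LLM.

```python
from dataclasses import dataclass
from typing import Protocol


class TextGenerator(Protocol):
    """Hypothetical interface: anything with generate(prompt) -> str qualifies."""

    def generate(self, prompt: str) -> str: ...


@dataclass
class EchoBackend:
    """Offline stand-in for a hosted LLM client; returns a canned snippet."""

    name: str = "echo"

    def generate(self, prompt: str) -> str:
        return f"# chart code for: {prompt}"


@dataclass
class UppercaseBackend:
    """Second stand-in, showing that backends can be swapped freely."""

    name: str = "upper"

    def generate(self, prompt: str) -> str:
        return f"# CHART CODE FOR: {prompt.upper()}"


class ChartPipeline:
    """Depends only on the TextGenerator interface, never on a concrete backend."""

    def __init__(self, backend: TextGenerator) -> None:
        self.backend = backend

    def visualize(self, goal: str) -> str:
        return self.backend.generate(f"Plot: {goal}")


# The same pipeline code runs unchanged with either backend.
for backend in (EchoBackend(), UppercaseBackend()):
    print(backend.name, "->", ChartPipeline(backend).visualize("sales by month"))
```

Because the pipeline only knows about the interface, switching between a hosted API and a local model becomes a one-line change at construction time.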
LIDA is aimed at data scientists, analysts, and developers who want to integrate automated visualization into their data platforms or streamline their exploratory data analysis (EDA) workflow. Specific use cases include rapidly prototyping dashboards, enhancing business intelligence tools with natural language interfaces, generating reports with consistent visual narratives, and improving data accessibility through automated alt-text. It is particularly valuable in industries like finance, research, and business analytics where speed and depth of insight are critical.
While the core library is free and open-source, using it with premium LLM APIs like OpenAI incurs the standard costs of those services. The library itself imposes no usage limits, but its capabilities are contingent on the power and associated costs of the chosen underlying language model.
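Since the running cost comes entirely from the chosen LLM, a rough budget can be worked out from token counts. The sketch below shows the arithmetic only. The `Pricing` class, the `estimate_cost` helper, and especially the per-token rates are placeholders invented for illustration, not real prices from any provider; substitute the current rates of whichever service you use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Price per 1,000 tokens, in whatever currency the provider bills in."""

    input_per_1k: float
    output_per_1k: float


def estimate_cost(pricing: Pricing, input_tokens: int, output_tokens: int, calls: int = 1) -> float:
    """Estimate total cost for `calls` requests of the given token sizes."""
    per_call = (
        (input_tokens / 1000) * pricing.input_per_1k
        + (output_tokens / 1000) * pricing.output_per_1k
    )
    return per_call * calls


# PLACEHOLDER rates, not real prices: replace with your provider's figures.
PLACEHOLDER = Pricing(input_per_1k=0.01, output_per_1k=0.03)

# Assumed workload: four LLM calls (for example, one per pipeline stage),
# each with a 2,000-token prompt and a 500-token response.
cost = estimate_cost(PLACEHOLDER, input_tokens=2000, output_tokens=500, calls=4)
print(f"Estimated cost: {cost:.2f}")
```

With a local model the per-token price drops to zero and the cost shifts to hardware and runtime instead.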