PentestGPT: An LLM-empowered Automatic Penetration Testing Tool

https://arxiv.org/abs/2308.06782

BOLABuster

https://www.youtube.com/watch?v=N46vMQ1YzAA

Teams of LLM Agents can Exploit Zero-Day Vulnerabilities

https://arxiv.org/abs/2406.01637

LLM Agents can Autonomously Exploit One-day Vulnerabilities

https://arxiv.org/pdf/2404.08144

DeepExploit

https://www.slideshare.net/slideshow/deep-exploitblack-hat-europe-2018-arsenal/125242556

https://github.com/13o-bbr-bbq/machine_learning_security/wiki

PentestGPT: Evaluating and Harnessing Large Language Models for Automated Penetration Testing


Tool Usage & Human-Agent Interaction

  • [2024/06/28] Designing and Evaluating Multi-Chatbot Interface for Human-AI Communication: Preliminary Findings from a Persuasion Task | [paper] | [code]

  • [2024/06/17] GUICourse: From General Vision Language Models to Versatile GUI Agents | [paper] | [code]

  • [2024/06/11] Towards Human-AI Collaboration in Healthcare: Guided Deferral Systems with Large Language Models | [paper] | [code]

  • [2024/06/06] Tool-Planner: Dynamic Solution Tree Planning for Large Language Model with Tool Clustering | [paper] | [code]

  • [2024/06/03] Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration | [paper] | [code]

  • [2024/06/02] Towards a copilot in BIM authoring tool using a large language model-based agent for intelligent human-machine interaction | [paper] | [code]

  • [2024/05/30] Large Language Models Can Self-Improve At Web Agent Tasks | [paper] | [code]

  • [2024/05/23] Human-Agent Cooperation in Games under Incomplete Information through Natural Language Communication | [paper] | [code]

  • [2024/05/17] Latent State Estimation Helps UI Agents to Reason | [paper] | [code]

  • [2024/05/02] CACTUS: Chemistry Agent Connecting Tool-Usage to Science | [paper] | [code]

  • [2024/05/01] Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning | [paper] | [code]

  • [2024/05/01] “Ask Me Anything”: How Comcast Uses LLMs to Assist Agents in Real Time | [paper] | [code]

  • [2024/04/23] Aligning LLM Agents by Learning Latent Preference from User Edits | [paper] | [code]

  • [2024/04/16] Search Beyond Queries: Training Smaller Language Models for Web Interactions via Reinforcement Learning | [paper] | [code]

  • [2024/04/09] SurveyAgent: A Conversational System for Personalized and Efficient Research Survey | [paper] | [code]

  • [2024/04/04] AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent | [paper] | [code]

  • [2024/03/12] AesopAgent: Agent-driven Evolutionary System on Story-to-Video Production | [paper] | [code]

  • [2024/03/05] InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents | [paper] | [code]

  • [2024/03/05] Android in the Zoo: Chain-of-Action-Thought for GUI Agents | [paper] | [code]

  • [2024/02/27] BASES: Large-scale Web Search User Simulation with Large Language Model based Agents | [paper] | [code]

  • [2024/02/26] Look Before You Leap: Towards Decision-Aware and Generalizable Tool-Usage for Large Language Models | [paper] | [code]

  • [2024/02/23] On the Multi-turn Instruction Following for Conversational Web Agents | [paper] | [code]

  • [2024/02/20] Large Language Model-based Human-Agent Collaboration for Complex Task Solving | [paper] | [code]

  • [2024/02/20] AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning | [paper] | [code]

  • [2024/02/18] SciAgent: Tool-augmented Language Models for Scientific Reasoning | [paper] | [code]

  • [2024/02/18] Shaping Human-AI Collaboration: Varied Scaffolding Levels in Co-writing with Language Models | [paper] | [code]

  • [2024/02/17] Human-AI Interactions in the Communication Era: Autophagy Makes Large Models Achieving Local Optima | [paper] | [code]

  • [2024/02/16] ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages | [paper] | [code]

  • [2024/02/14] Towards better Human-Agent Alignment: Assessing Task Utility in LLM-Powered Applications | [paper] | [code]

  • [2024/02/09] CoSearchAgent: A Lightweight Collaborative Search Agent with Large Language Models | [paper] | [code]

  • [2024/02/08] UFO: A UI-Focused Agent for Windows OS Interaction | [paper] | [code]

  • [2024/02/06] AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls | [paper] | [code]

  • [2024/01/11] EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction | [paper] | [code]

  • [2024/01/03] GPT-4V(ision) is a Generalist Web Agent, if Grounded | [paper] | [code]

  • [2023/12/21] Team Flow at DRC2023: Building Common Ground and Text-based Turn-taking in a Travel Agent Spoken Dialogue System | [paper] | [code]

  • [2023/12/21] AppAgent: Multimodal Agents as Smartphone Users | [paper] | [code]

  • [2023/12/18] CLOVA: A Closed-Loop Visual Assistant with Tool Usage and Update | [paper] | [code]

  • [2023/12/14] CogAgent: A Visual Language Model for GUI Agents | [paper] | [code]

  • [2023/11/19] TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems | [paper] | [code]

  • [2023/10/18] MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models | [paper] | [code]

  • [2023/10/13] AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems | [paper] | [code]

  • [2023/10/12] A Zero-Shot Language Agent for Computer Control with Structured Reflection | [paper] | [code]

  • [2023/09/02] ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models | [paper] | [code]

  • [2023/08/07] TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents | [paper] | [code]

  • [2023/06/05] When Large Language Model based Agent Meets User Behavior Analysis: A Novel User Simulation Paradigm | [paper] | [code]