AI Security & Threats
Research and reports on the misuse of AI and emerging security vulnerabilities.
YouTube
A collection of essential videos covering AI safety, ethical considerations, and the societal impact of artificial intelligence.
Geoffrey Hinton on AI Dangers
A seminal discussion on the existential risks posed by advanced artificial intelligence. (Source: The Diary of a CEO)
Mo Gawdat on AI Dystopia and Utopia
An outline of potential dystopian and utopian futures shaped by AI. (Source: The Diary of a CEO)
Dr. Roman Yampolskiy on AI Safety
An in-depth analysis of superintelligence risks and projected AI safety timelines. (Source: The Diary of a CEO)
What OpenAI Doesn’t Want You to Know
An investigative report on the ethical controversies associated with OpenAI. (Source: More Perfect Union)
The Chinese Room Is a Dishonest Argument
A philosophical examination of the Chinese Room Argument and artificial consciousness. (Source: Curt Jaimungal)
The Dark Side of AI Data Centers
An exposé on the environmental and societal impact of AI data infrastructure. (Source: Business Insider)
Courses
Free educational courses to build expertise in AI alignment and safety, based on 2025 recommendations.
AI Alignment (BlueDot Impact)
A foundational course on core AI safety concepts, evaluation, and structured debate.
AI For Everyone (Coursera)
A non-technical overview of AI capabilities, limitations, and ethical considerations.
Elements of AI (Univ. of Helsinki)
A broad introduction to AI, including key principles of ethics and alignment for a general audience.
Intro to ML Safety
A technical introduction to modern machine learning safety and alignment techniques.
AI Safety Syllabus
A comprehensive curriculum covering the AI alignment problem in depth.
Datasets & Reports
Open resources for AI safety research, including benchmarks and risk assessments from 2025.
2025 AI Index Report (Stanford HAI)
A comprehensive annual report on trends, risks, and progress in artificial intelligence.
2025 AI Safety Index (Future of Life)
An assessment of the safety and transparency efforts of leading AI development companies.
AI Security Institute (UK) Research
Official government research on frontier AI risks and model safety evaluations.
Anthropic's HH-RLHF Datasets
Datasets for training models to be helpful and harmless using reinforcement learning from human feedback.
OpenAI Safety Reports
Official documentation on safety evaluations and red teaming for models like o1.
SafeBench Competition
AI safety benchmarking competition for evaluating risks in advanced AI systems.
Organization
Our affiliations and collaborations within the AI safety community.
AI Safety Berlin
Collaborative community working on AI safety research and advocacy in Berlin.
Labonsky AI Research
Independent AI safety research initiative focusing on defensive security and AI alignment.
Mentors from AI Safety Berlin
Research
Our ongoing research projects exploring AI safety, security implications, and defensive strategies.
AI Weaponization Research: Project Green
1. Introduction
AI model capabilities are advancing rapidly, and defensive strategies need to keep pace with AI-augmented offensive capabilities.
2. Hypothesis
Current open-source, state-of-the-art LLMs such as gpt-oss:20b and gpt-oss:120b can autonomously conduct effective vulnerability assessments using security tools.
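To make the hypothesis concrete, the minimal sketch below shows the kind of tool call involved, assuming the Ollama Python client with tool-calling support and a locally pulled gpt-oss:20b model; the scan function and its arguments are illustrative placeholders, not part of our actual tooling.

```python
# Minimal sketch: offering a scan-launching function to a local gpt-oss model as a tool.
# Assumes the Ollama Python client (pip install ollama) with tool-calling support and a
# pulled gpt-oss:20b model; start_vulnerability_scan is an illustrative placeholder.
import ollama


def start_vulnerability_scan(target: str, scan_config: str) -> str:
    """Placeholder tool: queue a scan of `target` with the named scan configuration."""
    return f"Scan queued for {target} using config '{scan_config}'."


response = ollama.chat(
    model="gpt-oss:20b",
    messages=[{"role": "user", "content": "Assess host 192.0.2.10 for known vulnerabilities."}],
    tools=[start_vulnerability_scan],  # recent clients accept plain Python functions as tools
)

# If the model chooses to call the tool, execute it and print the result.
for call in response.message.tool_calls or []:
    if call.function.name == "start_vulnerability_scan":
        print(start_vulnerability_scan(**call.function.arguments))
```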
3. Background/Literature Review
LLMs have rapidly gained tool-calling capabilities and can now function as autonomous agents that coordinate multi-step cyberattacks without detailed human instruction; according to Carnegie Mellon University research, Claude Sonnet 4.5's success rate on the Cybench benchmark roughly doubled to nearly 70% within six months.
4. Research Questions
- How autonomously can models operate security tools?
- What financial resources are needed to use models this way?
5. Methodology
A local gpt-oss-120b model is deployed on multi-GPU hardware within an isolated LAB server environment. The model interacts with the OpenVAS vulnerability scanner through the GVM API, orchestrated via opencode. The setup enables autonomous scan selection, vulnerability assessment execution, result analysis, and operation chaining. Testing occurs in a controlled network with no external connectivity. Performance is evaluated by comparing LLM-guided scans against baseline traditional OpenVAS scans, measuring vulnerability detection accuracy, tool usage patterns, and decision quality.
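To show what the GVM interaction looks like in practice, here is a minimal sketch of the API calls our tooling would wrap, assuming the python-gvm client against a local gvmd socket; the socket path, credentials, and UUIDs are illustrative placeholders rather than the actual LAB configuration.

```python
# Minimal sketch of the GVM API calls the agent's tooling would wrap.
# Assumes python-gvm (pip install python-gvm) and a local gvmd Unix socket;
# socket path, credentials, and UUIDs are illustrative placeholders.
from gvm.connections import UnixSocketConnection
from gvm.protocols.gmp import Gmp
from gvm.transforms import EtreeTransform

connection = UnixSocketConnection(path="/run/gvmd/gvmd.sock")

with Gmp(connection, transform=EtreeTransform()) as gmp:
    gmp.authenticate("admin", "example-password")  # placeholder credentials

    # List existing scan tasks so the model can choose one to run.
    tasks = gmp.get_tasks()
    for task in tasks.findall("task"):
        print(task.get("id"), task.findtext("name"))

    # Start a chosen task, then later retrieve its report for the model to analyse.
    gmp.start_task("TASK-UUID")             # placeholder task ID
    report = gmp.get_report("REPORT-UUID")  # placeholder report ID
```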
6. Experiment Design
Specific test scenarios in isolated networks, baseline comparisons, and documentation of results.
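As an illustration of how the baseline comparison could be scored (a sketch of one possible metric, not a finalized evaluation protocol), findings from an LLM-guided scan can be compared against the traditional OpenVAS baseline by CVE identifier:

```python
# Illustrative scoring of an LLM-guided scan against a baseline OpenVAS scan,
# treating the baseline findings as ground truth; the CVE sets are placeholders.
def detection_metrics(llm_findings: set[str], baseline_findings: set[str]) -> dict[str, float]:
    true_positives = len(llm_findings & baseline_findings)
    precision = true_positives / len(llm_findings) if llm_findings else 0.0
    recall = true_positives / len(baseline_findings) if baseline_findings else 0.0
    return {"precision": precision, "recall": recall}


baseline = {"CVE-2021-44228", "CVE-2017-0144", "CVE-2014-0160"}  # example baseline findings
llm_guided = {"CVE-2021-44228", "CVE-2014-0160"}                 # example agent findings
print(detection_metrics(llm_guided, baseline))
```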
Infrastructure
Custom-built AI research infrastructure powering our safety research.
LAB Custom AI Server
High-performance workstation for AI model training and research.
Server Specifications
Motherboard: X870E AORUS PRO
CPU: AMD Ryzen 9 9950X3D (16 cores / 32 threads) @ 5.76 GHz
GPU 1: NVIDIA GeForce RTX 5090 [Discrete]
GPU 2: NVIDIA GeForce RTX 4090 [Discrete]
GPU 3: NVIDIA GeForce RTX 5060 Ti [Discrete]
Memory: 251.30 GiB
Disk: 3.64 TiB - btrfs
Unraid NAS Server
High-capacity storage and data management server for research datasets.
Server Specifications
Motherboard: B650 GAMING X AX V2
CPU: AMD Ryzen 7 9700X (8 cores)
GPU: AMD Radeon Graphics (Integrated)
Network: 2x Intel 82599ES 10-Gigabit Ethernet (SFP+)
Memory: 64 GiB
Storage: 2x Samsung 980 PRO 1TB NVMe, 3x Seagate 16TB HDD (48TB), Samsung 860 512GB SSD
Total Capacity: ~50TB
Software & Services
Essential software stack and cloud services powering our infrastructure.
CachyOS
High-performance Linux distribution optimized for speed and efficiency.
Ollama
Local LLM platform for running and managing large language models.
Gitea
Self-hosted Git service for version control and code collaboration.
Unraid
Flexible NAS operating system for storage, virtualization, and Docker.
OPNsense
Open-source firewall and routing platform for network security.
Tailscale
Zero-config VPN for secure remote access and mesh networking.
Claude
AI assistant by Anthropic for research, analysis, and development support.
Meet the Team
The dedicated team driving AI safety innovation at Labonsky AI Research.
Marcin Labonski
Founder
Directing the organization's research strategy in AI alignment and pioneering novel approaches to risk mitigation.
Meet the Research Assistants
Our highly valued team members, providing moral support and expert-level napping.
Lia
Chief Morale Officer
Levi
Lead Sleep Analyst
Lutka
Head of Box Fort Architecture
Rengar
Junior Pounce Engineer
Lilith
Feline Language Model