The AI Fingerprint: Deciphering the MLOps Attack Surface.
- Evo-user
How a routine security assessment unexpectedly revealed an exposed AI factory — and what that means for modern attack surfaces.
The Unexpected Discovery
I started this engagement the same way most external assessments begin: a standard scope, no unusual architecture briefing, and a routine port scan. What I expected was a familiar spread of web services and perhaps a database or two.
Then the scan results arrived. The port cluster on the target did not match any standard web application profile. It pointed toward something more specialized — a connected Machine Learning Operations (MLOps) environment powering an AI-based application.
That shift is what inspired this article. The goal is not to discuss a specific client environment, but to surface a pattern that security professionals are increasingly likely to encounter: the exposed AI fingerprint. All findings described here have been generalized and anonymized in compliance with my engagement NDA.
Recognizing the MLOps Stack Pattern
When a port scan during a routine Vulnerability Assessment & Penetration Testing (VAPT) engagement reveals the following cluster of ports and services on a single host or group of hosts, it should immediately suggest one thing: a machine learning pipeline.
| Port(s) | Layer | Likely Technology |
| --- | --- | --- |
| 5432/tcp | Metadata | PostgreSQL |
| 9000, 9080/tcp | Storage | MinIO Object Storage |
| 8000, 5555/tcp | Logic / Inference | Uvicorn/FastAPI + Tornado/Celery |
| 3000, 3001, 3003/tcp | Orchestration / UI | Grafana, Streamlit, Gradio |
Seen individually, these are common services. Seen together, they form an unmistakable MLOps stack signature: the operational brain behind an AI-embedded business process.
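To make the "seen together" point concrete, here is a minimal sketch of matching a parsed port list against the cluster above. The port-to-layer mapping mirrors the table; the thresholds (three matched ports across at least two layers) are my own illustrative assumption, not a formal rule.

```python
# Illustrative signature drawn from the table above; not exhaustive.
MLOPS_SIGNATURE = {
    5432: "Metadata",
    9000: "Storage",
    9080: "Storage",
    8000: "Logic/Inference",
    5555: "Logic/Inference",
    3000: "Orchestration/UI",
    3001: "Orchestration/UI",
    3003: "Orchestration/UI",
}

def classify_host(open_ports, min_hits=3, min_layers=2):
    """Flag a host as a likely MLOps stack when several signature
    ports spanning multiple layers are open together."""
    hits = {p: MLOPS_SIGNATURE[p] for p in open_ports if p in MLOPS_SIGNATURE}
    layers = set(hits.values())
    return {
        "matched_ports": hits,
        "likely_mlops_stack": len(hits) >= min_hits and len(layers) >= min_layers,
    }
```

For example, `classify_host([22, 443, 5432, 9000, 8000])` flags the host, while the same ports seen in isolation would not trigger anything.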
Deconstructing the components
The way I like to understand any complex architecture or concept is to break it into simpler components and compare them to a known analogy. Here, the analogy is a factory: think of this MLOps stack as a smart manufacturing plant. Each service plays a distinct role, and like any factory, compromising one department rarely stays contained.
The Metadata Layer - PostgreSQL (port 5432)
PostgreSQL is the factory's filing cabinet. It stores operational records: experiment logs, model versioning history, user metadata, and often the connection strings and credentials needed to reach every other service in the stack.
For an attacker, this is frequently the most valuable first target. If the filing cabinet holds keys to other rooms, owning it is the entry point to the entire pipeline.
The Storage Layer - MinIO (9000/9080)
MinIO is the factory's warehouse. AI models are large files — they are not stored in code repositories. They live in object storage alongside training datasets, model weights (.pkl, .onnx, .bin), and versioned artifacts.
If an attacker gains write access to the warehouse, they may never need to attack the application directly. They can simply tamper with what is stored, waiting for downstream services to consume it.
The Logic Layer — Uvicorn/FastAPI and Tornado/Celery (8000/5555)
These services are the assembly line and the night shift. Uvicorn/FastAPI (port 8000) is typically the live production line — the inference API that receives input and returns predictions. Tornado/Celery (port 5555) operates in the background, handling training runs, data preprocessing, and scheduled task execution.
One handles customer-facing production; the other quietly keeps the machinery running after hours.
The Orchestration Layer — Dashboards (3000–3003)
These interfaces are the factory's glass-walled control room. Grafana, Streamlit, and Gradio dashboards are built for developer convenience, not public exposure. They routinely surface internal file paths, service names, environment variables, GPU metrics, and data samples.
If that control room faces the street, a passerby can learn the entire factory layout — without ever stepping inside.
More services and protocol references
The following are additional services and protocol references. I collect these as I, too, am still in the learning phase.
| Component | Default Port(s) | Protocol(s) |
| --- | --- | --- |
| TensorFlow Serving (Google's model serving framework for TensorFlow models) | 8500, 8501 | gRPC, HTTP |
| Ollama (local runtime for running LLMs on local hardware) | 11434 | HTTP |
| Ray (distributed compute framework for scaling AI workloads) | 8000, 8265 | HTTP |
| NVIDIA Triton (inference server that loads models and serves predictions) | 8000, 8001, 8002 | HTTP, gRPC, Prometheus |
| Weaviate (vector database with built-in GraphQL and module system) | 8080 | HTTP, GraphQL |
This is not a comprehensive list, as it can vary based on the type of application design and architecture. I suggest readers conduct more detailed service fingerprinting using various techniques. Shodan and Censys would definitely be beneficial for this purpose.
Another method familiar to most penetration testers and security professionals involves intercepting HTTP headers to obtain service banners. This is one of the quickest ways to identify and fingerprint an AI stack. We also often use the approach of allowing the application to reveal information through error messages.
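As a rough illustration of the header-banner approach, the sketch below maps captured HTTP response headers to likely stack components. The keyword table is my own assumption based on common default `Server` banners; in practice you would feed in headers captured with a proxy or a simple HEAD request against each open port.

```python
# Assumed banner keywords -> component guesses (not authoritative).
BANNER_HINTS = {
    "uvicorn": "Uvicorn/FastAPI inference API",
    "minio": "MinIO object storage",
    "grafana": "Grafana dashboard",
    "tornado": "Tornado (e.g. Celery's Flower UI)",
    "streamlit": "Streamlit app",
}

def fingerprint_headers(headers):
    """Return component guesses from a dict of HTTP response headers."""
    blob = " ".join(f"{k}: {v}" for k, v in headers.items()).lower()
    return sorted({name for key, name in BANNER_HINTS.items() if key in blob})
```

Even a single `Server: uvicorn` header, or a stray `X-Minio-Deployment-Id`, is often enough to confirm what the port cluster already suggested.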
Following a detailed fingerprinting process, the next step involves reconnaissance, where we actively attempt to enumerate potential file and folder paths and structures. Due to an NDA, I can't share more specifics, but I recommend familiarizing yourself with the technological stacks and their default path structures. Although it can be time-consuming, it's definitely worth exploring.
If you are getting your feet wet in this domain, just like me, then this article by IBM X-Force can be helpful.
Mapping the attack surface
It is worth noting that these services are interconnected. Viewed atomically, they can seem a little odd at first glance; I initially saw them that way myself, but digging deeper made the picture clearer. Think of them as dominoes inside the factory: an attacker does not need to breach every machine, only to tip the first piece and let the dependencies do the rest.
Attack Chain 1 - Pivoting the database
Imagine opening the filing cabinet and finding the warehouse key taped to the back of a drawer. In practice, compromised PostgreSQL instances frequently contain MinIO connection strings, bucket credentials, or internal hostnames stored in plaintext within experiment logs or configuration tables.
What begins as a "database exposure" quickly becomes a full pipeline compromise.
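One quick way to triage a dumped metadata database for this kind of pivot material is to grep its rows and logs for URI-style connection strings. The regex below is a rough sketch covering a few common schemes; real engagements would need a broader pattern set.

```python
import re

# Sketch: scan dumped rows/logs for plaintext connection strings of
# the kind that turn a "database exposure" into a pipeline pivot.
CONN_STRING_RE = re.compile(
    r"\b(?:postgres(?:ql)?|mysql|s3|amqp|redis)://\S+", re.I
)

def find_connection_strings(text):
    """Return every URI-style connection string found in the text."""
    return [m.group(0) for m in CONN_STRING_RE.finditer(text)]
```

Running this over exported experiment-log tables frequently surfaces exactly the warehouse keys described above.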
Attack Chain 2 - Model Poisoning
This is the attack that makes AI infrastructure fundamentally different from a standard data breach scenario. If an attacker gains write access to MinIO, they can silently replace a production model file with a backdoored version. When the inference API hot-reloads the model — a common DevOps practice — the brain of the application is now under attacker control.
In terms of our factory analogy, this looks like a saboteur sneaking into the warehouse overnight and swapping the blueprint used by the assembly line. The factory opens in the morning and runs perfectly, but it is now producing compromised output with no alarms triggered.
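To make the hot-reload risk concrete, here is a deliberately harmless sketch of why pickled model artifacts are dangerous: unpickling can execute an arbitrary callable chosen by whoever wrote the file. The stand-in payload merely evaluates an expression; a real payload could run any system command the service user is allowed to run.

```python
import pickle

class PoisonedArtifact:
    # __reduce__ tells pickle to call an arbitrary function on load.
    # eval("6 * 7") is a harmless stand-in for attacker-chosen code.
    def __reduce__(self):
        return (eval, ("6 * 7",))

payload = pickle.dumps(PoisonedArtifact())

# Any service that blindly pickle.load()s a fetched artifact runs
# the embedded call the moment the "model" is deserialized:
result = pickle.loads(payload)
print(result)  # the attacker's expression ran: 42
```

This is why untrusted `.pkl` files should never be loaded, and why formats designed to carry only data (ONNX, safetensors) are generally preferred for production artifacts.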

Attack Chain 3 - Resource Hijacking
AI servers run on GPU-backed infrastructure, which makes them attractive not just for data theft, but for compute theft. An exposed ML stack is a high-performance machine left unattended — attackers actively exploit open AI infrastructure for cryptomining and unauthorized model training.
Attack Chain 4 - Information Disclosure
The 3000-series dashboards often create the easiest reconnaissance path. Without any authentication bypass, internal naming conventions, infrastructure layout, environment variable dumps, and operational clues can dramatically reduce the effort required for the next stage of an attack.

For the Penetration Tester
FIELD GUIDANCE - When you see this port cluster, shift your mental model from "individual service testing" to "pipeline chain analysis." Each finding is a stepping stone, not a standalone issue. Ensure you do thorough initial recon and service fingerprinting to get an accurate view of what you are testing.
Think in chains, not CVEs. Report a weak password on port 5432 as the entry point to model poisoning — not just a database finding. Severity escalates when you show the full chain.
Check default credentials immediately. MLOps tools are frequently deployed via "Quick Start" guides. Test minioadmin:minioadmin on MinIO and admin:admin on Grafana before anything else.
Test for insecure deserialization. If you can write to MinIO, craft a malicious .pkl payload and observe whether the inference API on port 8000 loads it. Python pickle deserialization is a well-documented, high-severity attack vector in ML environments.
Screenshot the dashboards. Even without authentication bypass, information leaked from Grafana or Streamlit panels provides significant reconnaissance value for your report.
For the AI Developer and DevOps Team
What I have consistently observed during security assessments in the web and network domains applies here as well: improper network segmentation, default credentials, outdated software components, and a lack of understanding of the threat landscape. These are crucial considerations when designing a secure system. They are not optional.
Network segmentation is non-negotiable. There is virtually no legitimate reason for a PostgreSQL database (5432) or a Celery task queue (5555) to be reachable from the public internet. Place these behind a VPN or Zero Trust proxy. Attackers can easily locate your infrastructure.
Implement proper secret management. MinIO credentials and database connection strings must not live in plaintext inside application code, logged environment variables, or dashboard configurations. Use a dedicated secrets manager.
Treat every inference API request as hostile. Port 8000 is a web endpoint. Input validation, rate limiting, authentication, and output sanitization apply here exactly as they would on any other API.
Enable model integrity checks. Implement cryptographic hash verification for model artifacts loaded from object storage. If the hash does not match what was deployed, fail loudly and alert immediately.
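The integrity check above can be sketched in a few lines, assuming the deploy-time SHA-256 of each artifact is recorded somewhere trustworthy (a signed manifest or a release database; those names are my own placeholders for whatever your pipeline uses).

```python
import hashlib

def sha256_file(path, chunk_size=1 << 20):
    """Stream the file so large model artifacts don't fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            h.update(block)
    return h.hexdigest()

def verify_model_artifact(path, expected_sha256):
    """Fail loudly if the artifact fetched from object storage does
    not match the hash recorded at deployment time."""
    actual = sha256_file(path)
    if actual != expected_sha256:
        raise RuntimeError(
            f"model integrity check failed for {path}: "
            f"expected {expected_sha256}, got {actual}"
        )
    return path  # safe to hand to the model loader
```

Wiring this in front of every hot-reload turns the silent blueprint swap from Attack Chain 2 into an immediate, alertable failure.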
Closing thoughts
The boundary between web security and AI security is dissolving. As ML pipelines move from research environments into production infrastructure, they carry all the misconfiguration and exposure risks of any distributed system — but with unique, high-impact attack surfaces like model poisoning that traditional security tooling was not designed to catch.
The next time your port scan looks unusual, lean in. You might be looking at an AI factory with its doors wide open.