google cloud platform - GKE pod with Filestore RWX volume takes 30 minutes to start: Error syncing pod, skipping

I have a GKE pod mounted with an RWX Filestore volume. Below are my StorageClass, PV, and PVC configs.

GKE Version - 1.30.9-gke.1127000

Every pod that uses this multishare volume takes about 30 minutes to start, and in the kubelet events I see the following error:

Error syncing pod, skipping" err="unmounted volumes=[filestore-rwx-volume], unattached volumes=[], failed to process volumes=[]: context deadline exceeded"

I have verified connectivity from both the node and the pod to the Filestore instance on port 2049, and it works fine. The nodes are also healthy.

allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
    k8s-app: gcp-filestore-csi-driver
  name: rwx-sc
parameters:
  instance-storageclass-label: rwx
  multishare: "true"
  network: prvpc
  tier: enterprise
provisioner: filestore.csi.storage.gke.io
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: filestore.csi.storage.gke.io
    volume.kubernetes.io/provisioner-deletion-secret-name: ""
    volume.kubernetes.io/provisioner-deletion-secret-namespace: ""
  creationTimestamp: "2025-01-26T17:07:36Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: pv-pr
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 60Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: commerce-prodlive-assets-pvc
    namespace: commerce
    resourceVersion: "3125440"
    uid: 333cea1a-b49160c4d6e8
  csi:
    driver: filestore.csi.storage.gke.io
    volumeAttributes:
      ip: 10.xx.xx.x
      max-share-size: "1099511627776"
      storage.kubernetes.io/csiProvisionerIdentity: 123312-63xxx19-filestore.csi.storage.gke.io
      supportLockRelease: "true"
    volumeHandle: modeMultishare/enterprise-multishare-rwx-/test-k8s/europe-west1/fs-id/pv-pr
  persistentVolumeReclaimPolicy: Retain
  storageClassName: enterprise-multishare-rwx-custom
  volumeMode: Filesystem

We have no idea why this Filestore volume mount takes so long; if I attach a different (non-Filestore) volume, the pod starts fine.

Below is the log output from one filestore-node pod; it is the same for all three.

kubectl logs -f filestore-node-dsdsd -n kube-system
Defaulted container "csi-driver-registrar" out of: csi-driver-registrar, gcp-filestore-driver, nfs-services, filestorecsi-metrics-collector
I0309 17:00:06.274180       1 main.go:135] Version: v2.9.4-gke.27-0-gf3945690
I0309 17:00:06.274296       1 main.go:136] Running node-driver-registrar in mode=
I0309 17:00:06.274304       1 main.go:157] Attempting to open a gRPC connection with: "/csi/csi.sock"
I0309 17:00:06.274893       1 connection.go:214] Connecting to unix:///csi/csi.sock
I0309 17:00:11.037725       1 main.go:164] Calling CSI driver to discover driver name
I0309 17:00:11.037755       1 connection.go:243] GRPC call: /csi.v1.Identity/GetPluginInfo
I0309 17:00:11.037762       1 connection.go:244] GRPC request: {}
I0309 17:00:11.041351       1 connection.go:250] GRPC response: {"name":"filestore.csi.storage.gke.io","vendor_version":"v1.6.17-gke.15"}
I0309 17:00:11.041365       1 connection.go:251] GRPC error: <nil>
I0309 17:00:11.041374       1 main.go:173] CSI driver name: "filestore.csi.storage.gke.io"
I0309 17:00:11.041408       1 node_register.go:55] Starting Registration Server at: /registration/filestore.csi.storage.gke.io-reg.sock
I0309 17:00:11.072876       1 node_register.go:64] Registration Server started at: /registration/filestore.csi.storage.gke.io-reg.sock
I0309 17:00:11.072997       1 node_register.go:88] Skipping HTTP server because endpoint is set to: ""
I0309 17:00:11.636522       1 main.go:90] Received GetInfo call: &InfoRequest{}
I0309 17:00:11.780566       1 main.go:101] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}

asked Mar 10 at 13:55 by saurabh umathe; edited Mar 11 at 19:14
  • Please try the following first steps to provide more details, as this is a very specific issue. 1. During the startup delay, check the status/logs of the Filestore DaemonSet. Use this filter for logs: resource.type="k8s_container" resource.labels.location="LOC" resource.labels.cluster_name="NAME" labels.k8s-pod/k8s-app="gcp-filestore-csi-driver" severity>="ERROR". 2. During startup, check the status/errors of the PV and PVC via kubectl describe. 3. Try mounting your Filestore instance on a node manually to determine whether it's a Filestore issue or a CSI driver issue. 4. What is the GKE version? – mikalai, Mar 11 at 17:24
  • Hi @mikalai, I have attached the filestore pod log output; also, no events are logged on the PV and PVC. GKE version: 1.30.9-gke.1127000 – saurabh umathe, Mar 11 at 19:15
  • Could you please provide the other requested details so that I can see the whole picture? – mikalai, Mar 14 at 14:08

1 Answer


The delay you're seeing during mounting is expected behavior when mounting a Filestore volume in GKE: the pod.spec.securityContext.fsGroup setting causes the kubelet to run chown and chmod on every file in the volumes mounted for a given pod. As the Kubernetes documentation states:

By default, Kubernetes recursively changes ownership and permissions for the contents of each volume to match the fsGroup specified in a Pod's securityContext when that volume is mounted.

Checking and changing ownership and permissions is time-consuming, especially for large volumes with many files, and it slows pod startup. To resolve this, set fsGroupChangePolicy: OnRootMismatch in the pod's securityContext. With this policy, Kubernetes skips the recursive walk whenever the ownership and permissions of the volume root already match the fsGroup.
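As an illustrative sketch (the pod name, image, and fsGroup value are placeholders; the claim name is taken from the PV's claimRef above), the policy goes alongside fsGroup in the pod's securityContext:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-filestore        # placeholder name
  namespace: commerce
spec:
  securityContext:
    fsGroup: 2000                 # example group ID
    # Only chown/chmod the volume recursively if the volume root's
    # ownership/permissions don't already match fsGroup; otherwise
    # skip the expensive recursive walk at mount time.
    fsGroupChangePolicy: "OnRootMismatch"
  containers:
  - name: app
    image: nginx                  # placeholder image
    volumeMounts:
    - name: filestore-rwx-volume
      mountPath: /data
  volumes:
  - name: filestore-rwx-volume
    persistentVolumeClaim:
      claimName: commerce-prodlive-assets-pvc
```

After the first successful mount applies the ownership change, subsequent pod starts should skip the recursive chown/chmod entirely, which is where the 30-minute delay comes from on large volumes.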
