Tuesday, 2021-08-03

airship-irc-bot<raliev> @sidney.shiba unfortunately, it's not possible to render the cert-manager YAMLs using the default clusterctl binary. However, I've made a few changes and compiled one for you that prints all the cert-manager objects that are going to be installed (except CRDs). You can try it by using my clusterctl image; in that case you have to modify the following line: https://github.com/airshipit/airshipctl/blob/master/manifests/phases/executors.yaml#L54 01:57
airship-irc-bot<raliev> in this way: `image: quay.io/raliev12/clusterctl:latest` 01:57
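For anyone following along, the change being described is just swapping the executor's `image:` value. A rough sketch against an abbreviated executor document (the surrounding kind/metadata fields are assumptions, not copied from the repo; only the `image:` value comes from the chat):

```yaml
# manifests/phases/executors.yaml (abbreviated sketch; kind/metadata
# are assumptions -- only the image value is from the chat)
apiVersion: airshipit.org/v1alpha1
kind: Clusterctl
metadata:
  name: clusterctl_init
image: quay.io/raliev12/clusterctl:latest   # replaces the stock clusterctl image
```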
airship-irc-bot<sidney.shiba> @raliev Thanks for creating this image. Unfortunately, the image complains that it was built for v1alpha3 while v1alpha4 is needed. See error below: 14:47
```
$ airshipctl phase run clusterctl-init-target --debug
[airshipctl] 2021/08/03 14:40:53 opendev.org/airship/airshipctl/pkg/phase/executors/clusterctl.go:123: Building cluster-api provider component documents from kustomize path at '/home/capz/projects/airshipctl/manifests/function/capz/v0.5.1'
[airshipctl] 2021/08/03 14:40:53 opendev.org/airship/airshipctl/pkg/phase/executors/clusterctl.go:123: Building cluster-api provider component documents from kustomize path at '/home/capz/projects/airshipctl/manifests/function/cabpk/v0.4.0'
[airshipctl] 2021/08/03 14:40:54 opendev.org/airship/airshipctl/pkg/phase/executors/clusterctl.go:123: Building cluster-api provider component documents from kustomize path at '/home/capz/projects/airshipctl/manifests/function/capi/v0.4.0'
[airshipctl] 2021/08/03 14:40:54 opendev.org/airship/airshipctl/pkg/phase/executors/clusterctl.go:123: Building cluster-api provider component documents from kustomize path at '/home/capz/projects/airshipctl/manifests/function/cacpk/v0.4.0'
{"Message":"starting clusterctl init executor","Operation":"ClusterctlInitStart","Timestamp":"2021-08-03T14:40:54.85273306Z","Type":"ClusterctlEvent"}
[airshipctl] 2021/08/03 14:40:54 opendev.org/airship/airshipctl/pkg/k8s/kubeconfig/builder.go:257: Received error when extracting context, ignoring kubeconfig. Error: failed merging kubeconfig: source context 'target-cluster' does not exist in source kubeconfig
[airshipctl] 2021/08/03 14:40:54 opendev.org/airship/airshipctl/pkg/k8s/kubeconfig/builder.go:167: Merging kubecontext for cluster 'target-cluster', into site kubeconfig
#clusterctl -v5 init --kubeconfig /home/capz/.airship/kubeconfig-962183120 --kubeconfig-context target-cluster --bootstrap=kubeadm:v0.4.0 --control-plane=kubeadm:v0.4.0 --infrastructure=azure:v0.5.1 --core=cluster-api:v0.4.0
Using configuration File="/workdir/.cluster-api/clusterctl.yaml"
Installing the clusterctl inventory CRD
Creating CustomResourceDefinition="providers.clusterctl.cluster.x-k8s.io"
Fetching providers
Using Override="core-components.yaml" Provider="cluster-api" Version="v0.4.0"
Using Override="bootstrap-components.yaml" Provider="bootstrap-kubeadm" Version="v0.4.0"
Using Override="control-plane-components.yaml" Provider="control-plane-kubeadm" Version="v0.4.0"
Using Override="infrastructure-components.yaml" Provider="infrastructure-azure" Version="v0.5.1"
Using Override="metadata.yaml" Provider="cluster-api" Version="v0.4.0"
Error: current version of clusterctl is only compatible with v1alpha3 providers, detected v1alpha4 for provider cluster-api
```
airship-irc-bot<raliev> all right, I'll create a new one with clusterctl version 0.4.0 instead of 0.3.22 14:48
airship-irc-bot<raliev> @sidney.shiba I've rebuilt the image, please use this tag: `quay.io/raliev12/clusterctl@sha256:c1b5c1fad230b47c9fbcf56231824557989db6f6606ec59201cb6f4e6f23f0a4` 15:11
airship-irc-bot<aodinokov> This is urgent. Siraj has created this: https://review.opendev.org/c/airship/treasuremap/+/803212 - it fixes the treasuremap builds (right now they are red). Drew has already +2'ed it; looking for a second core reviewer. @mattmceuen / @sean.eagan / @mb551n please look into it when you have time. 15:48
airship-irc-bot<james.gu> I am seeing an error during the phase `kubectl-wait-tigera-target` with the following output: 15:56
```
+ echo 'Wait for Calico to be deployed using tigera'
Wait for Calico to be deployed using tigera
+ kubectl --kubeconfig /kubeconfig --context target-cluster wait --all-namespaces '--for=condition=Ready' pods --all '--timeout=2000s'
The connection to the server 10.23.25.102:6443 was refused - did you specify the right host or port?
(the line above repeats about ten more times)
exit status 1
[airshipctl] 2021/08/03 15:25:23 opendev.org/airship/airshipctl/pkg/events/processor.go:59: Received error on event channel {yaml: line 50: could not find expected ':'}
Error events received on channel, errors are: [yaml: line 50: could not find expected ':']
```
airship-irc-bot<james.gu> I can't find anywhere in our code that generates that error message. I am curious whether anyone has seen this or a similar error? 15:57
airship-irc-bot<james.gu> If I run that phase manually, it succeeds. So it seems to be a transient issue. 15:58
airship-irc-bot<james.gu> The deployment script always runs the phase plan with the --debug option. Are there other logging settings I can use to force it to print out the YAML content that triggered the complaint about the missing ':'? 16:00
airship-irc-bot<sidney.shiba> The first thing I noticed in the `cert-manager` manifest from `clusterctl-init-target` is the `clusterctl` labels, even on the `Namespace`, which are not in the `cert-manager` distribution. Maybe `CAPZ` relies on those labels. Will continue investigating, but this is a good start. Thanks @raliev for creating this image. 16:23
```
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    cert-manager.clusterctl.cluster.x-k8s.io/version: v1.1.0
  labels:
    clusterctl.cluster.x-k8s.io: ""
    clusterctl.cluster.x-k8s.io/core: cert-manager
  name: cert-manager
```
airship-irc-bot<awander> @sidney.shiba Those labels are for lifecycle mgmt tasks -- e.g. upgrading cert-manager, moving Certificates automatically, etc. Letting clusterctl install cert-manager is the simplest way to go, but I guess we have reasons for doing our own install. 20:35
airship-irc-bot<awander> Can you check if certs are being created: 20:36
```
$ kc get certificate -A
NAMESPACE                           NAME                                      READY   SECRET                                            AGE
capd-system                         capd-serving-cert                         True    capd-webhook-service-cert                         5d4h
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-serving-cert       True    capi-kubeadm-bootstrap-webhook-service-cert       5d4h
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-serving-cert   True    capi-kubeadm-control-plane-webhook-service-cert   5d4h
capi-system                         capi-serving-cert                         True    capi-webhook-service-cert                         5d4h
```
airship-irc-bot<awander> Additionally, if we can build our own clusterctl (@raliev), then I would recommend printing any errors returned here. It's likely something is going wrong there. 20:37
airship-irc-bot<awander> You can also try this and see if it produces any errors: `kubectl apply -f cmd/clusterctl/config/assets/cert-manager-test-resources.yaml` This is how clusterctl checks whether cert-manager is already installed. You'll need to clone the capi repo to get that file. 20:41
airship-irc-bot<sidney.shiba> The certificates have been created: 20:49
```
$ kubectl --context target-cluster get certificate -A
NAMESPACE                           NAME                                      READY   SECRET                                            AGE
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-serving-cert       True    capi-kubeadm-bootstrap-webhook-service-cert       60m
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-serving-cert   True    capi-kubeadm-control-plane-webhook-service-cert   60m
capi-system                         capi-serving-cert                         True    capi-webhook-service-cert                         60m
capz-system                         capz-serving-cert                         True    capz-webhook-service-cert                         60m
```
airship-irc-bot<sidney.shiba> After the `clusterctl-move`, when I describe the `machine` or `cluster`, they show not ready. See describe `cluster` below: 20:51
```
Status:
  Conditions:
    Last Transition Time:  2021-08-03T19:50:03Z
    Reason:                WaitingForControlPlane
    Severity:              Info
    Status:                False
    Type:                  Ready
    Last Transition Time:  2021-08-03T19:50:03Z
    Message:               Waiting for control plane provider to indicate the control plane has been initialized
    Reason:                WaitingForControlPlaneProviderInitialized
    Severity:              Info
    Status:                False
    Type:                  ControlPlaneInitialized
    Last Transition Time:  2021-08-03T19:50:03Z
    Reason:                WaitingForControlPlane
    Severity:              Info
    Status:                False
    Type:                  ControlPlaneReady
    Last Transition Time:  2021-08-03T19:50:03Z
    Reason:                WaitingForInfrastructure
    Severity:              Info
    Status:                False
    Type:                  InfrastructureReady
```
airship-irc-bot<sidney.shiba> Results of the `describe` on the `machine` below: 20:53
```
$ kubectl --context target-cluster describe machine target-cluster-control-plane-dh48w -n target-infra
...
Status:
  Bootstrap Ready:  true
  Conditions:
    Last Transition Time:  2021-08-03T19:50:04Z
    Message:               1 of 2 completed
    Reason:                WaitingForClusterInfrastructure
    Severity:              Info
    Status:                False
    Type:                  Ready
    Last Transition Time:  2021-08-03T19:50:03Z
    Status:                True
    Type:                  BootstrapReady
    Last Transition Time:  2021-08-03T19:50:04Z
    Reason:                WaitingForClusterInfrastructure
    Severity:              Info
    Status:                False
    Type:                  InfrastructureReady
    Last Transition Time:  2021-08-03T19:50:04Z
    Status:                True
    Type:                  NodeHealthy
```
airship-irc-bot<sidney.shiba> Reminding you that all of this works when UN-tainting the `controlplane` node so `cert-manager` can be deployed during the `clusterctl-init-target` phase. Now I am not sure whether this is related to `cert-manager` at all. 20:55
airship-irc-bot<sidney.shiba> @raliev It didn't work even after extracting the `cert-manager` manifest from `clusterctl-init-target` with your special `clusterctl` image (thanks for that) and applying the differences to the `cert-manager v1.1.0` manifest in `airshipctl`. 20:56
airship-irc-bot<awander> And do you see `"Skipping installing cert-manager as it is already installed"` in your cluster init logs? 20:58
airship-irc-bot<sidney.shiba> No, it created three resources: 21:04
```
$ kubectl --context target-cluster apply -f cluster-api/cmd/clusterctl/client/cluster/assets/cert-manager-test-resources.yaml
namespace/cert-manager-test created
issuer.cert-manager.io/test-selfsigned created
certificate.cert-manager.io/selfsigned-cert created
```
airship-irc-bot<awander> Hm, so cert-manager looks fine to me. 21:08
airship-irc-bot<sidney.shiba> I will use the `cert-manager.yaml` from `capi` and see what happens, but first I need to know what version of `cert-manager` this yaml file supports. 21:08
airship-irc-bot<awander> capi is installing 1.1.0 by default 21:09
airship-irc-bot<awander> but as I said, this looks like a problem beyond cert-manager 21:09
airship-irc-bot<awander> Can you do a describe on the AzureCluster? 21:10
airship-irc-bot<sidney.shiba> 21:16
```
Events:
  Type     Reason                         Age                   From                     Message
  ----     ------                         ----                  ----                     -------
  Warning  ClusterReconcilerNormalFailed  5m10s (x29 over 85m)  azurecluster-reconciler  failed to reconcile cluster services: failed to get availability zones: failed to get zones for location centralus: failed to refresh resource sku cache: could not list resource skus: azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/cb3e23d3-b697-4c4f-a1a7-529e308691e4/providers/Microsoft.Compute/skus?%!f(MISSING)ilter=location+eq+%!c(MISSING)entralus%!&(MISSING)api-version=2019-04-01: StatusCode=400 -- Original Error: adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_request","error_description":"Identity not found"} Endpoint http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&client_id=4a66df50-70ee-4106-b1c9-efe90027e5eb&resource=https%!A(MISSING)%!F(MISSING)%!F(MISSING)management.azure.com%!F(MISSING)
```
airship-irc-bot<sidney.shiba> Full describe below: 21:18
```
$ kubectl --context target-cluster describe AzureCluster -n target-infra target-cluster
Name:         target-cluster
Namespace:    target-infra
Labels:       cluster.x-k8s.io/cluster-name=target-cluster
Annotations:  <none>
API Version:  infrastructure.cluster.x-k8s.io/v1alpha4
Kind:         AzureCluster
Metadata:
  Creation Timestamp:  2021-08-03T19:50:01Z
  Finalizers:
    azurecluster.infrastructure.cluster.x-k8s.io
  Generation:  1
  Managed Fields:
    API Version:  infrastructure.cluster.x-k8s.io/v1alpha4
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:ownerReferences:
          k:{"uid":"49eb25c2-d84a-4dcb-9fbd-774de1785270"}:
            .:
            f:apiVersion:
            f:blockOwnerDeletion:
            f:controller:
            f:kind:
            f:name:
            f:uid:
    Manager:      manager
    Operation:    Update
    Time:         2021-08-03T19:27:03Z
    API Version:  infrastructure.cluster.x-k8s.io/v1alpha4
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:conditions:
        f:failureDomains:
          .:
          f:1:
            .:
            f:controlPlane:
          f:2:
            .:
            f:controlPlane:
          f:3:
            .:
            f:controlPlane:
        f:ready:
    Manager:      cluster-api-provider-azure-manager
    Operation:    Update
    Time:         2021-08-03T19:27:38Z
    API Version:  infrastructure.cluster.x-k8s.io/v1alpha4
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
        f:finalizers:
          .:
          v:"azurecluster.infrastructure.cluster.x-k8s.io":
        f:labels:
          .:
          f:cluster.x-k8s.io/cluster-name:
        f:ownerReferences:
          .:
          k:{"uid":"1d191305-6270-46fd-970e-40c8825ce983"}:
            .:
            f:apiVersion:
            f:blockOwnerDeletion:
            f:controller:
            f:kind:
            f:name:
            f:uid:
      f:spec:
        .:
        f:azureEnvironment:
        f:bastionSpec:
        f:controlPlaneEndpoint:
          .:
          f:host:
          f:port:
        f:identityRef:
          .:
          f:apiVersion:
          f:kind:
          f:name:
        f:location:
        f:networkSpec:
          .:
          f:apiServerLB:
            .:
            f:frontendIPs:
            f:idleTimeoutInMinutes:
            f:name:
            f:sku:
            f:type:
          f:nodeOutboundLB:
            .:
            f:frontendIPs:
            f:frontendIPsCount:
            f:idleTimeoutInMinutes:
            f:name:
            f:sku:
            f:type:
          f:subnets:
          f:vnet:
            .:
            f:cidrBlocks:
            f:id:
            f:name:
            f:resourceGroup:
            f:tags:
              .:
              f:Name:
              f:sigs.k8s.io_cluster-api-provider-azure_cluster_target-cluster:
              f:sigs.k8s.io_cluster-api-provider-azure_role:
        f:resourceGroup:
        f:subscriptionID:
    Manager:    clusterctl
    Operation:  Update
    Time:       2021-08-03T19:50:01Z
  Owner References:
    API Version:           cluster.x-k8s.io/v1alpha4
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  Cluster
    Name:                  target-cluster
    UID:                   1d191305-6270-46fd-970e-40c8825ce983
  Resource Version:        3758
  UID:                     b835f27e-cd4b-4400-a29b-7ad1ab1dd42a
Spec:
  Azure Environment:  AzurePublicCloud
  Bastion Spec:
  Control Plane Endpoint:
    Host:  target-cluster-25eb69a9.centralus.cloudapp.azure.com
    Port:  6443
  Identity Ref:
    API Version:  infrastructure.cluster.x-k8s.io/v1alpha4
    Kind:         AzureClusterIdentity
    Name:         target-cluster-identity
  Location:       centralus
  Network Spec:
    API Server LB:
      Frontend I Ps:
        Name:  target-cluster-public-lb-frontEnd
        Public IP:
          Dns Name:             target-cluster-25eb69a9.centralus.cloudapp.azure.com
          Name:                 pip-target-cluster-apiserver
      Idle Timeout In Minutes:  4
      Name:                     target-cluster-public-lb
      Sku:                      Standard
      Type:                     Public
    Node Outbound LB:
      Frontend I Ps:
        Name:  target-cluster-frontEnd
        Public IP:
          Name:                 pip-target-cluster-node-outbound
      Frontend I Ps Count:      1
      Idle Timeout In Minutes:  4
      Name:                     target-cluster
      Sku:                      Standard
      Type:                     Public
    Subnets:
      Cidr Blocks:
        10.0.0.0/16
      Id:    /subscriptions/cb3e23d3-b697-4c4f-a1a7-529e308691e4/resourceGroups/capz-target-cluster-rg/providers/Microsoft.Network/virtualNetworks/capz-workload-vnet/subnets/target-cluster-controlplane-subnet
      Name:  target-cluster-controlplane-subnet
      Nat Gateway:
        Ip:
          Name:
        Role:      control-plane
      Route Table:
      Security Group:
        Name:  target-cluster-controlplane-nsg
      Cidr Blocks:
        10.1.0.0/16
      Id:    /subscriptions/cb3e23d3-b697-4c4f-a1a7-529e308691e4/resourceGroups/capz-target-cluster-rg/providers/Microsoft.Network/virtualNetworks/capz-workload-vnet/subnets/target-cluster-node-subnet
      Name:  target-cluster-node-subnet
      Nat Gateway:
        Ip:
          Name:
        Role:      node
      Route Table:
        Name:  target-cluster-node-routetable
      Security Group:
        Name:  target-cluster-node-nsg
    Vnet:
      Cidr Blocks:
        10.0.0.0/8
      Id:              /subscriptions/cb3e23d3-b697-4c4f-a1a7-529e308691e4/resourceGroups/capz-target-cluster-rg/providers/Microsoft.Network/virtualNetworks/capz-workload-vnet
      Name:            capz-workload-vnet
      Resource Group:  capz-target-cluster-rg
      Tags:
        Name:                                                           capz-workload-vnet
        sigs.k8s.io_cluster-api-provider-azure_cluster_target-cluster:  owned
        sigs.k8s.io_cluster-api-provider-azure_role:                    common
  Resource Group:   capz-target-cluster-rg
  Subscription ID:  cb3e23d3-b697-4c4f-a1a7-529e308691e4
Events:
  Type     Reason                         Age                   From                     Message
  ----     ------                         ----                  ----                     -------
  Warning  ClusterReconcilerNormalFailed  5m10s (x29 over 85m)  azurecluster-reconciler  failed to reconcile cluster services: failed to get availability zones: failed to get zones for location centralus: failed to refresh resource sku cache: could not list resource skus: azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/cb3e23d3-b697-4c4f-a1a7-529e308691e4/providers/Microsoft.Compute/skus?%!f(MISSING)ilter=location+eq+%!c(MISSING)entralus%!&(MISSING)api-version=2019-04-01: StatusCode=400 -- Original Error: adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_request","error_description":"Identity not found"} Endpoint http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&client_id=4a66df50-70ee-4106-b1c9-efe90027e5eb&resource=https%!A(MISSING)%!F(MISSING)%!F(MISSING)management.azure.com%!F(MISSING)
```
airship-irc-bot<awander> lol. that's a log error 21:20
airship-irc-bot<awander> Take a look at the capz logs. 21:21
airship-irc-bot<sidney.shiba> It seems the Identity was not found in this case, among other things. 21:22
airship-irc-bot<awander> Did this work at any point? 21:23
airship-irc-bot<awander> Looks like something is missing in your manifests/configs? 21:23
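As context for the "Identity not found" error: the IMDS endpoint (`169.254.169.254`) returning that message typically means the cluster's `AzureClusterIdentity` (referenced as `target-cluster-identity` in the describe output) is misconfigured or, for a managed identity, not attached to the VM. A minimal sketch of what such an object looks like, assuming a service-principal identity and CAPZ's `v1alpha4` API; all field values below are placeholders, not taken from the site manifests:

```yaml
# Hypothetical AzureClusterIdentity matching the identityRef above.
# Values are placeholders; the spec shape follows CAPZ v0.5.x
# (infrastructure.cluster.x-k8s.io/v1alpha4) to the best of my knowledge.
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureClusterIdentity
metadata:
  name: target-cluster-identity
  namespace: target-infra
spec:
  type: ServicePrincipal        # the failing IMDS call suggests a managed-identity type was in use instead
  clientID: <service-principal-app-id>
  clientSecret:                 # reference to a Secret holding the SP password
    name: target-cluster-identity-secret
    namespace: target-infra
  tenantID: <azure-tenant-id>
```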
airship-irc-bot<sidney.shiba> It works when I "untaint" the control plane node and then execute `airshipctl phase run clusterctl-init-target`, followed by `clusterctl-move` and then `workers-target`. Just after the `clusterctl-move` I check the status of the machine and it shows `Running`. 21:31
airship-irc-bot<sidney.shiba> As baremetal uses `tolerations` instead, I needed to kustomize the `cert-manager.yaml` to add these tolerations and move its deployment to the `initinfra` phase. 21:33
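The toleration change being described could be sketched as a kustomize strategic-merge patch like the one below. This is a hedged illustration, not the actual site overlay: the deployment name/namespace come from the stock cert-manager manifest, and the taint key assumes the kubeadm-era `node-role.kubernetes.io/master:NoSchedule` control-plane taint (the webhook and cainjector deployments would need the same patch):

```yaml
# Hypothetical strategic-merge patch: let the cert-manager Deployment
# schedule onto a tainted control-plane node. Taint key is an assumption.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cert-manager
  namespace: cert-manager
spec:
  template:
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
```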
airship-irc-bot<sidney.shiba> This is the scenario that is not working, and I am not sure anymore whether this is a `cert-manager` issue. 21:34
airship-irc-bot<awander> Yeah, it could be cert-manager. CAPZ also uses cert-manager. 21:36
airship-irc-bot<sidney.shiba> @raliev is it possible to add tolerations to the cert-manager in the clusterctl init process? 22:22
airship-irc-bot<sidney.shiba> Hi team, does anybody know whether the deployment of cert-manager was moved into `initinfra-networking-target` so tolerations could be added, instead of deploying it during `clusterctl-init-target`? 22:27
airship-irc-bot<sidney.shiba> Arvinder was helping me troubleshoot this issue and suggested that it is better to use the cert-manager from capi so it does not get out of sync. My guess is that cert-manager was deployed outside of clusterctl init so tolerations could be added, but I may be wrong. 22:33

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!