SPLAT-2096: Refactoring script for vSphere multi network jobs #62910

Merged
1 commit merged on Mar 31, 2025

Conversation

vr4manta
Contributor

@vr4manta vr4manta commented Mar 17, 2025

SPLAT-2096

Changes

  • Because jobs use multiple NICs in different ways, consolidated the logic and simplified the processing of multiple networks in one lease.
    • Removed the extra lease logic and migrated to the "network" field, which specifies the network count.
    • Migrated the logic for VSPHERE_MULTI_NETWORK so that it applies generically to the "network" field.
  • Redesigned the VSPHERE_MULTI_NETWORKS parameter to create a unique boskos-lease-id so that each failure domain can have its own subnet.
  • Changed vcenter_portgroups to use the failure domain name instead of the server, since using the server would result in duplicates when combining networks.
  • Moved the network caching logic into a function to reduce duplicated code.
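
To illustrate the vcenter_portgroups change, here is a minimal Python sketch of one way the duplicates could arise. It is not the script from this PR: the failure domain names, server hostnames, port group names, and the `index_portgroups` helper are all hypothetical. It assumes that two failure domains can live on the same vCenter server, so an index keyed by server collides while one keyed by failure domain name does not.

```python
# Hypothetical data: fd-1 and fd-2 share one vCenter server but use different port groups.
failure_domains = [
    {"name": "fd-1", "server": "vcenter-1.example.com", "portgroup": "ci-vlan-100"},
    {"name": "fd-2", "server": "vcenter-1.example.com", "portgroup": "ci-vlan-101"},
    {"name": "fd-3", "server": "vcenter-2.example.com", "portgroup": "ci-vlan-200"},
]


def index_portgroups(domains, key):
    """Map each domain's `key` value to its port group.

    Entries that share the same key value overwrite each other,
    which is the collision this sketch is meant to show.
    """
    return {domain[key]: domain["portgroup"] for domain in domains}


# Keyed by server: fd-1 and fd-2 collide, so one port group is lost.
by_server = index_portgroups(failure_domains, "server")

# Keyed by failure domain name: every failure domain keeps its own port group.
by_name = index_portgroups(failure_domains, "name")

print(len(by_server), "entries when keyed by server")
print(len(by_name), "entries when keyed by failure domain name")
```

Keying by a value that is unique per failure domain keeps one entry per subnet, which matches the stated goal of giving each failure domain its own subnet.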

Notes

Some private jobs use a different variable for what appear to be multiple subnets / networks. That added logic is breaking our normal jobs. I'm using this PR to find a better way to handle multiple networks across all of our vCenters. (#59489)

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 17, 2025
@openshift-ci-robot
Contributor

openshift-ci-robot commented Mar 17, 2025

@vr4manta: This pull request references SPLAT-2096 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.19.0" version, but no target version was set.

In response to this:

SPLAT-2096

Changes

  • Fixed vSphere multi network jobs to use a static pool to make sure all NICs are reserved from same VCM pool

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@vr4manta
Contributor Author

/pj-rehearse pull-ci-openshift-installer-main-e2e-vsphere-ovn-multi-network

@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@vr4manta
Contributor Author

/pj-rehearse pull-ci-openshift-installer-main-e2e-vsphere-ovn-multi-network

@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@vr4manta
Contributor Author

/pj-rehearse pull-ci-openshift-installer-release-4.18-e2e-vsphere-ovn-multi-network

@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@vr4manta
Contributor Author

/retest

@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@vr4manta
Contributor Author

/pj-rehearse pull-ci-openshift-installer-main-e2e-vsphere-ovn-multi-network
These rehearsals are failing due to an issue building the openstack installer image. We've reached out to see if we can get it fixed, but otherwise it's hard to verify that this test is working when the workflow never starts.

@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci-robot
Contributor

@vr4manta: requesting more than one rehearsal in one comment is not supported. If you would like to rehearse multiple specific jobs, please separate the job names by a space in a single command.
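
As a small illustration of the format the bot asks for, the Python sketch below joins several job names into one `/pj-rehearse` comment. The `build_rehearse_command` helper is hypothetical and is not part of pj-rehearse; the job names are taken from this conversation.

```python
def build_rehearse_command(job_names):
    """Return one /pj-rehearse comment with the job names separated by single spaces."""
    # Drop empty entries and stray whitespace so the command has exactly one space between names.
    jobs = [name.strip() for name in job_names if name.strip()]
    if not jobs:
        raise ValueError("at least one job name is required")
    return "/pj-rehearse " + " ".join(jobs)


command = build_rehearse_command([
    "pull-ci-openshift-installer-main-e2e-vsphere-ovn-multi-network",
    "pull-ci-openshift-installer-release-4.18-e2e-vsphere-ovn-multi-network",
])
print(command)
```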

@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@vr4manta
Contributor Author

/hold
Found an issue with the script that needs to be fixed.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 20, 2025
@vr4manta
Contributor Author

/pj-rehearse pull-ci-openshift-installer-main-e2e-vsphere-ovn-multi-network pull-ci-openshift-installer-release-4.18-e2e-vsphere-ovn-multi-network

@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci-robot
Contributor

openshift-ci-robot commented Mar 21, 2025

@vr4manta: This pull request references SPLAT-2096 which is a valid jira issue.

In response to this:

SPLAT-2096

Changes

  • Fixed vSphere multi network jobs to use a static pool to make sure all NICs are reserved from same VCM pool
  • Because jobs use multiple NICs in different ways, consolidated the logic and simplified the processing of multiple networks in one lease.

Notes

Some private jobs use a different variable for what appear to be multiple subnets / networks. That added logic is breaking our normal jobs. I'm using this PR to find a better way to handle multiple networks across all of our vCenters.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@vr4manta vr4manta changed the title SPLAT-2096: Added pool definition to multi network jobs SPLAT-2096: Refactoring script for vSphere multi network jobs Mar 21, 2025
@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@vr4manta
Contributor Author

/pj-rehearse pull-ci-openshift-installer-main-e2e-vsphere-ovn-multi-network

@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@vr4manta
Contributor Author

/pj-rehearse pull-ci-openshift-installer-main-e2e-vsphere-ovn-multi-network

@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@vr4manta
Contributor Author

/pj-rehearse pull-ci-openshift-installer-main-e2e-vsphere-ovn-multi-network

@openshift-ci-robot
Contributor

@vr4manta: job(s): , pull-ci-openshift-machine-api-operator-main-e2e-vsphere-ov either don't exist or were not found to be affected, and cannot be rehearsed

@vr4manta
Contributor Author

/pj-rehearse pull-ci-openshift-machine-api-operator-main-e2e-vsphere-ovn

@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@vr4manta
Contributor Author

/pj-rehearse periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-upi-multi-vcenter periodic-ci-openshift-openshift-tests-private-release-4.18-amd64-nightly-vsphere-ipi-zones-multisubnets-external-lb-f28 pull-ci-openshift-installer-main-e2e-vsphere-ovn-multi-network pull-ci-openshift-machine-api-operator-main-e2e-vsphere-ovn

@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@vr4manta
Contributor Author

/pj-rehearse periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-upi-multi-vcenter periodic-ci-openshift-openshift-tests-private-release-4.18-amd64-nightly-vsphere-ipi-zones-multisubnets-external-lb-f28 pull-ci-openshift-installer-main-e2e-vsphere-ovn-multi-network pull-ci-openshift-machine-api-operator-main-e2e-vsphere-ovn

@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@vr4manta
Contributor Author

/pj-rehearse periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-upi-multi-vcenter periodic-ci-openshift-openshift-tests-private-release-4.18-amd64-nightly-vsphere-ipi-zones-multisubnets-external-lb-f28 pull-ci-openshift-installer-main-e2e-vsphere-ovn-multi-network pull-ci-openshift-machine-api-operator-main-e2e-vsphere-ovn

@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

Contributor

openshift-ci bot commented Mar 29, 2025

@vr4manta: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
ci/rehearse/openshift/installer/release-4.18/e2e-vsphere-ovn-multi-network | 5a4d283 | link | unknown | /pj-rehearse pull-ci-openshift-installer-release-4.18-e2e-vsphere-ovn-multi-network

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@vr4manta
Contributor Author

/pj-rehearse periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-upi-multi-vcenter periodic-ci-openshift-openshift-tests-private-release-4.18-amd64-nightly-vsphere-ipi-zones-multisubnets-external-lb-f28 pull-ci-openshift-installer-main-e2e-vsphere-ovn-multi-network pull-ci-openshift-machine-api-operator-main-e2e-vsphere-ovn

@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@vr4manta
Contributor Author

/pj-rehearse periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-upi-multi-vcenter periodic-ci-openshift-openshift-tests-private-release-4.18-amd64-nightly-vsphere-ipi-zones-multisubnets-external-lb-f28 pull-ci-openshift-installer-main-e2e-vsphere-ovn-multi-network pull-ci-openshift-machine-api-operator-main-e2e-vsphere-ovn

@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci-robot
Contributor

[REHEARSALNOTIFIER]
@vr4manta: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

Test name | Repo | Type | Reason
pull-ci-openshift-csi-external-attacher-master-e2e-vsphere | openshift/csi-external-attacher | presubmit | Registry content changed
pull-ci-openshift-csi-external-attacher-release-4.20-e2e-vsphere | openshift/csi-external-attacher | presubmit | Registry content changed
pull-ci-openshift-csi-external-attacher-release-4.19-e2e-vsphere | openshift/csi-external-attacher | presubmit | Registry content changed
pull-ci-openshift-csi-external-attacher-release-4.18-e2e-vsphere | openshift/csi-external-attacher | presubmit | Registry content changed
pull-ci-openshift-csi-external-attacher-release-4.17-e2e-vsphere | openshift/csi-external-attacher | presubmit | Registry content changed
pull-ci-openshift-csi-external-attacher-master-e2e-vsphere-csi | openshift/csi-external-attacher | presubmit | Registry content changed
pull-ci-openshift-csi-external-attacher-release-4.20-e2e-vsphere-csi | openshift/csi-external-attacher | presubmit | Registry content changed
pull-ci-openshift-csi-external-attacher-release-4.19-e2e-vsphere-csi | openshift/csi-external-attacher | presubmit | Registry content changed
pull-ci-openshift-csi-external-attacher-release-4.18-e2e-vsphere-csi | openshift/csi-external-attacher | presubmit | Registry content changed
pull-ci-openshift-csi-external-attacher-release-4.17-e2e-vsphere-csi | openshift/csi-external-attacher | presubmit | Registry content changed
pull-ci-openshift-cluster-autoscaler-operator-master-e2e-vsphere-periodic-pre | openshift/cluster-autoscaler-operator | presubmit | Registry content changed
pull-ci-openshift-cluster-autoscaler-operator-release-4.20-e2e-vsphere-periodic-pre | openshift/cluster-autoscaler-operator | presubmit | Registry content changed
pull-ci-openshift-cluster-autoscaler-operator-release-4.19-e2e-vsphere-periodic-pre | openshift/cluster-autoscaler-operator | presubmit | Registry content changed
pull-ci-openshift-cluster-autoscaler-operator-release-4.18-e2e-vsphere-periodic-pre | openshift/cluster-autoscaler-operator | presubmit | Registry content changed
pull-ci-openshift-cluster-autoscaler-operator-release-4.17-e2e-vsphere-periodic-pre | openshift/cluster-autoscaler-operator | presubmit | Registry content changed
pull-ci-openshift-windows-machine-config-operator-master-wicd-unit-vsphere | openshift/windows-machine-config-operator | presubmit | Registry content changed
pull-ci-openshift-windows-machine-config-operator-release-4.20-wicd-unit-vsphere | openshift/windows-machine-config-operator | presubmit | Registry content changed
pull-ci-openshift-windows-machine-config-operator-release-4.19-wicd-unit-vsphere | openshift/windows-machine-config-operator | presubmit | Registry content changed
pull-ci-openshift-windows-machine-config-operator-release-4.18-wicd-unit-vsphere | openshift/windows-machine-config-operator | presubmit | Registry content changed
pull-ci-openshift-windows-machine-config-operator-release-4.17-wicd-unit-vsphere | openshift/windows-machine-config-operator | presubmit | Registry content changed
pull-ci-openshift-windows-machine-config-operator-release-4.16-wicd-unit-vsphere | openshift/windows-machine-config-operator | presubmit | Registry content changed
pull-ci-openshift-windows-machine-config-operator-release-4.15-wicd-unit-vsphere | openshift/windows-machine-config-operator | presubmit | Registry content changed
pull-ci-openshift-windows-machine-config-operator-release-4.14-wicd-unit-vsphere | openshift/windows-machine-config-operator | presubmit | Registry content changed
pull-ci-openshift-windows-machine-config-operator-release-4.13-wicd-unit-vsphere | openshift/windows-machine-config-operator | presubmit | Registry content changed
pull-ci-openshift-windows-machine-config-operator-release-4.12-wicd-unit-vsphere | openshift/windows-machine-config-operator | presubmit | Registry content changed

A total of 2060 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs.

A full list of affected jobs can be found here

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@openshift-ci-robot
Contributor

openshift-ci-robot commented Mar 31, 2025

@vr4manta: This pull request references SPLAT-2096 which is a valid jira issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@vr4manta
Contributor Author

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 31, 2025
@rvanderp3
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 31, 2025
Contributor

openshift-ci bot commented Mar 31, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rvanderp3, vr4manta

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@vr4manta
Contributor Author

/pj-rehearse ack

@openshift-ci-robot
Contributor

@vr4manta: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci-robot openshift-ci-robot added the rehearsals-ack Signifies that rehearsal jobs have been acknowledged label Mar 31, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit c1096b3 into openshift:master Mar 31, 2025
13 of 14 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. rehearsals-ack Signifies that rehearsal jobs have been acknowledged
3 participants